Database Systems Breaking Out of the Box - PowerPoint PPT Presentation

Presentation Transcript

  1. Database Systems: “Breaking Out of the Box” • Avi Silberschatz (Bell Laboratories), Stan Zdonik (Brown University) • July 7, 1997 • Mehmet Uner

  2. The Paper’s Theme (Strategic Directions) • Database Research should be devoted to the problems of data management no matter where and in what form the data might be found. • Database management skills should be applied to new data management environments that potentially require radically new software architectures.

  3. Outline • Introduction • Background • Our Skills • Scenarios • Barriers • Research • Conclusions • References

  4. Introduction • The field of database systems research and development has been very successful over its 30-year history. • It has led to a $10 billion industry that touches virtually every major company in the world. • It is unthinkable to manage the large volumes of valuable information that keep corporations running without support from commercial database management systems (DBMS). • A DBMS is a very complex system incorporating a rich set of technologies. • It is suited for solving problems of large-scale data management in the corporate setting.

  5. DBMS Requirements: • Execution overhead. • High level of expertise to install and maintain. • Only manages data in fairly specific file formats.

  6. Solution At the same time: • Data is changing rapidly. • Data is stored in different places (e.g. files). • Data is obtained in large volumes from external sources like sensors. Solution: • Not a full-blown DBMS, but a lighter-weight solution. • Instead of using an existing tool in a new application, it is better to embed reusable components. • Use database system components, techniques and experience in new ways.

  7. Examples • Some examples that could benefit from data management techniques but that typically do not make heavy use of database products: • World Wide Web • Personal Information Systems (e-mail) • News Services • Scientific Applications

  8. Background • The database field was born with the release of IMS in the 60’s. • IBM product • Managed data as hierarchies • Data has value; manage it independently of the application • CODASYL, the most well-known successor • Based on a graph-based structure. • Ted Codd published a paper in 1970 • Suggested the relational model.

  9. Background • Object Oriented Principles in 80’s • Allow users to create their own application-specific types that can be managed by the DBMS. • Hybrid model in 90’s • Embeds object-oriented features in a relational context.

  10. Our Skills • Database Management Systems have been concerned with the following problems: • High Performance • Correctness • Maintainability • Reliability • These are approached from the point of view of slow-memory devices that must be shared by multiple concurrent users. • This approach leads to a set of skills and techniques that can be applied and extended to other problems.

  11. Skills and Techniques • Data Modeling • Language for defining structure of database • Language for manipulating those structures. • Query Languages • High-level language to retrieve data from the database. (SQL) • Query Optimization and evaluation • State-based views • Restricted and reorganized view of database.
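
As a rough illustration of the items on this slide (a definition language, a manipulation language, a high-level query language, and a view), here is a minimal sketch using Python's built-in `sqlite3` module. The `employee` table, its rows, and the `research_staff` view are hypothetical names chosen only for this example; they do not come from the paper.

```python
import sqlite3

# Hypothetical schema and data, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Data definition: describe the structure of the database.
    CREATE TABLE employee (
        id     INTEGER PRIMARY KEY,
        name   TEXT,
        dept   TEXT,
        salary REAL
    );
    -- Data manipulation: populate the structure.
    INSERT INTO employee (name, dept, salary) VALUES
        ('Ada',   'research', 120.0),
        ('Grace', 'research', 130.0),
        ('Linus', 'sales',     90.0);
    -- A view: a restricted and reorganized presentation of the base table.
    CREATE VIEW research_staff AS
        SELECT name, salary FROM employee WHERE dept = 'research';
""")


def research_names(connection):
    """Query the view declaratively; the engine decides how to evaluate it."""
    rows = connection.execute(
        "SELECT name FROM research_staff ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


names = research_names(conn)
```

The caller states what it wants from `research_staff` and never touches the base table layout, which is the separation the slide describes.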

  12. Skills and Techniques • Data Management • Automatic maintenance of data structures • Efficient movement of data • Transactions • A response to correctness problems introduced by concurrent access and update • Distributed Systems • Scalable Systems • Database systems have been tuned to efficiently and reliably handle data volumes that exceed the size of the physical memory by several orders of magnitude.
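
To make the transaction item concrete, the following is a minimal sketch, again with Python's `sqlite3` module, of an all-or-nothing update. The `account` table, the `transfer` helper, and the balances are assumptions made for illustration only.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            # Credit first, so a later failure shows the credit being undone.
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected an overdraft
    return True


# Hypothetical table: balances may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()

first = transfer(conn, "a", "b", 60)   # succeeds
second = transfer(conn, "a", "b", 60)  # would overdraw 'a', so it is rolled back
balances = dict(conn.execute("SELECT name, balance FROM account"))
```

In the failed transfer the credit to `b` has already been applied when the debit is rejected, yet neither change survives, so no partial update is left behind.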

  13. Scenarios • The way for future data management systems • The technology that would support these scenarios constitutes a research agenda for the next decade. 1) Instant Virtual Enterprise 2) Personal Information Systems

  14. Instant Virtual Enterprise • An “instant virtual enterprise” (IVE) is a group of companies that do not routinely function as a unit. • They come together to respond to a customer order or request for proposal. • Computer integrated manufacturing (CIM) is an example of an environment requiring IVE cooperation. • Engineering side • Design, Production, Quality Assurance • Administrative side • Planning, Production Control, Resource Management

  15. Instant Virtual Enterprise • Companies in an IVE need to exchange and manage large amounts of data. • Companies will have many heterogeneous databases. • Sharing and exchanging data with coordinating information is critical.

  16. IVE Scenario: Building an oil pipeline • [Diagram: Company A, Company Q, Company R, Company S; labels: Engineering Firm (IVE), License their design, Engineering Analysis]

  17. IVE Scenario • [Diagram: Company T, Company U, Company V, Company W; labels: Actual Fabrication, Casting, Design file conversion service, Documentation and Archiving]

  18. IVE Scenario • Database Capabilities Needed: • Executing a query for the design • Data translation services for engineering analysis • Coordination and configuration management • Changes to an object in one subsystem require changes to one or more related objects in other subsystems. • Security and access control over the information • Archiving of information, even after the IVE disbands
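
The coordination requirement above (a change in one subsystem forcing changes to related objects elsewhere) can be sketched as a walk over a dependency graph. This is only one possible reading, not a mechanism described in the paper; the object names and the `affected_objects` helper are hypothetical.

```python
from collections import deque


def affected_objects(dependents, changed):
    """Return every object that transitively depends on `changed`.

    `dependents` maps an object to the objects that must be revisited
    when it changes. Results come back in breadth-first order.
    """
    affected = []
    visited = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for nxt in dependents.get(current, []):
            if nxt not in visited:
                visited.add(nxt)
                affected.append(nxt)
                queue.append(nxt)
    return affected


# Hypothetical objects spread across the subsystems of different companies.
dependents = {
    "design/pipe_segment": ["analysis/stress_report", "fabrication/casting_mold"],
    "analysis/stress_report": ["docs/archive_record"],
    "fabrication/casting_mold": ["docs/archive_record"],
}

stale = affected_objects(dependents, "design/pipe_segment")
```

A real IVE would also have to cross company boundaries and access controls; the sketch only shows how the set of objects needing attention can be derived.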

  19. Personal Information Systems Scenario • Provides information to an individual • Uses a PID (Personal Information Device) • PDA • Handheld PC • Laptop • Equipped with a wireless network connection • Access to the Internet anywhere, anytime.

  20. Personal Information Systems Scenario • Tightly integrated with individual’s activities. From morning to bed time. • In the morning • Local Weather Report • List of Reminders • List of Morning Meetings • Best Route from home to work • Personalized Headlines • Personalized Investment Report

  21. Personal Information Systems Scenario • Throughout the day • Tasks for the day • List of customers to contact • Summary of breaking news • Best Driving Routes in the city • At the end of the day • Next day’s activities • Appointments

  22. Personal Information Systems Scenario • The PID must continuously query remote databases and monitor broadcast information. • PIDs will magnify today’s client-server performance, scalability and reliability problems. • Where should data reside: on the PID or on the server?

  23. Barriers • DBMS provides a tightly controlled and highly uniform environment • For the new applications, database functionality should be provided outside of the limits of a DBMS. • For the vision represented in the scenarios, a number of technical barriers must be removed.

  24. Barriers • Overhead • System requirements, expertise, planning, monetary cost • Builders of a personalized newspaper service do not use a DBMS because they have no need for many of its advanced features. • Many new applications need only a subset of the traditional database services. • Scale • Greater volume of data (petabytes) • Hundreds of servers, with an even larger client population

  25. Barriers • Schema Organization • Conventionally, one first creates a schema to describe the structure of the database and then populates the database. • Many applications currently create data independently of a database system (scientific applications, web sites). • The schema is incomplete or inconsistent. • Schema management facilities are needed to adapt to the dynamic nature of foreign data. • Data Quality • Information accessed from a WAN may be of varying quality. • Future information systems must be able to react to the quality of the data source.

  26. Barriers • Heterogeneity • Data exists in many forms. • These dissimilar formats must be integrated to allow applications to access data in a high-level and uniform way. • Query Complexity • Future environments have different characteristics. • Conventional: minimize the number of disk accesses. • Future: minimize the total “information bill”.
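
The slide does not define the “information bill”, so the following is only a sketch under stated assumptions: the bill is taken to be a weighted sum of disk accesses, network transfer, and fees charged by a remote source. The `Plan` fields, the weights, and both candidate plans are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    disk_accesses: int
    bytes_over_network: int
    access_fee: float  # money charged by a remote source (assumed)


def disk_only_cost(plan):
    """Conventional objective: count disk accesses and nothing else."""
    return plan.disk_accesses


def information_bill(plan, per_access=0.001, per_megabyte=0.05):
    """Assumed objective: everything the query ends up costing."""
    return (
        plan.disk_accesses * per_access
        + plan.bytes_over_network / 1_000_000 * per_megabyte
        + plan.access_fee
    )


# Two hypothetical ways to answer the same query.
plans = [
    Plan("local scan", disk_accesses=5000, bytes_over_network=0, access_fee=0.0),
    Plan("remote index lookup", disk_accesses=20,
         bytes_over_network=400_000_000, access_fee=1.50),
]

conventional_choice = min(plans, key=disk_only_cost).name
bill_based_choice = min(plans, key=information_bill).name
```

With these made-up numbers the two objectives disagree: counting only disk accesses favors the remote lookup, while the broader bill favors the local scan because of transfer and access charges.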

  27. Barriers • Ease of Use • A highly trained, full-time staff is assumed to manage a DBMS. • Yet most users have no training in database technology. • A simple set of interfaces is needed. • Security • As the amount of shared information grows, the need arises to restrict access to specific users or for specific uses.

  28. Barriers • Guaranteeing Acceptable Outcomes • Transaction management is a barrier to both system performance and the ability to specify acceptable outcomes. • New or enhanced transaction technology is needed. • Making data unavailable is not acceptable. • Aborting transactions is unacceptable. • Technology Transfer • Barrier between research and industry • Insufficient knowledge of each other

  29. Research • In order to achieve the vision and overcome these barriers, a number of central research topics must be addressed: • Extensibility and Componentization • Imprecise Results • Schemaless Databases • Ease-of-Use • New Transaction Models • Query Optimization • Data Movement • Security • Database Mining

  30. Research • Extensibility and Componentization • Build the DBMS in a modular way • Lighter-weight applications • Imprecise Results • Web search engines do not provide 100% accuracy. • A general theory of imprecision must be developed. • Schemaless Databases • Able to work with unstructured data

  31. Research • Ease-of-Use • Better database interfaces are required. • New Transaction Models • Overcome blocking. • Provide correctness. • Query Optimization • New indexing methods, query processing strategies. • Cheaper but slower response time. • Sensitive to bandwidth and power considerations.

  32. Research • Data Movement • In a distributed environment, the cost of moving data can be extremely high. • Asymmetric communication channels (low-bandwidth lines) • Security • Formulation of an authorization model • Interoperability between different security policies • Database Mining • Machine Learning • Statistical Analysis • Database Technologies

  33. Conclusions • Database research must be broadly defined. • The database community must apply its experience and expertise to new areas, and new solution packages must be found. • The vision is an integration that supports the application of database functionality in small modules that give just the right capability. • These modules should also represent a unified theory of information that allows for the querying of information of all types without having to switch languages or paradigms.

  34. References • E. F. Codd, “A Relational Model of Data for Large Shared Data Banks,” Communications of the ACM, 13:6 (June 1970), pp. 377-387. • J. Gray, http://www.cs.washington.edu/homes/lazowska/cra/database.html • A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems: Achievements and Opportunities,” SIGMOD Record, 19:4, pp. 6-22. • A. Silberschatz, M. Stonebraker, and J. Ullman, “Database Systems: Achievements and Opportunities Into the 21st Century,” http://www.cs.stanford.edu/pub/papers/lagii.ps • J. Toole and P. Young, http://www.hpcc.gov/cic/forum/CIC_Cover.html

  35. Thanks! Any Questions?