What is recovery and atomicity in DBMS

9.1 Requirements for DB management systems

Transcript

1 Lecture WS 1999 / Requirements for DB Management Systems Learning objectives: Design factors Functionality Architectures Embedding in operating systems Requirements and architectures Primary design factors PQRI requirements (persistence, quantity, reactivity, integrity) Mandatory database services, database core services and extended database services have already been distinguished in Chapter 1. Differentiate DBMS Mac Filemaker, MS Access 1.0 vs. Unix Oracle, IBM IMS Important: Scalability through layering! In recent years there has been an increasing number of aspects of distribution: application clients to DBMS server (s), DBMS to DBMS peer-to-peer, Internet users to DBMS, ... requirements and architectures

2 Lecture WS 1999/2000 Further design factors secondary: language interfaces (cursor, embedded SQL, persistent objects, ...) and semantic objects (tree, relation, persistent heap) of the concrete data models (HDM, NDM, RDM, OODM). Tertiary: Basic machine Device interface: cylinders & blocks (raw devices in Unix) Single-user file interface (FAT, HPFS, NFS, ...) Transactional file access via operating system (tandem, VMS) Requirements and architectures Functionality of DB management systems (1) 1. Data manipulation (DML) 2. Data definition (DDL) 3. Data integrity and security The databases of a DB are often the most valuable resources of a company. The users of a DB are often not computer scientists. A database usually has many users. Database users often work at the same time. Measures: Database backup and error recovery (recovery) Data protection, access rights (authentication, authorization) Transactions with the properties of atomicity, consistency, isolation, durability Chapter 8 Requirements and architectures

3 Lecture WS 1999/2000 Functionality of DB Management Systems (2) 4. Parallel operation of synchronization measures (isolated transactions, cooperating workflows) Blockade handling (deadlock detection) 5. Performance control The different requirements of a DBMS are often in contradiction. Example: data volume buffer size response time essential performance-determining characteristics of a DBMS must therefore be able to be changed dynamically: (database administrator, DBA) access paths, buffer sizes backup and synchronization procedures data and transaction distribution in DBMSs requirements and architectures functionality of DB management systems (3) 6 Different users A DBMS has to efficiently support very different user classes: Database administrators Application programmers and their programming systems End users and their tools Comment: Changes in requirements in recent years: Permanently highly qualified personnel is becoming more and more expensive Hardware is becoming ever cheaper Cost of ownership is becoming more important than system efficiency Popularity of end user tools increases rapidly (MS-Access, Filemaker, ...) requirements and architectures

4 Lecture WS 1999/2000 Architectural Proposals for DBMS The discussed procedure for the design of large systems naturally leaves the designer with a lot of freedom. Have certain DBMS architectures prevailed in practice? Is there some kind of standard? Can such a standard for "DBMS in general" be expected at all? How far is a DBMS architecture determined by the individual requirements? Examples of DBMS architectures: "Strawman" architecture from CCA / NBS "Three-level" architecture from ANSI / SPARC "Five-layer model" from Senko (historically relevant) (conceptually relevant) (technically still relevant) Similarities: Generally applicable Data Independence Requirements and Architectures "STRAWMAN" architecture (1) Objectives: (Commissioned by the National Bureau of Standards for study delivered to the Computer Corporation of America, 1982) Standardization of interfaces for end users and application programmers. Increasing the portability of applications. Standardization of internal interfaces. Interchangeability of and free competition for DBMS components. System configuration by "plugging together" of components "Complete" recording of the DBMS functionality The interfaces are described on three levels of abstraction: Low: Inclusion of the special host machine and the special database model Medium: Abstraction from the special host machine, but inclusion of the special database model High: Abstraction from the special host machine and the special database model requirements and architectures

5 Lecture WS 1999/2000 "STRAWMAN" - Architecture (2) Example: control data processing (metadata in the data dictionary) E schema subschema access rights E consistency condition. Database administration aids End user operation Core database administrator Storage scheme Device allocation Log. Performance monitoring E Phys. Performance monitoring E = End user requirements and architectures "STRAWMAN" architecture (3) Reasons for failure: (?) Purely academic access (79 articles and textbooks viewed) Over-generalization (valid for any data model) No focus and standardization through an implementation No market for open database systems (1981, 1997?) Too many cross-interfaces make interchangeability of system components, requirements and architectures difficult

6 Lecture WS 1999/2000 ANSI / SPARC architecture (1) Goal: Independence between data and applications Support of multiple user views Literature: Tsichritzis, Klug, 1978 Three layers: Internal layer Conceptual layer External layer Definition of different schemas in the individual layers Storage of the schemas in the data dictionary (data dictionary, repository, ...) Requirements and architectures ANSI / SPARC architecture (2) The three-layer architecture External layer Conceptual layer External view 1 user 1 ... n ... External / conceptual mapping Conceptual Scheme Conceptual / internal mapping External view n Internal layer Internal scheme see: R. Elmasri, SB Navathe. Fundamentals of Database Systems. Benjamin / Cummings, Redwood City, California, requirements and architectures

7 Lecture WS 1999/2000 ANSI / SPARC architecture (3) External layer: Each external schema (user view: user = application program) describes the view of one or more users on the data. Data that is not relevant to the user is hidden from him. Example: The scheme of a project information system hides employees' salaries. Conceptual layer: The conceptual schema defines the structures of the conceptual view of the entire database for the entire user community (unification of all user views into one common view). Entities, data types, relationships and constraints are taken into account. The physical memory structures are hidden. Example: The scheme of a company information system collects all information about the employees. Requirements and architectures ANSI / SPARC architecture (4) Internal layer: Internal schema describes the physical storage structures of the database. Using a physical data model, details of data storage and access paths are described. Examples: Separate storage areas for permanent employees and working students B-Tree: Access to projects via the project number Comments: Internal scheme is increasingly hidden (see OODBMS) Support of several external schemes in OODBMS Still insufficient options for implementing limited relational conceptual views on NDM, HDM or also file system requirements and architectures

8 Lecture WS 1999/2000 ANSI / SPARC architecture (5) Data independence: Protection of the user of a DBMS from negative effects in the course of changes in the system environment (). Types of data independence: Logical data independence: The conceptual schema can be changed without any consequences for the external schema. Example: Extension of the database by a new class or combination of several classes through generalization in the conceptual schema. Physical data independence: The internal schema can be changed independently of the conceptual schema without any functional changes occurring in the applications. Example: Reorganizing the data or setting up new access paths. Requirements and architectures Five-tier architecture by Senko (1) Strong emphasis on data independence (DIAM, Data Independent Access Model) (cf. M. Senko, 1973) Increased adaptability (new SQL standards, new host languages, ...) Better Transferability between operating systems (Unix, VMS, MVS, OS / 2, MSDOS, ...) Use and further development in System / R (approx. 1980) as well as by Härder and Reuter (see database manual) Five layers = six interfaces! Note: The naming of the layers and the objects at the interfaces is very inconsistent in the literature. However, the concepts are very similar! Requirements and Architectures

9 Lecture WS 1999/2000 Five-layer architecture by Senko (2) Six interfaces Set-oriented interface: It supports the sets and predicates of higher database languages ​​(SQL, OQL, DBPL) and is not used for the network or hierarchy model. Record-oriented interface: It offers logical access paths to individual records, such as navigating access in the NDM, and can manage transactions and monitor integrity requirements. Internal record interface: It no longer has any relation to the specific database model and decides on the allocation of records and record fields mainly from an efficiency point of view. Access methods (by the DBA) can be specified on it. Segment interface: It offers a homogeneous, linear, "infinite", virtual address space (in pages or segments). Access to these storage units is synchronized and can be assumed to be robust (error recovery). File interface: It provides efficient block-oriented file access and is usually implemented by the host computer's operating system. Device interface: It is provided directly by hardware with suitable characteristics (direct access to persistent data). Requirements and architectures Five-layer architecture by Senko (3) S5: Implementation of the set-oriented DB interface Set-oriented interface (ODBC, SQL, OQL, ...) Record-oriented interface S4: Implementation of the set-oriented DB interface Internal record interface S3: Realization of Storage structures Segment interface S2: System buffer management S1: External memory management File interface (e.g. UNIX, VMS, MSDOS, ...) Device interface (e.g. IDE, SCSI, ...) Requirements and architectures

10 Lecture WS 1999/2000 Five-layer architecture by Senko (4) Semantically richer objects S5: Implementation of the set-oriented DB interface More application-oriented operations S4: Implementation of the set-oriented DB interface Improved integrity S3: Realization of increased storage structures Error tolerance S2: System buffer management S1 : External storage management Quantity-oriented interface (ODBC, SQL, OQL, ...) Record-oriented interface Internal record interface Segment interface File interface (e.g. UNIX, VMS, MSDOS, ...) User data described by metadata Device interface (e.g. IDE, SCSI, ...) Requirements and Architectures Embedding a DBMS in the operating system (1) Database clients: As a rule (exception: "old terminal applications") there is one operating system process for each active application program ("client process"). The database client has local application and DB session data (cursor, private buffer, ...) and communicates with a separate DBMS process ("server process"). Separating the address spaces ensures the integrity of the database, provides a simple option for remote database access, but also makes it necessary to copy data between processes. A client can interact with several DB servers (possibly from different manufacturers) at the same time. Problematic: Element-wise data exchange via the address space of the client Transactional processing beyond database boundaries Many database manufacturers therefore offer gateways to other database systems. Requirements and Architectures

11 Lecture WS 1999/2000 Embedding a DBMS in the operating system (2) Database server: single server approach A single central DBMS operating system process manages numerous logical work contexts, one for each active transaction. In a client-server environment, many clients communicate with this server, which implicitly takes care of the synchronization between the local data of these clients. In reality, the central DBMS process consists rather of a fixed number of functionally specialized, closely coupled processes (lock manager, archive manager, application manager, ...) Database server: Multi-server approach On multi-processor machines or In tightly coupled clusters, it can make sense to have several functionally equivalent DBMS operating system processes at the same time that offer access to the same database. The implementation of multi-servers is considerably more complex, since complex synchronization and error recovery tasks have to be solved in order to maintain the illusion of a central, consistent DBMS. Requirements and Architectures