Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Unit 1 Introduction to DBMS (Database Management Systems) Application program DBMS End-user 1-1 Outline of Unit 1 1.1 Information Systems 1.2 An Overview of a Database System 1.3 An Architecture for a Database System 1.4 Database Technology Trends Wei-Pang Yang, Information Management, NDHU 1-2 1.1 Information Systems 1-3 Stages of Information System Stage 0: Manual Information System • Records • Files • Index Cards Stage 1: Sequential Information Systems • Tapes • Files • slow, non-interactive, redundancy,... . Stage 2: File Based Information Systems • Disk (direct access) • application program has its own file • data redundancy data dependence Stage 3: DBMS based Information Systems • Generalized data management software • Transaction processing Wei-Pang Yang, Information Management, NDHU 1-4 File Based Information Systems Conventional Data Processing techniques: Enterprise: Application System A Application System B Application Program A Application Program B File System A Customer File System B Invoice File System N Inventory … Application System N Wei-Pang Yang, Information Management, NDHU Application Program N 1-5 File Based Information Systems (cont.) Customer Customer No. Customer Name Customer Addr. Social Security ID Inventory Part No. Part Description Unit Price Supplier Quantities Remain Quantities Ordered Wei-Pang Yang, Information Management, NDHU Invoice Customer No. Customer Name Part No. Quantities Unit Price Parts Part No. Part Description Supplier Quantities Ordered Customer Name Unit Price 1-6 User A1 User A2 User B1 User B2 User B3 Host Language + DSL Host Language + DSL Host Language + DSL Host Language + DSL Host Language + DSL C, Pascal DSL (Data Sub Language) e.g. SQL 1 2 3 External View @ # & External schema External schema A B External/conceptual mapping A Conceptual schema < External View B External/conceptual mapping B Conceptual View Database management system dictionary (DBMS) e.g. system catalog Conceptual/internal mapping DBA (Build and maintain schemas and mappings) Storage structure definition (Internal schema) Wei-Pang Yang, Information Management, NDHU 1 2 3 Stored database (Internal View) # ... 100 & @ 1-7 DBMS based Information Systems: Basic Approach - Integration (1) Integration of Information • Description of the integrated view of data is the "Conceptual Schema" of the database DBMS Application program Wei-Pang Yang, Information Management, NDHU End-user 1-8 DBMS based Information Systems: Basic Approach – Simple views and High level language (2) Provide simple views (External Schema) and high level language (e.g. SQL) for users to manipulate (handle) data • High level language: e.g. SQL (Structured Query Language) <e.g.>: SELECT SNAME FROM S WHERE S#= 'S4'; • Description of user's view of data is the "external schema" or "subschema" or "view". • High-level languages (Query Language): SQL (1) Data Definition Language: define format (2) Data Manipulation Language: retrieve, insert, delete, update • Emphasize: EASE OF USE !! Wei-Pang Yang, Information Management, NDHU S S# name 1-9 DBMS based Information Systems: Basic Approach - Storage/Access Method (3) Efficient Storage/Access Techniques: • implemented once rather than duplicated in all application programs. User: query in SQL Language Processor DBMS Access Methods Calls Access Method (B+ tree, Dynamic Hashing) I/O calls Wei-Pang Yang, Information Management, NDHU 1-10 DBMS based Information Systems: Basic Approach - Transaction Management (4) Provide Transaction Management: • Concurrency Control • Recovery • Security • . : Wei-Pang Yang, Information Management, NDHU 1-11 Example: A Simple Query Processing Query in SQL: SELECT CUSTOMER. NAME FROM CUSTOMER, INVOICE WHERE REGION = 'N.Y.' AND AMOUNT > 10000 AND CUTOMER.C#=INVOICE.C DBMS Language Processor Internal Form : ( (S SP) Optimizer Operator : SCAN C using region index, create C SCAN I using amount index, create I SORT C?and I?on C# JOIN C?and I?on C# EXTRACT name field Language Processor Operator Processor Calls to Access Method: OPEN SCAN on C with region index GET next tuple . . . Calls to file system: GET10th to 25th bytes from block #6 of file #5 Access Method e.g.B-tree; Index; Hashing Access Method File System 1-12 Wei-Pang Yang, Information Management, NDHU database 1.2 An Overview of a Database System 1-13 An Example The Wine Cellar Database: Cellar: Bin 2 3 6 12 16 43 50 51 Wine Chardonnay Chardonnay Chardonnay Jo. Riesling Jo. Riesling Cab. Sauvignon Pinot Noir Pinot Noir Producer Year Bottle Ready Buena Vista 83 1 85 Louis Martini 81 5 84 Chappellet 82 4 85 Buena Vista 82 1 83 Sattui 82 1 83 Robt. Mondavi 77 12 87 Mirassou 77 3 85 Ch. St. Jean 78 2 86 Comments Thanksgiving Late Harvest Very dry Harvest Retrieval: • DML (Data Manipulation Language): SELECT Wine, Bin, Producer FROM Cellar WHERE Ready = 85; • Result: Wine Bin Chardonnay 2 Chardonnay 6 Pinot Noir 50 Wei-Pang Yang, Information Management, NDHU Producer Buena Vista Chappellet Mirassou 1-14 An Example (cont.) Deletion: • DML: DELETE FROM Cellar WHERE Ready < 86; • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir Producer Robt. Mondavi Ch. St. Jean Year Bottle Ready Comments 77 78 12 2 87 86 Insertion: • DML: INSERT INTO Cellar VALUES (53, 'Pinot Noir', 'Franciscan', 79, 1, 86, 'for Joan'); • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir 53 Pinot Noir Wei-Pang Yang, Information Management, NDHU Producer Robt. Mondavi Ch. St. Jean Franciscan Year Bottle Ready 77 78 79 12 2 1 87 86 86 Comments for Joan 1-15 An Example (cont.) Update • DML: UPDATE Cellar SET Bottles = 4 WHERE Bin = 51; • Result: Bin Wine 43 Cab. Sauvignon 51 Pinot Noir 53 Pinot Noir Wei-Pang Yang, Information Management, NDHU Producer Robt. Mondavi Ch. St. Jean Franciscan Year 77 78 79 Bottle Ready Comments 12 4 1 87 86 86 for Joan 1-16 What is a Database System? DBMS Application program End-user Wei-Pang Yang, Information Management, NDHU 1-17 What is a Database System? (cont.) Major components of a database system: • • • • Data: integrated and shared. Hardware: disk, CPU, Main Memory, ... Software: DBMS Users: 1. Application programmers 2. End users 3. Database administrator (DBA) • Defining external schema • Defining conceptual schema • Defining internal schema • Liaison with users • Defining security and integrity checks • Defining backup and recovery procedures • Monitoring performance and changing requirements Wei-Pang Yang, Information Management, NDHU 1-18 Why Database ? Redundancy can be reduced Inconsistency can be avoided The data can be shared Standards can be enforced Security restrictions can be applied Integrity can be maintained Provision of data independence objective ! Wei-Pang Yang, Information Management, NDHU 1-19 Data Independence Application Program Data Structure Immunity of application to change in storage structure and access strategy. Wei-Pang Yang, Information Management, NDHU 1-20 Data Dependence vs. Data Independence S Data Dependent S# e.g. SELECT CITY S1 S2 : Sn FROM S WHERE ITEM = 'X'; • Linked list: S Top TOP s1 sn s2 if item = TOP . item then ......... • Tree: . if item < root.data then root := root .left .......... • Array: if A[I] = item then ............ • Storage structure changed program changed Wei-Pang Yang, Information Management, NDHU 1-21 1.3 An Architecture for a Database System 1-22 User A1 User A2 User B1 User B2 User B3 Host Language + DSL Host Language + DSL Host Language + DSL Host Language + DSL Host Language + DSL (Data Sub Language) e.g. SQL, QUEL External View @ #A & External schema A External/conceptual mapping A Conceptual schema < External View B External/conceptual mapping B Conceptual @ View#A & Database management system (DBMS) Conceptual/internal mapping DBA (Build and maintain schemas and mappings) External schema B Storage structure definition (Internal schema) Stored database (Internal View) Wei-Pang Yang, Information Management, NDHU # & @ 1-23 An Example of the Three Levels Internal level: STORED_EMP length = 18 PREFIX TYPE = BYTE(6), OFFSET = 0, INDEX = EMPX EMP# TYPE = BYTE(6), OFFSET=0, DEPT# TYPE = BYTE(4), OFFSET = 12 PAY TYPE = FULLWORD, OFFSET = 16 Conceptual level: EMPLOYEE EMPLOYEE_NUMBER DEPARTMENT_NUMBER SALARY CHARACTER(6) CHARACTER(4) NUMERIC(5) External level: • ( PL /I ) DCL 1 EMP 2 EMP# 2 SAL CHAR(6) FIXED BIN(31) • (COBOL) 01 EMPC 02 EMPNO PIC X(4) 02 DEPTNO PIC X(4) Wei-Pang Yang, Information Management, NDHU 1-24 Functions of the DBMS Data Definition Language (DDL) Data Manipulation Language (DML) Data Security and Integrity Data Recovery and Concurrency Data Dictionary Performance Wei-Pang Yang, Information Management, NDHU 1-25 1.4 Database Technology Trends 1960s to Mid-1970s 1970s to Mid-1980s Late 1980s Future Merging data models, knowledge representation, and programming languages Network Hierarchical Relational Semantic Object-oriented Logic Database Hardware Mainframes Mainframes Minis PCs Faster PCs Workstations Database machines User Interface None Forms Query languages - SQL, QUEL Graphics Menus Query-by-forms Natural language Speech input Program Interface Procedural 4GL Logic programming Integrated database and programming language Presentation and display processing Reports Processing data Business graphics Image output Knowledge processing Generalized display managers Distributed knowledge processing Data Model Wei-Pang Yang, Information Management, NDHU Embedded query language Report generators Information and transaction processing Parallel processing Optical memories 1-26 Distributed Databases Database Distributed System Distributed database is a database that is not stored in its entirety at a single physical location, but rather is spread across a network of computer. < e.g.> Hsinchu computers Taipei Kaoshiung communication links Taichung Wei-Pang Yang, Information Management, NDHU 1-27 Distributed Databases (cont.) Advantages: efficiency of local processing data sharing Disadvantages: communication overhead implementation difficulties Reference: S. Ceri and G. Pelagatti "Distributed Databases: principles and systems" Wei-Pang Yang, Information Management, NDHU 1-28 Multi-Database/Heterogeneous Database Multidatabase IMS Hierarchical Model • semantic inconsistency INGRES Relational Model .. . ORION ObjectOriented Model • data incompleteness • global schema Wei-Pang Yang, Information Management, NDHU 1-29 DB + AI Query Database AI DBMS Language Processor Query Optimizer Knowledge Base Operator Processor Access Method File Manager Distributed DB design Knowledge Base Wei-Pang Yang, Information Management, NDHU Logical DB design Knowledge Base 1-30 KBMS Database Logic A Combined Model : Logic Programming + Relational DB Three layers : User Program Knowledge management program IDB relational interface Relational DB management program EDB Wei-Pang Yang, Information Management, NDHU Query : ? :- ancestor (taro, Y) ? :- grandfather (?, c) IDB: ancestor(X,Y):- parent(X,Y) ancestor(X,Y) :- parent(X,Z), ancestor(Z,Y) parent(X,Y):-edb(father(X,Y)) parent(X,Y):-edb(mother(X,Y)) grandfather(X,Z):- father(X,Y) ^ father(Y,Z) EDB: fathe r father son A B X Y . . . B mother . C 1-31 Database Object-Oriented OODB A typical Document : MEMO [Woelk86, SIGMOD] MCC To: W. Kim From: D. Woolk Date: September 18, 1992 Subject: Workstations can be heard In the computer center of National Chiao-Tung University, there are a lot of workstations. There are HP RS serials, SUNs, Apollo, and so on. The students in NCTU learn to use workstation since they are freshmen. The configuration of the workstations follows: Workstation . speaker voice message associated . . Workstation } text } } graphics Database Server In the course introduction to Computer Science? students do their homework's on workstations. image Wei-Pang Yang, Information Management, NDHU 1-32 Use of a Database Management System in Design and Application Design Release Control Detailed Design Database Manufacturing Fabrication Assembly Info Analysis Models Preliminarydesign INTERFACE DBMS Database Management System Graphic Interface Test / Inspection Language Interface APPLICATION Design Analysis Design Verification Manufacturing Planning Wei-Pang Yang, Information Management, NDHU Evaluation Production Control Synthesis Release Project Management 1-33 Fuzzy Database Database Fuzzy Set Theory Fuzzy Query <e.g.> SELECT FROM WHERE STUDENT.NAME STUDENT SEX = M AND HEIGH = TALLER AND WEIGH = SLIMMER STUDENT: NAME SEX HEIGHT WEIGHT IQ Mary Linda . . . F F . . . <e.g.> Wei-Pang Yang, Information Management, NDHU 158 165 . . . SELECT FROM WHERE 55 55 . . . High Medium . . . STUDENT.NAME STUDENT IQ >= 130 1-34 More? AI DB 1993 DB ? ? 2001 VR DB ? 1995 DB 1997 1999 ? 2004 ??? SARS ? DB Mobile/video ? DNA/BioInfo . DB WWW DB Wei-Pang Yang, Information Management, NDHU DB 2003 ? 911/ Anthrax 2006 ??? ? 1-35