A Metadata Architecture For Enterprise-Wide Data Sharing

Department of Defense (DoD) Data Interoperability Challenge
Same Data Requirements, Different Functional Needs; Same Descriptions, Different Names
[Figure: functional areas shown: Logistics, Components/Services, Transportation, Procurement, Personnel, Command & Control, Finance, Medical]

DoD TARGET DATA SHARING ENVIRONMENT: A Logistics Perspective
[Figure: map of the automated information systems (AIS) and servers that must share data, spanning the Services (USA, USN, USAF, USMC, USCG), DoD joint applications and agencies (DLA, DISA, TRANSCOM, GSA), commercial links, classified and unclassified command servers (PACOM, USFK, CENTCOM, JFCOM, EUCOM), and coalition partners. Legend: Essential AIS, Contributing AIS.]
GCCS = Global Command & Control System
GCSS = Global Combat Support System

The DDDS: The Current DoD Repository of DoD Standardized Data Elements
METADATA REPOSITORY: Defense Data Dictionary System (DoD Standardized Data Elements {SDE}), 17000+ SDEs
Intended to be the DoD Repository of Data Elements to Support DoD Enterprise-wide Interoperable Data Sharing

PART I: The Problem

Current Problems
1. Incorrect data architecture abstraction level for representing Enterprise Level Data Elements for interoperable data sharing
2. Numerous redundant representations of Standardized Data Elements (SDEs) (DIFFERENT NAMES, SAME DEFINITION)
3. Incomplete, non-existent, and/or non-current SDE metadata
4. Inadequate categories of SDE metadata
5. Inadequate support / enforcement of data administration processes for data management

A Data Element Name and Data Element Definition Refresher: Two Data Element Metadata Attributes
• Data Element Name is a label given to establish Data Element identity
• Data Element Definition is a description providing complete, unambiguous meaning represented by a Data Element
• Name and Definition together provide the semantic context for the data item values represented by a Data Element
• Some key concepts/principles/facts about Data Element Naming and Definition:
o NAME and DEFINITION are inseparable
o NAME is a unique identifier for DEFINITION
o A NAME can be viewed as a kind of very short DEFINITION
o In order of precedence, DEFINITION creation should always precede NAME creation
o Impossible to correctly NAME a data element with precision without a DEFINITION
o NAME and DEFINITION are at the core of the data integration / sharing process
o Many data sharing issues arise from “bad” data element NAMING and DEFINITION practices

A Proposed Metadata Architecture for Shared Enterprise Data Elements
• Focuses on solutions for:
o Problem 1: Incorrect data architecture abstraction level
o Problem 2: Differently named data elements for the same data element concept
o Problem 4: Inadequate categories of standardized data element metadata
• Not a solution for every kind of impediment to interoperable data sharing
• Problems 3 and 5 require quality improvements in process execution and in data management governance and will be addressed in Part II.
Will begin by looking more closely at the fundamental metadata architecture levels...

Design Layers of a Business Information System Database Architecture
• Specified Context Data Model Layer
Roughly analogous to high level conceptual Entity-Relationship (E-R) data models of functional area/business domain, or Community of Interest (COI) data depicting the structure and relationships of entities and their attributes.
• Implemented Technology Data Model Layer
Database schemas represented in a particular technology (SQL, COBOL, etc.) based on fully attributed 3rd normal form logical data models
• Operational (Vendor) DBMS Data Model Layer
Roughly analogous to a physical data model representing a particular vendor’s version of a technology based schema, i.e., Oracle SQL DBMS vs IBM DB2 SQL DBMS vs Sybase SQL DBMS, etc.
• Business Application View Data Model Layer
Represents the application system access interface to a DBMS which preserves the separation and integrity of the database data from systems that operate on and manipulate the data

Design Layers for a Business Information System Database Architecture
Specified Context (Community of Interest) Data Model Layer
[Figure: Layer 1, the “SPECIFIED” DATA MODEL LAYER (DoD FDAd Domain): functional area dependent data model templates for “PERSONNEL”, “LOGISTICS”, and “FINANCIAL”, each structured as SUBJECT, ENTITY, ATTRIBUTE]
• May be represented in one, a combination of, or all of the following views:
o Entity w/attributes; no key designations; un-normalized; unresolved many-to-many relationships
o Key based entities; un-normalized; unresolved many-to-many relationships
o Fully attributed; un-normalized; resolved or unresolved many-to-many relationships
o Fully attributed; 3rd normal form; developed sub-typing; resolved many-to-manys
• Current DDDS / DDA functional / subject area domains map to domains represented by designated DoD Functional Data Administrators (FDAds):
o Logistics: DUSD(L)
o Personnel: USD(P&R)
o Comptroller: USD(C)
o Health Affairs: ASD(HA)
o Etc.
• Each DoD “Subject Area” DAd should have a “specified” data model that represents the data element structures and relationships of functional area data element requirements for their respective functional area or community of interest.

Design Layers for a Business Information System Database Architecture
Implemented Technology Data Model Layer
[Figure: Layer 1, the “SPECIFIED” DATA MODEL LAYER (DoD COI DAd Domain): functional area / subject matter dependent data model templates (SUBJECT, ENTITY, ATTRIBUTE). Layer 2, the “IMPLEMENTED” DATA MODEL LAYER (Data Architect / Modelers Domain): technology dependent (e.g., IDMS, COBOL, SQL), fully attributed, logical data models, each structured as SCHEMA, TABLE, COLUMN]
• 3rd normal form ERD logical data model
• Represented in a technology dependent data architecture schema
• Technology driven / constrained data element naming
• Subject area entity and attribute templates are deployed into schema tables and columns that must conform to a particular chosen technology such as COBOL or SQL.
• Layer 1 attribute metadata is inherited by Layer 2 columns
• The relationship between Layer 1 and Layer 2 is one-to-many. That is to say that any attribute from Layer 1 may be deployed as a column into many Layer 2 schemas.
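
The phrase “technology driven / constrained data element naming” can be illustrated with a small sketch. The Python below is only an assumption about what such constraints might look like: the two conversion rules, the function names `to_sql_column` and `to_cobol_data_name`, and the `max_length` default are hypothetical and are not taken from any DoD naming standard.

```python
import re


def to_sql_column(attribute_name: str) -> str:
    """Derive an unquoted SQL-style identifier: lower case words joined by underscores."""
    words = re.findall(r"[A-Za-z0-9]+", attribute_name)
    return "_".join(word.lower() for word in words)


def to_cobol_data_name(attribute_name: str, max_length: int = 30) -> str:
    """Derive a COBOL-style data name: upper case words joined by hyphens.

    max_length is an assumed limit for this sketch; check the target compiler.
    """
    words = re.findall(r"[A-Za-z0-9]+", attribute_name)
    return "-".join(word.upper() for word in words)[:max_length].rstrip("-")


attribute = "Person Given Name"  # a Layer 1 attribute name, used for illustration
print(to_sql_column(attribute))       # person_given_name
print(to_cobol_data_name(attribute))  # PERSON-GIVEN-NAME
```

The point of the sketch is that one Layer 1 attribute yields a different Layer 2 name in each technology, so the link back to the attribute has to be recorded as metadata rather than inferred from the name.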

Design Layers for a Business Information System Database Architecture
Operational Vendor DBMS Data Model Layer
[Figure: Layer 1, the “SPECIFIED” DATA MODEL LAYER (DoD COI DAd Domain): functional area / subject matter dependent data model templates (SUBJECT, ENTITY, ATTRIBUTE). Layer 2, the “IMPLEMENTED” DATA MODEL LAYER (Data Architect / Modelers Domain): technology dependent (SQL, COBOL, etc.), fully attributed, logical data model (SCHEMA, TABLE, COLUMN). Layer 3, the “OPERATIONAL” DATA MODEL LAYER (Domain of Database Administrators (DBAs)): DBMS dependent (e.g., Sybase, DB2, Oracle) DBMS data models, each structured as DBMS SCHEMA, DBMS TABLE, DBMS COLUMN]
• Roughly analogous to a physical data model
• Vendor’s versions of a particular technology based schema such as SQL, i.e., Oracle SQL DBMS vs Informix SQL DBMS vs Sybase SQL DBMS, etc.
• Data element naming is bound by the vendor’s implemented DBMS business rules for a particular technology based schema.
• Again, the relationship between Layer 2 and Layer 3 is one-to-many. That is to say that any column from a Layer 2 schema may be deployed as a DBMS column in many Layer 3 DBMSs.

Design Layers for a Business Information System Database Architecture
Business Application View Data Model Layer
[Figure: Layer 1, the “SPECIFIED” DATA MODEL LAYER (DoD COI DAd Domain): SUBJECT, ENTITY, ATTRIBUTE. Layer 2, the “IMPLEMENTED” DATA MODEL LAYER (Data Architect / Modelers Domain): SCHEMA, TABLE, COLUMN. Layer 3, the “OPERATIONAL” DATA MODEL LAYER (Domain of Database Administrators (DBAs)): DBMS SCHEMA, DBMS TABLE, DBMS COLUMN. Layer 4, the BUSINESS APPLICATION SYSTEM “VIEW” DATA MODEL (Domain of Application System Managers (SMs and/or PMs)): view data models for Command & Control, a Personnel App, and a Logistics App, each structured as BUSINESS INFORMATION SYSTEM, APPLICATION VIEW TABLE, APPLICATION VIEW COLUMN]
• Data element naming in conformance with functional area common business language terms
• Finally, with respect to a single DBMS, the relationship between Layers 3 and 4 is also one-to-many from 3 to 4. That is, a DBMS column may be deployed as view columns in many applications that may interface with a particular DBMS.
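
The chain of one-to-many relationships across the four layers, together with the inheritance of Layer 1 metadata, can be sketched as a simple parent-linked structure. This is a minimal illustration in Python, not a description of how the DDDS is built: the class `LayerElement`, the function `inherited_definition`, and every element name and definition in the example are assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LayerElement:
    """One named element at a design layer (1 = specified ... 4 = view)."""
    layer: int
    name: str
    definition: Optional[str] = None          # set only where the metadata originates
    parent: Optional["LayerElement"] = None   # element this one was deployed from
    children: List["LayerElement"] = field(default_factory=list)

    def deploy(self, name: str) -> "LayerElement":
        """Deploy this element into the next layer down (one-to-many)."""
        child = LayerElement(layer=self.layer + 1, name=name, parent=self)
        self.children.append(child)
        return child


def inherited_definition(element: LayerElement) -> Optional[str]:
    """Walk up the parent chain until a definition is found."""
    node: Optional[LayerElement] = element
    while node is not None:
        if node.definition is not None:
            return node.definition
        node = node.parent
    return None


# Illustrative names and definition only.
attribute = LayerElement(1, "Employee Given Name",
                         definition="The given name of an employee.")
sql_column = attribute.deploy("Employee First Name")  # Layer 2, SQL schema
cobol_field = attribute.deploy("EMP-FIRST-NAME")      # Layer 2, COBOL schema
dbms_column = sql_column.deploy("Emp_Gv_Nam")         # Layer 3, vendor DBMS
view_column = dbms_column.deploy("Given Name")        # Layer 4, application view

print(len(attribute.children))             # 2
print(inherited_definition(view_column))   # The given name of an employee.
```

Storing the definition once at the top and resolving it by walking upward is one way to express “define once, deploy many times”; a repository could equally copy the metadata down at deployment time.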

The Problem: Sourcing Enterprise “Context Independent” Data Element Standards from Enterprise “Context Dependent” Data and Information Systems and Databases
Each standalone database represents the same Enterprise Common Data Element Concept “A”:
• Standalone Database #1
o Layer 1: Person Given Name
o Layer 2: Salesman First Name
o Layer 3: EFN
o Layer 4: Name
• Standalone Database #2
o Layer 1: Sailor Given Name
o Layer 2: Sailor First Name
o Layer 3: Sail_Frst_Nm
o Layer 4: First Name
• Standalone Database #3
o Layer 1: Legal Given Name
o Layer 2: Authoritative First Name
o Layer 3: Lg_Auth_Frst_Nm
o Layer 4: Legal First Name
• Standalone Database #4
o Layer 1: Employee Given Name
o Layer 2: Employee First Name
o Layer 3: Emp_Gv_Nam
o Layer 4: Given Name
Enterprise Registry of Standardized Shared Data Elements: The DDDS
(Enterprise Registry of Standardized Data Elements (SDE) To Represent Common Enterprise Data Element Concepts)
Enterprise Data Element Concept “A” is registered as:
o SDE “A”: Person Given Name
o SDE “A”: Employee First Name
o SDE “A”: Sail_Frst_Nm
o SDE “A”: Legal First Name
Effective Result: Four differently named versions, or representations, of the Enterprise common Data Element Concept “A” will exist in the Registry as Standardized Data Elements. Redundancy and ambiguity are the consequence.

Design Layers for a Business Information System Database Architecture
The DDDS Repository
Intended to represent DoD globally shared enterprise standard data elements. Thus, the repository should contain only one named data element standard for each unique enterprise level data element concept.
Database System Architecture Design Steps
• Specified Context Data Model Layer
• Implemented Technology Data Model Layer
• Operational Vendor DBMS Data Model Layer
• Business Application View Data Model Layer
[Figure: Layers 1 to 4 (“SPECIFIED”, “IMPLEMENTED”, “OPERATIONAL”, and BUSINESS APPLICATION SYSTEM “VIEW” data model layers) shown beside the METADATA REPOSITORY: Defense Data Dictionary System (DoD Standardized Data Elements), 17000+ SDEs]

Design Layers for a Business Information System Database Architecture
The DDDS Repository
Intended to represent DoD globally shared enterprise standard data elements. Thus, the repository should contain only one named data element standard for each unique enterprise level data element concept. In reality, the repository contains many cases of differently named data elements that represent the same data element concept. The result is uncontrolled redundancy and ambiguity incapable of supporting seamless and interoperable data sharing.
Database System Architecture Design Steps
• Specified Context Data Model Layer
• Implemented Technology Data Model Layer
• Operational Vendor DBMS Data Model Layer
• Application View Data Model Layer
[Figure: Layers 1 to 4, each marked “A source for DDDS SDEs”, feeding the METADATA REPOSITORY: Defense Data Dictionary System, “One GIGANTIC semantic mess”, 17000+ SDEs]

...But, Where’s the Beef?
The DDDS “Big Bun” ... the Enterprise Data Element “Beef”?
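
The redundancy in the four-database example can be made concrete with a short sketch. The Python below is illustrative: the record layout (concept, registered name, source layer) and the function `find_redundant_concepts` are assumptions for this example, not part of the DDDS. The four names are the ones registered for concept “A” in the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (enterprise data element concept, registered SDE name, source layer)
# The tuple layout is an assumption made for this sketch.
REGISTRY: List[Tuple[str, str, int]] = [
    ("A", "Person Given Name", 1),
    ("A", "Employee First Name", 2),
    ("A", "Sail_Frst_Nm", 3),
    ("A", "Legal First Name", 4),
]


def find_redundant_concepts(registry: List[Tuple[str, str, int]]) -> Dict[str, List[str]]:
    """Return the concepts that are registered under more than one distinct name."""
    names_by_concept: Dict[str, List[str]] = defaultdict(list)
    for concept, name, _layer in registry:
        if name not in names_by_concept[concept]:
            names_by_concept[concept].append(name)
    return {c: names for c, names in names_by_concept.items() if len(names) > 1}


redundant = find_redundant_concepts(REGISTRY)
for concept, names in redundant.items():
    print(f"Concept {concept!r} has {len(names)} registered names: {names}")
```

The sketch assumes each entry already carries a concept key. In an existing registry that key would not be given, and entries would have to be matched on their definitions, which is one reason definition quality matters so much.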
17 Design Layers for a Business Information System Data Architecture CONCEPT STRUCTURE TYPE Layer 0 DATA ELEMENT CONCEPT STRUCTURE TYPE CONCEPT STRUCTURE CONCEPT DATA ELEMENT CONCEPT STRUCTURE CONCEPTUAL VALUE DOMAIN DATA ELEMENT CONCEPT Functionally Independent Business Fact Semantic Templates (Globally Shared Data Elements) (Domain of DoD Data Administration) DATA ELEMENT CONCEPTUAL VALUE DOMAIN STRUCTURE VALUE DOMAIN VALUE DOMAIN STRUCTURE CONCEPTUAL VALUE DOMAIN STRUCTURE TYPE VALUE DOMAIN STRUCTURE TYPE ISO 11179 BUSINESS CONTEXT INDEPENDENT DATA ELEMENT REPRESENTATION The Enterprise Data Element Layer • ISO 11179 Naming and Definition • Context Independent Data Elements o Uniform Naming o Uniform Semantics o Uniform Value Domains 18 Design Layers for a Business Information System Data Architecture Layer 0 CONCEPT STRUCTURE TYPE CONCEPT STRUCTURE DATA ELEMENT CONCEPT STRUCTURE TYPE CONCEPTUAL VALUE DOMAIN CONCEPT DATA ELEMENT CONCEPT STRUCTURE VALUE DOMAIN STRUCTURE TYPE ISO 11179 BUSINESS CONTEXT INDEPENDENT DATA ELEMENT REPRESENTATION DATA ELEMENT Layer 1 FUNCTIONAL AREA / SUBJECT MATTER DEPENDENT DATA MODEL TEMPLATES SUBJECT ENTITY ATTRIBUTE “SPECIFIED” DATA MODEL LAYER (DoD FDAd Domain) DDDS Source TECHNOLGY DEPENDENT (SQL, COBOL, ETC), FULLY ATTRIBUTED, LOGICAL DATA MODEL Layer 2 SCHEMA TABLE DBMS DEPENDENT (Oracle, DB2, Sybase, etc) DBMS DATA MODEL “OPERATIONAL” DATA MODEL LAYER (Domain of Database Administrators (DBAs)) BUSINESS APPLICATION SYSTEM “VIEW” DATA MODEL LAYER DDDS Source DDDS Source DBMS SCHEMA The DDDS: COLUMN “IMPLEMENTED” DATA MODEL LAYER (Data Architects / Modelers Domain) (Domain of Application System Managers (SMs and/or PMs) CONCEPTUAL VALUE DOMAIN STRUCTURE TYPE VALUE DOMAIN STRUCTURE VALUE DOMAIN DATA ELEMENT CONCEPT Functionally Independent Business Fact Semantic Templates (Globally Shared Data Elements) (Domain of DoD Data Administration) CONCEPTUAL VALUE DOMAIN STRUCTURE DBMS TABLE One gigantic semantic messredundancies & 
ambiguities DBMS COLUMN 17000+ SDEs Layer 3 DDDS Source BUSINESS INFORMATION SYSTEM APPLICATION VIEW TABLE APPLICATION VIEW COLUMN Layer 4 19 META MODEL ARCHITECTURE SUPPORTING ENTERPRISE WIDE SHARED DATA CONCEPT STRUCTURE TYPE CONCEPT STRUCTURE DATA ELEMENT CONCEPT STRUCTURE TYPE CONCEPTUAL VALUE DOMAIN CONCEPT DATA ELEMENT CONCEPT DATA ELEMENT CONCEPT STRUCTURE Functionally Independent Business Fact Semantic Templates (Globally Shared Data Elements) (Domain of DoD Data Administration) FUNCTIONALLY DEPENDENT & TECHNOLOGY INDEPENDENT DATA MODEL TEMPLATES ATTRIBUTE INHERITS DATA ELEMENT “SPECIFIED” DATA MODEL (DoD FDAd Domain) ENTITY SCHEMA CONCEPTUAL VALUE DOMAIN STRUCTURE TYPE VALUE DOMAIN STRUCTURE VALUE DOMAIN APPLICATION VIEWS OF DBMS TABLES & COLUMNS “VIEW” DATA MODEL (Domain of Application System Managers (SMs and/or PMs) ATTRIBUTE TABLE COLUMN (Data Architects / Modelers Domain) DBMS DEPENDENT & APPLICATION VIEW INDEPENDENT DBMS COLUMN (Oracle, DB2, etc) INHERITS COLUMN “OPERATIONAL” DATA MODEL VALUE DOMAIN STRUCTURE TYPE ISO 11179 BUSINESS CONTEXT INDEPENDENT DATA ELEMENT REPRESENTATION DATA ELEMENT SUBJECT TECHNOLGY DEPENDENT & DBMS INDEPENDENT MODEL / SCHEMA COLUMN INHERITS ATTRIBUTE “IMPLEMENTED” DATA MODEL) CONCEPTUAL VALUE DOMAIN STRUCTURE BUSINESS INFORMATION APPLICATION SYSTEM VIEW DBMS SCHEMA DBMS TABLE DBMS COLUMN VIEW COLUMN STRUCTURE TYPE (Domain of Database Administrators (DBAs)) METADATA REPOSITORY ISO 11179 Specified Model Implemented Model Operational DBMS Application View Data sharing occurs at the “operational and application” view layers. Made possible through the relationships between all layers represented by metadata in a repository that enables relating syntax, structure, and semantics from any layer to a common ISO 11179 standard representation. 
VIEW COLUMN VIEW COLUMN STRUCTURE VIEW COLUMN STRUCTURE PROCESS 20 Design Layers for a Business Information System Data Architecture Layer 0 CONCEPT STRUCTURE TYPE CONCEPT STRUCTURE DATA ELEMENT CONCEPT STRUCTURE TYPE CONCEPTUAL VALUE DOMAIN CONCEPT DATA ELEMENT CONCEPT STRUCTURE ENTITY VALUE DOMAIN STRUCTURE TYPE ISO 11179 BUSINESS CONTEXT INDEPENDENT DATA ELEMENT REPRESENTATION DATA ELEMENT FUNCTIONAL AREA / SUBJECT MATTER DEPENDENT DATA MODEL TEMPLATES SUBJECT CONCEPTUAL VALUE DOMAIN STRUCTURE TYPE VALUE DOMAIN STRUCTURE VALUE DOMAIN DATA ELEMENT CONCEPT Functionally Independent Business Fact Semantic Templates (Globally Shared Data Elements) (Domain of DoD Data Administration) CONCEPTUAL VALUE DOMAIN STRUCTURE Layer 1 ATTRIBUTE “SPECIFIED” DATA MODEL LAYER (DoD FDAd Domain) TECHNOLGY DEPENDENT (SQL, COBOL, ETC), FULLY ATTRIBUTED, LOGICAL DATA MODEL DoD CORE DATA ELEMENT METADATA REPOSITORY ISO 11179 Model Layer Layer 2 SCHEMA TABLE COLUMN “IMPLEMENTED” DATA MODEL LAYER (Data Modelers Domain) Specified Model Layer Implemented Model Layer DBMS DEPENDENT (Oracle, DB2, Sybase, etc) DBMS DATA MODEL “OPERATIONAL” DATA MODEL LAYER (Domain of Database Administrators (DBAs)) BUSINESS APPLICATION SYSTEM “VIEW” DATA MODEL LAYER (Domain of Application System Managers (SMs and/or PMs) Operational DBMS Layer DBMS SCHEMA BUSINESS INFORMATION SYSTEM DBMS TABLE APPLICATION VIEW TABLE DBMS COLUMN APPLICATION VIEW COLUMN Layer 3 Application View Layer Layer 4 21 META MODEL ARCHITECTURE SUPPORTING ENTERPRISE WIDE SHARED DATA CONCEPT STRUCTURE TYPE CONCEPT STRUCTURE DATA ELEMENT CONCEPT STRUCTURE TYPE CONCEPTUAL VALUE DOMAIN CONCEPT DATA ELEMENT CONCEPT DATA ELEMENT CONCEPT STRUCTURE Functionally Independent Business Fact Semantic Templates (Globally Shared Data Elements) (Domain of DoD Data Administration) FUNCTIONALLY DEPENDENT & TECHNOLOGY INDEPENDENT DATA MODEL TEMPLATES ATTRIBUTE INHERITS DATA ELEMENT “SPECIFIED” DATA MODEL (DoD FDAd Domain) TECHNOLGY DEPENDENT & DBMS 
INDEPENDENT MODEL / SCHEMA COLUMN INHERITS ATTRIBUTE “IMPLEMENTED” DATA MODEL) SCHEMA CONCEPTUAL VALUE DOMAIN STRUCTURE TYPE VALUE DOMAIN STRUCTURE VALUE DOMAIN APPLICATION VIEWS OF DBMS TABLES & COLUMNS “VIEW” DATA MODEL (Domain of Application System Managers (SMs and/or PMs) ATTRIBUTE TABLE VALUE DOMAIN STRUCTURE TYPE ISO 11179 BUSINESS CONTEXT INDEPENDENT DATA ELEMENT REPRESENTATION DATA ELEMENT ENTITY SUBJECT CONCEPTUAL VALUE DOMAIN STRUCTURE COLUMN BUSINESS INFORMATION APPLICATION SYSTEM (Data Architects / Modelers Domain) DBMS DEPENDENT & APPLICATION VIEW INDEPENDENT DBMS COLUMN (Oracle, DB2, etc) INHERITS COLUMN “OPERATIONAL” DATA MODEL VIEW DBMS SCHEMA DBMS TABLE DBMS COLUMN VIEW COLUMN STRUCTURE TYPE (Domain of Database Administrators (DBAs)) METADATA REPOSITORY ISO 11179 Specified Model Implemented Model Operational DBMS Application View Data sharing occurs at the “operational and application” view layers. Made possible through the relationships between all layers represented by metadata in a repository that enables relating syntax, structure, and semantics from any layer to a common ISO 11179 standard representation. 
VIEW COLUMN VIEW COLUMN STRUCTURE VIEW COLUMN STRUCTURE PROCESS 22 An Optimal Application of an ISO 11179 Based Data Element Architecture for Resolving Disparate Representations of Shared Enterprise Data Elements Metadata Repository Architecture of Related Representations of DoD Enterprise Shared Data Elements in Support of Data and Information Sharing ISO 11179 Context Inde pendent Data Element Representation Meta Model Concepts Materiel Resource Data Element Concept Conceptual Value Domain Business Fact Semantic Template Name Physical Item Balance Data Element Supply Item Resource Quantity Physical Measure Quantity Value Domain Functional/Organizational Context Dependent “Specified” Model Army Logistics Management Technology Dependent “Implemented” Model ANSI SQL Data type characteristics, local definition, enumerated values ( if specific ), etc. Supply Item Resource Quantity Supply Item Resource Quantity Supply Item Resource Quantity Attribute Names SQL Column Names DBMS Column Names View Column Names Supply Item Resource Quantity Navy Logistics Management Supply Item Resource Quantity ANSI SQL Additional Data Element Structural Metadata: “Oracle” DBMS Supply Item Resource Quantity Data Element Definition: The quantity of each type of Federal Supply System materiel item contained in an identifiable inventory of materiel objects. 
Business Application Vendor Dependent Information System (AIS) “View” Model SQL DBMS “Operational” Model Army SAMS (AIS) METADATA REPOSITORY Defense Data Dictionary System (DoD Standardized Data Elements) 17000+ SDEs Supply Item Resource Quantity “Sybase” DBMS Supply Item Resource Quantity Navy UADPS (AIS) 23 The “Optimal” Solution 24 Example Logistics Application of an ISO 11179 Based Data Element Architecture for Relating Disparate Representations of Shared Enterprise Data Elements Metadata Repository Architecture of Related Representations of DoD Enterprise Shared Data Elements in Support of Data and Information Sharing ISO 11179 Context Inde pendent Data Element Representation Meta Model Concepts Materiel Resource Data Element Concept Conceptual Value Domain Business Fact Semantic Template Name Physical Item Balance Data Element Supply Item Resource Quantity Physical Measure Quantity Value Domain Functional/Organizational Context Dependent “Specified” Model Army Logistics Management Technology Dependent “Implemented” Model ANSI SQL Data type characteristics, local definition, enumerated values ( if specific ), etc. Materiel Unit Inventory Quantity Supply Unit Quantity Mat_Inv_Qty Attribute Names SQL Column Names DBMS Column Names View Column Names Materiel Item Inventory Quantity Navy Logistics Management Mat_Itm_Inv_Qt ANSI SQL Additional Data Element Structural Metadata: “Oracle” DBMS Materiel Inventory Quantity Data Element Definition: The quantity of each type of Federal Supply System materiel item contained in an identifiable inventory of materiel objects. 
Business Application Vendor Dependent Information System (AIS) “View” Model SQL DBMS “Operational” Model Army SAMS (AIS) CORE METADATA REPOSITORY ISO 11179 Model Stocked Materiel Quantity “Sybase” DBMS Ships Stores Quantity Navy UADPS (AIS) Specified Model Implemented Model Operational DBMS Application View 25 Example Personnel Application of an ISO 11179 Based Data Element Architecture for Relating Disparate Representations of Shared Enterprise Data Elements Metadata Repository Architecture of Related Representations of DoD Enterprise Shared Data Elements in Support of Data and Information Sharing ISO 11179 Context Inde pendent Data Element Representation Meta Model Concepts Human Resource Conceptual Value Domain Data Element Concept Business Fact Semantic Template Name Personnel Classification Functional/Organizational Context Dependent “Specified” Model Army Personnel Management Technology Dependent “Implemented” Model Business Application Vendor Dependent Information System (AIS) “View” Model SQL DBMS “Operational” Model Army (AIS) “Oracle” DBMS Unit Member Rank Code Squad Member Rank Code ANSI SQL Sold_Rnk_Cd Soldier Rank Code Data Element Personnel Ranking Measure Person Grade Code Grade Code Value Domain Attribute Names SQL Column Names Navy Personnel Management Sail_Rat_Cde ANSI SQL Additional Data Element Structural Metadata: Data type characteristics, etc. View Column Names Sailor Rating Code Data Element Definition: The code that represents the level of authority and responsibility occupied by Person in a hierarchy of levels ranging from most superior to most subordinate in which each level is subordinate to levels above and superior to levels below. 
DBMS Column Names CORE METADATA REPOSITORY ISO 11179 Model Crew Member Rating Code “Sybase” DBMS Launch Team Member Rating Code Navy (AIS) Specified Model Implemented Model Operational DBMS Application View 26 Part II: Implementing the Metadata Architecture For Enterprise-wide Data Sharing in a Legacy System Environment 27 Table of Contents 1. A Realistic and Practical Approach to Achieve the Ideal Solution 2. How Do We Get to the Ideal? 3. Find Our Metadata 4. Perform Smart Meta-Data Mining 5. Find the Right Starting Layer 6. Reverse Engineer to Build the Upper Layers 7. Overall Forward Engineering Process 8. Process Statistics 9. Lessons Learned 28 1.0 META MODEL ARCHITECTURE SUPPORTING ENTERPRISE WIDE SHARED DATA CONCEPT STRUCTURE TYPE CONCEPT STRUCTURE DATA ELEMENT CONCEPT STRUCTURE TYPE CONCEPTUAL VALUE DOMAIN CONCEPT DATA ELEMENT CONCEPT DATA ELEMENT CONCEPT STRUCTURE Functionally Independent Business Fact Semantic Templates (Globally Shared Data Elements) (Domain of DoD Data Administration) FUNCTIONALLY DEPENDENT & TECHNOLOGY INDEPENDENT DATA MODEL TEMPLATES ATTRIBUTE INHERITS DATA ELEMENT “SPECIFIED” DATA MODEL (DoD FDAd Domain) TECHNOLGY DEPENDENT & DBMS INDEPENDENT MODEL / SCHEMA COLUMN INHERITS ATTRIBUTE “IMPLEMENTED” DATA MODEL) SCHEMA CONCEPTUAL VALUE DOMAIN STRUCTURE TYPE VALUE DOMAIN STRUCTURE VALUE DOMAIN APPLICATION VIEWS OF DBMS TABLES & COLUMNS “VIEW” DATA MODEL (Domain of Application System Managers (SMs and/or PMs) ATTRIBUTE TABLE VALUE DOMAIN STRUCTURE TYPE ISO 11179 BUSINESS CONTEXT INDEPENDENT DATA ELEMENT REPRESENTATION DATA ELEMENT ENTITY SUBJECT CONCEPTUAL VALUE DOMAIN STRUCTURE COLUMN BUSINESS INFORMATION APPLICATION SYSTEM (Data Architects / Modelers Domain) DBMS DEPENDENT & APPLICATION VIEW INDEPENDENT DBMS COLUMN (Oracle, DB2, etc) INHERITS COLUMN “OPERATIONAL” DATA MODEL VIEW DBMS SCHEMA DBMS TABLE DBMS COLUMN VIEW COLUMN STRUCTURE TYPE (Domain of Database Administrators (DBAs)) METADATA REPOSITORY ISO 11179 Specified Model 
Implemented Model Operational DBMS Application View Data sharing occurs at the “operational and application” view layers. Made possible through the relationships between all layers represented by metadata in a repository that enables relating syntax, structure, and semantics from any layer to a common ISO 11179 standard representation. VIEW COLUMN VIEW COLUMN STRUCTURE VIEW COLUMN STRUCTURE PROCESS 29 1.1 Example Personnel Application of an ISO 11179 Based Data Element Architecture for Relating Disparate Representations of Shared Enterprise Data Elements Metadata Repository Architecture of Related Representations of DoD Enterprise Shared Data Elements in Support of Data and Information Sharing ISO 11179 Context Inde pendent Data Element Representation Meta Model Concepts Human Resource Conceptual Value Domain Data Element Concept Business Fact Semantic Template Name Personnel Classification Functional/Organizational Context Dependent “Specified” Model Army Personnel Management Technology Dependent “Implemented” Model Business Application Vendor Dependent Information System (AIS) “View” Model SQL DBMS “Operational” Model Army (AIS) “Oracle” DBMS Unit Member Rank Code Squad Member Rank Code ANSI SQL Sold_Rnk_Cd Soldier Rank Code Data Element Personnel Ranking Measure Person Grade Code Grade Code Value Domain Attribute Names SQL Column Names Navy Personnel Management Sail_Rat_Cde ANSI SQL Additional Data Element Structural Metadata: Data type characteristics, etc. View Column Names Sailor Rating Code Data Element Definition: The code that represents the level of authority and responsibility occupied by Person in a hierarchy of levels ranging from most superior to most subordinate in which each level is subordinate to levels above and superior to levels below. DBMS Column Names CORE METADATA REPOSITORY ISO 11179 Model Crew Member Rating Code “Sybase” DBMS Launch Team Member Rating Code Navy (AIS) Specified Model Implemented Model Operational DBMS Application View 30 2. 
How Do We Get to the Ideal? (or the least un-ideal)
• Find our metadata
• Perform smart metadata mining
• Pick the right starting layer
• Reverse engineer to build the upper layers
• Forward engineer to build standard-data based applications

3. Find Our Metadata
• Existing schemas within running applications, as that’s the only place where data-truth resides
• Extract Cobol FDs within running applications for the same truth reason
• Finally, research metadata libraries like ERwin models

3.1 Where We Started
• DoD had 493 (ERwin) data models that were developed in the 1990s. There were 5709 tables and 16921 columns in these tables.
• We did not inventory each DoD Agency, but the key investigator (Hank Lavender) is very much aware of what, where, and how much all the schemas overlapped.
• This effort was to “prove the process”. We will soon start real Enterprise-wide data sharing projects.

4. Perform Smart Metadata Mining
• Pick backbone and rib-cage (HR, Finance, Inventory, Customer Management, Sales) applications
• Pick the most commonly used schemas across the enterprise that support the backbone and rib-cage applications
• Pick the subset of schemas that have the most commonly used tables (note: commonly used is different from exactly the same as…)
• Make Where-Used and Frequency-Used Matrices

4.1 Where Used & Frequency Matrices
Basic Types and Populations

                                                ---------- Counts ----------
IDM Data Model   Data Model Description         Tables   Columns   Relationships
C-03             Budgets & Currency                 56       178              53
C3-12            Command & Control                  28       276              65
ES-07            Environmental Hazards              41       464              46
ES-08            Environmental Projects             28       185              40
LG-06            Transportation Operations          19        91              26
LG-23            Materiel Documentation             36       272              51
LG-28            Materiel Characteristics           45       225              48
PR-22            Training & Instruction             20       135              23
PR-31            Person Characteristics             36       118              41
Totals                                             309     1944*             393
*542 Unique Data Element Concepts

4.1 (cont) Subject Areas Use Across IDM Schemas
[Matrix of SDM subject areas (rows) against IDM schemas (columns), with an X where a schema uses a subject area. The cell positions of the X marks are not recoverable from the transcript.]
IDM schemas: C-03, C3-12, ES-07, ES-08, LG-06, LG-23, LG-28, PR-22, PR-31
SDM subject areas:
• Environmental Management
• Health Management
• Logistics Management
• Logistics Operations
• Logistics Planning
• Materiel Maintenance
• Materiel Management
• Transportation Operations
• Property Management
• Personnel Management
• Management Administration

4.1 (cont) Where Used & Frequency Matrices
Frequency of Use Matrix
[Matrix of SDM entities (rows) against IDM schemas and tables (columns C-03, C3-12, ES-07, ES-08, LG-06, LG-23, LG-28, PR-22, PR-31), with an X where a schema uses the entity. The cell positions of the X marks are not recoverable from the transcript.]

SDM Subject Area            Entity
Logistics Management        Organization
Personnel Management        Person
Logistics Operations        Country
Logistics Operations        Location
Management Administration   Task
Property Management         Facility
Logistics Planning          Plan
Environmental Management    Guidance
Property Management         Geolocation

5. Find the Right Starting Layer
Data Elements
  Description: Context independent business fact semantic templates
  Start here? NO, these are not database models and have no context
Specified Data Model
  Description: Technology independent data model templates
  Start here? NO, as these are just templates and not database models
Implemented Data Model
  Description: DBMS independent database data models and hosts for database object classes
  Start here? OK, if you have “ERwin”-like data models that can be researched, tabulated, and extracted via Excel or SQL DDL
Operational Data Model
  Description: DBMS dependent and Operating System specific
  Start here? YES! This is the best as it matches the reality of operating databases and applications
View Data Model
  Description: Database application specific SQL views
  Start here? No, as this is too application-use specific, and not data model centric.

6. Reverse Engineer to Build The Upper Layers
• Import to appropriate layer
• Promote to higher data modeling layer
• Re-engineer the Specified Data Model layer
• Analyze to discover the Data Elements
• Build Data Element Model Metadata layer

6.1 Import to Appropriate Layer
IDM: Importing SQL DDL

6.1 Import to Appropriate Layer
IDM: IDM Tables

6.2 Promote to Higher Data Modeling Layer
Promote IDM to SDM

6.2 Key Promotion Issues
Key Difference Between Subject-Entity-Attribute vs Schema-Table-Column

Subject-Entity-Attribute (SDM)
  Model of a subject area. Intellectual boundaries, not data processing boundaries. Not a conceptual version of a logical database. It’s a subject based data model (SDM) template. Define once, use many times, differently in IDM models.
Schema-Table-Column (IDM)
  Model of a database schema that may involve attributes from multiple entities in one table, or attributes of entities across multiple tables. Intended to be implemented within a DBMS as an operational database.
Subject-Schema
  Not related. This would then mean Transformational Relationship.
Entity-Table
  Not related. This would mean Transformational Relationship.
Attribute-Column
  Yes, Related. This allows define once, use many times modeling.
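The Attribute-Column relationship from the Key Promotion Issues slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Metabase tool's actual data structures: the class names, the helper functions, and the sample table and column names (`PERSON`, `TRAINEE`, `PERSON_ID`, `ORG_ID`) are all hypothetical. It shows an SDM attribute defined once and reused by columns in several IDM tables, and an IDM table that draws on attributes from more than one entity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribute:
    """An SDM attribute: defined once, within one entity of one subject."""
    subject: str
    entity: str
    name: str


@dataclass
class Column:
    """An IDM column: lives in one table of one schema, maps to one attribute."""
    schema: str
    table: str
    name: str
    attribute: Attribute


def entities_behind(columns, schema, table):
    """Return the SDM entities whose attributes appear in one IDM table."""
    return sorted({c.attribute.entity for c in columns
                   if c.schema == schema and c.table == table})


def where_used(columns, attribute):
    """Return every (schema, table, column) that reuses one SDM attribute."""
    return [(c.schema, c.table, c.name) for c in columns
            if c.attribute == attribute]


# Hypothetical sample data, for illustration only.
person_id = Attribute("Personnel Management", "Person", "Person Identifier")
org_id = Attribute("Logistics Management", "Organization",
                   "Organization Identifier")

columns = [
    Column("PR-31", "PERSON", "PERSON_ID", person_id),
    Column("PR-22", "TRAINEE", "PERSON_ID", person_id),
    Column("PR-22", "TRAINEE", "ORG_ID", org_id),
]

# One table, attributes from two entities (no Entity-Table relationship).
print(entities_behind(columns, "PR-22", "TRAINEE"))
# One attribute, reused by columns in two schemas (Attribute-Column relationship).
print(where_used(columns, person_id))
```

Because only attributes and columns are linked, a table is free to mix attributes from several entities, which is why the slide treats Entity-Table and Subject-Schema as unrelated.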
6.3 Re-engineer the Specified Data Model
• Assign Entities to different Subjects
• Reassign Entities to within Entities (sub-typing)
• Reassign Attribute’s Semantics
• Conform Attribute Names to Subject Area Scope
• Reassign Attributes to different Entities
• Reassign Attributes to different Data Elements
SDM: Reassign Entity to Subject

6.3 Re-engineering the Specified Data Model
BASIC PROCESSES (SDM)
• Reassign Entity to Subject
• Reassign Attribute to Data Element
• Assign Attribute Meta Category Values
• Reassign Attribute to Entity

6.4 Reallocate Foreign Keys to Encapsulate Subject’s Entities
For Each Subject Area (SDM):
• Make a List of Entities
• Make a Subject Area Based E-R Model Diagram
• Delete Unnecessary Foreign Keys from Existing Entities
• Make New Foreign Keys Where Needed
• Export to E-R Diagrammer to Verify Result
• Recycle if Necessary

6.4 (cont) Reallocate Foreign Keys to Encapsulate Subject’s Entities
SDM: Validate/Create SDM Foreign Keys

6.4 (cont) Reallocate Foreign Keys to Encapsulate Subject’s Entities
SDM: Modify SDM Foreign Keys

6.5 Discover the Data Element
Promote SDM Attributes to Data Elements

6.6 Build Data Element Model Level Metadata

6.7 Data Element, Attribute, Column Differences
Key Differences Among Data Element, Attribute, Column

Context
  Data Element: Stand-alone independent business fact template.
  Attribute: A characteristic of an entity that exists within the context of a subject.
  Column: A characteristic of a table that exists within the context of a schema.
Reason for existence
  Data Element: Source of semantics and common general meaning for classes of attributes and columns.
  Attribute: Source of value based refinement of the intent of the entity. The set of all attributes fully define an instance of an entity.
  Column: Source of value based refinement of the intent of the table. The set of all columns fully define an instance of a table.
Frequency of use
  Data Element: Define once, use many times to provide semantics to attributes or columns.
  Attribute: Defined once within the context of an entity. Source of business facts across one or more columns within one or more tables.
  Column: Defined once within the context of a table.
Example
  Data Element: Identifier
  Attribute: Asset Identifier, Person Identifier, Customer Identifier
  Column: Invoice Line Item
    • Part Number
    • Salesman Identifier
    • Customer Identifier

7.0 Overall Forward Engineering Process
• Import from higher level to lower level
• Map IDM to ODM legacy schemas to preserve existing systems environment and/or Generate new ODM schemas to replace legacy systems
• SQL Views can support legacy names or new names
• Generate Application

7.1 Import From Higher Level To Lower Level
Subject Area Data Model to Implemented Data Model
• Start Metabase IDM
• Make the Target Schema
• Pick an SDM Subject
• Select the Root Entity
• Create the Data Model Entity Tree
• Perform Import (from SDM to IDM)
• “Prune” Schema-Table Set to Just Those Needed
• “Prune” Table-Column Set to Just Those Needed
• Move Columns Among Tables as Needed
• Import Next SDM Model and Perform “Pruning” Steps
• Mapping to New IDM from SDM Preserved, Of Course!

7.1 (cont) Import From Higher Level To Lower Level
Subject Area Data Model to Implemented Data Model
Import SDM Entities to IDM Tables

7.1 (cont) Import From Higher Level To Lower Level
Implemented Data Model to Operational Data Model
• Start Metabase ODM
• Make the Target DBMS Schema
• Pick an IDM Schema
• Select the Root Table
• Create the Data Model Table Tree
• Perform Import (from IDM to ODM)
• “Prune” DBMS Schema-DBMS Table Set to Just Those Needed
• “Prune” DBMS Table-DBMS Column Set to Just Those Needed
• Move DBMS Columns Among DBMS Tables as Needed
• Import Next IDM Model and Perform “Pruning” Steps
• Mapping to New ODM from IDM Preserved, Of Course!
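The import-and-prune steps listed above can be sketched as follows. This is a minimal illustration, assuming a toy dictionary representation of an SDM; it is not how the Metabase product stores or imports models. The function names, the `sdm` layout, and the sample entities and attributes are hypothetical. The point it tries to show is that each imported column carries its source (entity, attribute) pair, so the mapping survives pruning and moving columns among tables.

```python
def import_entity_tree(sdm, root):
    """Walk the SDM entity tree from a root entity; make one IDM table per entity.

    Each IDM column records the (entity, attribute) pair it was imported from.
    """
    tables, stack, seen = {}, [root], set()
    while stack:
        entity = stack.pop()
        if entity in seen:
            continue
        seen.add(entity)
        tables[entity] = {attr: (entity, attr)
                          for attr in sdm[entity]["attributes"]}
        stack.extend(sdm[entity]["children"])
    return tables


def prune(tables, keep_tables, keep_columns):
    """Keep only the needed tables, then only the needed columns in each.

    Tables absent from keep_columns keep all of their columns.
    """
    return {
        t: {c: src for c, src in cols.items()
            if c in keep_columns.get(t, cols)}
        for t, cols in tables.items() if t in keep_tables
    }


def move_column(tables, column, src_table, dst_table):
    """Move a column between tables; its SDM mapping travels with it."""
    tables[dst_table][column] = tables[src_table].pop(column)


# Hypothetical SDM subject, for illustration only.
sdm = {
    "Person": {"attributes": ["Person Identifier", "Person Name"],
               "children": ["Assignment"]},
    "Assignment": {"attributes": ["Assignment Identifier", "Start Date",
                                  "Organization Identifier"],
                   "children": ["Organization"]},
    "Organization": {"attributes": ["Organization Identifier",
                                    "Organization Name"],
                     "children": []},
}

idm = import_entity_tree(sdm, "Person")
idm = prune(idm, {"Person", "Assignment"},
            {"Assignment": {"Assignment Identifier", "Start Date"}})
move_column(idm, "Person Name", "Person", "Assignment")

# The moved column still points back to its SDM entity and attribute.
print(idm["Assignment"]["Person Name"])
```

The same shape would apply one layer down (IDM tables to ODM DBMS tables), with DBMS data types and lengths added at that step.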
7.1 (cont) Import From Higher Level To Lower Level
Implemented Data Model to Operational Data Model
Import IDM Schema Tables to ODM DBMS Tables

7.2 Generate SQL
ODM: Generate DBMS SQL DDL

7.3 Generate Application
ODM

8. Process Statistics
1. Find the Right Starting Point
   Notes: With MS/Access based repository of data models. Close to about 100 models.
   Effort metric: 2 days
2. Import to appropriate layer
   Notes: Had to fix a number of data modeling errors in source CASE tool.
   Effort metric: 40 hours for 10+ data models
3. Promote to Higher Data Modeling Layer
   Notes: Required several cycles of distilling subjects.
   Effort metric: 40 hours for 10+ models
4. Re-Engineer
   Notes: Had to re-engineer foreign keys, rename entities and some attributes. Also had to reconnect new attributes to “old columns.”
   Effort metric: 240 hours for 10+ models

8. (cont) Process Statistics
5. Abstract to Data Element
   Notes: Required review of each attribute, and creation of MCVs (Meta Category Values), etc.
   Effort metric: 0.25 hours per attribute for 1000 attributes. So, 250 hours.
6. Build Data Element Model Level Metadata
   Notes: Required generation of higher level concepts, value domains, etc.
   Effort metric: 80 hours
7. Import from higher level to lower level
   Notes: Required design of new data models for new databases from data model templates, and/or just re-mapping to existing models.
   Effort metric: 8 hours per model for 10 existing models and for 2 new models (8 × 12 = 96). Thus, about 100 hours.
8. Generate SQL
   Notes: Required specification of data types and lengths.
   Effort metric: 20 columns per hour for 80 hours.
9. Input to and then Generate Application
   Notes: Export of one Model from IDM.
   Effort metric: 1 hour to export, 1 hour to generate 1st cut application

9. Lessons Learned
• It can be done. However, it is not a walk in the park!
• It requires a clear understanding of the separation of Data Models: Data Element from Specified DMs, from Implemented DMs, and from Operational DMs. These are NOT transformations (conceptual to logical to physical). These are different data models.
• Subject Matter Experts are Essential, Critical, and Absolutely Necessary.
• It’s not top down. It’s bottom-up. But once built, use it top-down.
• You must have a metadata repository and data modeling tool that works at the enterprise level, and not just at the database or data model level.
• We made changes to the metadata repository system along the way. So, being able to change the meta model, entry and update and reports, is essential.
• Given that Entity reuse for just these ODS models was about 4x, the value for the data model template reuse in data warehouses and data marts is incalculable.

THANK YOU

Michael M. Gorman
Founder and President
2008 Althea Lane
Bowie, Maryland 20716-1518
Phone: +1.301.249.1142
Fax: +1.301.249.8955
Email: mmgorman@wiscorp.com
Web: <http://www.wiscorp.com>

Hank Lavender
Senior Information Engineer
1310 Braddock Place
Alexandria, Virginia 22314-1648
Phone: +1.703.836.5900
Fax: +1.703.836.8691
Email: hlavender@amerind.com
Web: <http://www.amerind.com>

Back-ups

Proposed DoD Metadata Repository
Complete Representations of Data Element Metadata
[Diagram. Labels, in the order they appear in the transcript:]
• DoD CORE DATA ELEMENT METADATA REPOSITORY
• ISO 11179 Model Layer
• Data Element Metadata
• Relationships to Multiple Categories of Metadata
• Today
• Core DoD Enterprise Data Element Metadata Repository
• Specified Model Layer
• Implemented Model Layer
• Operational DBMS Layer
• Application View Layer
• The Future
• The Payoff
• Seamless & Transparent Information Interoperability
• DoD Enterprise Interoperability Metadata Repository

The Current DoD Architecture for Defining Standard Data Element Representations of Shared Enterprise Data Elements
[Diagram. Recoverable content:]
Model layers shown:
• Business Application Information System (AIS) “View” Model
• Vendor Dependent SQL DBMS “Operational” Model
• Technology Dependent “Implemented” Model (ANSI SQL)
• Functional/Organizational Context Dependent “Specified” Model
Contexts and systems shown: Army Logistics Management, Army SAMS (AIS), Navy Logistics Management, Navy UADPS (AIS), “Oracle” DBMS, “Sybase” DBMS
Kinds of names shown: Attribute Names, SQL Column Names, DBMS Column Names, View Column Names, SDE Access Name
Names shown for the same data element:
• Materiel Inventory Quantity
• Materiel Unit Inventory Quantity
• Materiel Item Inventory Quantity
• Supply Unit Quantity
• Stocked Materiel Quantity
• Ships Stores Quantity
• Mat_Inv_Qty
• Mat_Itm_Inv_Qt
Data Element Definition: The quantity of each type of Federal Supply System materiel item contained in an identifiable inventory of materiel objects.
Additional Data Element Structural Metadata: Data type characteristics, local definition, enumerated values (if specific), etc.
METADATA REPOSITORY: Defense Data Dictionary System (DoD Standardized Data Elements), 16000+ SDEs
Business Rule: Only one named representation is permitted to exist in the repository as an Enterprise SDE.

METABASE REPOSITORY
[Diagram. Labels, in the order they appear in the transcript:]
• ISO 11179 Data Element Templates
• Specified Data Model (SDM)
• Implemented Data Model (IDM)
• Operational DBMS Model (ODM)
• OUTPUT ODM LAYER
• SQL DBMS ODM LAYER
• DoD Global Information Grid (GIG)
• Feedback
• Output Schema Information
• XML Schema Info Tables
• HUMAN RESOURCES DATABASE
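The back-up slides contrast a repository that allows one named representation per data element with one that links a data element to many named representations across layers. A small Python sketch of the second idea follows. It is an assumption-laden illustration, not the DDDS or Metabase design: the `DataElementRegistry` class and its methods are hypothetical. The names come from the slide, but which name belongs to which service and layer is not recoverable from the slide text, so the pairings below are illustrative guesses.

```python
from collections import defaultdict


class DataElementRegistry:
    """One definition per data element, many named representations per element."""

    def __init__(self):
        self.definitions = {}                      # element name -> definition
        self.representations = defaultdict(list)   # element name -> [(layer, context, local name)]
        self.by_local_name = {}                    # local name -> element name

    def define(self, name, definition):
        """Define a data element exactly once."""
        if name in self.definitions:
            raise ValueError(f"data element already defined: {name}")
        self.definitions[name] = definition

    def add_representation(self, element, layer, context, local_name):
        """Attach a layer-specific name (attribute, column, ...) to an element."""
        if element not in self.definitions:
            raise KeyError(element)
        self.representations[element].append((layer, context, local_name))
        self.by_local_name[local_name] = element

    def resolve(self, local_name):
        """Find the shared data element behind a local name."""
        return self.by_local_name[local_name]


registry = DataElementRegistry()
registry.define(
    "Materiel Inventory Quantity",
    "The quantity of each type of Federal Supply System materiel item "
    "contained in an identifiable inventory of materiel objects.",
)

# Illustrative pairings only; the slide does not make the assignment clear.
registry.add_representation("Materiel Inventory Quantity", "Specified",
                            "Army Logistics Management",
                            "Materiel Unit Inventory Quantity")
registry.add_representation("Materiel Inventory Quantity", "Implemented",
                            "Army Logistics Management", "Mat_Inv_Qty")
registry.add_representation("Materiel Inventory Quantity", "Specified",
                            "Navy Logistics Management",
                            "Materiel Item Inventory Quantity")
registry.add_representation("Materiel Inventory Quantity", "Implemented",
                            "Navy Logistics Management", "Mat_Itm_Inv_Qt")

# Two differently named columns resolve to the same shared data element.
print(registry.resolve("Mat_Inv_Qty") == registry.resolve("Mat_Itm_Inv_Qt"))
```

Keeping the definition in one place and the local names in a separate mapping is what lets systems keep their own names while still sharing semantics.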