1 Business Intelligence through Data Mining
with Daniel L. Silver
Dalhousie University
Copyright (c), 1999 All Rights Reserved
CogNova Technologies

2 About myself ...
Ph.D. in Comp. Sci./Machine Learning, UWO
Chair-Associate, Business Informatics, Faculty of Management, Dalhousie University
Founder of CogNova Technologies (London, 1993)
London Health Science Center, 3M, London Life, MT&T, NSPI, QEII Health Science Center
My Objective ...
To discuss data warehousing and data mining within the context of knowledge management and business intelligence.

3 CogNova Technologies Offers
Consultation - situation analysis and requirements definition, selection of third party systems, project management, and troubleshooting
Services - installation and application of third party software, data analysis and model generation using CogNova proprietary systems, summary and analysis of results
Education - courses and seminars on the theory and application of data mining technologies, and the knowledge discovery process
Research - investigation and development of advanced machine learning systems and the application of KDD practices

4 Outline
Introduction
Knowledge Management and Business Intelligence
Knowledge Discovery Process
Data Warehousing and Data Mining
Opportunities, Benefits, Costs

5 Introduction - The Buzz Words
Hype vs. Reality
Knowledge Management
Business Intelligence
Data Warehouse, Corp. Repository, Data Mart
Knowledge Creation or Discovery
Data Mining

6 Introduction - Motivation
Competition
Global Opportunities
Employee Turn-over
Organization
Customer Demands
Technological Change
Regulatory Change

7 Introduction - Rationale
Employees
Gov’t Reg.
Management of Organizational Knowledge
Products
Services
Customers
Channels
Competitors
Partners
Suppliers

8 The Knowledge Management Cycle
“Business Intelligence”
Diagram labels: Environmental data; Observation and Analysis; Problems; Opportunities; Information; Knowledge Consolidation; INFORMATION (Storage, Processing, Communication); Theory Generation; Approach; Methods; Results; Testing and Application

9 KM and Business Intelligence
Why should it matter to you?
Knowledge becoming substantial asset
Maximum sharing of information
Employees leave, business value remains
Betterment of internal and external structures, personal competencies
Competitive advantage - leading organizations now adopting

10 KM and Business Intelligence
Key Solution Components:
Internet / Intranet & Groupware
Document management systems
EDI - Electronic Data Interchange
E-Commerce methods
Data Warehousing
Data Mining

11 Knowledge Management
information =>                          <= people
Technology Centred                      People Centred
Info. Technologists                     Org. Theorists
info. and comp. sciences, database,     org. behavior, group dynamics,
telecomm., analysis                     HCI, psychology
KM = objects                            KM = process
explicit knowledge                      tacit knowledge
easily encoded                          difficult to encode

12 Knowledge Management
Intellectual Capital
Human Capital = Knowledge + Capabilities + Skill
Structural Capital = Everything that remains after the employees go home
Intellectual Capital = Human Capital + Structural Capital
Intellectual Capital = Market Value - Book Value (e.g. Microsoft’s MV = 15 * BV)

13 Knowledge Management
The Invisible Balance Sheet
Diagram labels: Cash; Accounts Receivable; Equipment; Property; Short-term Loans; Long-term Debt; External Structure; Internal Structure; Invisible Share Holder Equity; Competence; Obligation; S.H. Equity; Market Value; Liability & S.H. Equity; Book Value; Intangible; Tangible; Assets

14 KM and Business Intelligence
Gardner says ....
Leaders - will move on intangible benefits
Followers - will move only on tangible savings/profits
Others - will wait and try to catch up

15 KM and Business Intelligence
HYPE
KM is primarily technology centred:
– Data Warehousing
– Data Mining
– Intranets
– Groupware
REALITY
KM is primarily a people centred philosophy which necessarily involves and will promote the use of such technologies

16 Knowledge Management
Access to Recent Information
Books:
“Working Knowledge: How Organizations Manage What They Know”, T. Davenport & L. Prusak (http://www.amazon.com/exec/obidos/ASI)
The Web:
– http://www.brint.com/km/
– www.sveiby.com.au
– knowledge management mail-list: km@MCCMEDIA.COM

17 “We are drowning in information, but starving for knowledge.”
John Naisbitt, author of Megatrends
Knowledge Discovery through Data Warehousing and Data Mining

18 Knowledge Discovery and Data Mining
What is KDD? A Process
The selection and processing of data for:
– the identification of novel, accurate, and useful patterns, and
– the modeling of real-world phenomena.
Data Warehousing and Data Mining are major components of the KDD process

19 The Knowledge Discovery Process
Diagram: Internal and External Data Sources -> [Data Warehousing] -> Consolidated Data (Warehouse) -> [Selection and Preprocessing] -> Prepared Data -> [Data Mining] -> Patterns & Models (p(x)=0.02) -> [Interpretation and Evaluation] -> Knowledge

20 Knowledge Discovery in Context
“The Virtuous Cycle”
Diagram labels: the KDD process diagram (Data Sources, Data Consolidation, Warehouse, Consolidated Data, Selection and Preprocessing, Prepared Data, Data Mining, Patterns & Models, p(x)=0.02, Interpretation and Evaluation, Knowledge); Problem; Knowledge; Identify Problem or Opportunity; Act on Knowledge; Results; Measure Effect of Action; New Insight

21 Marketing Embraces KM, DW, DM
Why? …
Diagram labels: Marketing; Traditional Marketing; MIS; Data Warehousing; Data Mining; Relationship Marketing, a.k.a. Customer Relationship Management

22 What is Relationship Marketing all about?
Knowing your customers on an individual basis
Maximizing life-time value not individual sales
Developing and maintaining a mutually beneficial relationship
Acquire, retain, win-back desirable customers
Arbuckle’s Market “The Corner Store”

23 Knowledge Discovery
What can KDD do for an organization?
Impact on Marketing
Target marketing at a credit card company
Consumer usage analysis at a telecomm provider
Loyalty assessment at a service bureau
Quality of service analysis at an appliance chain

24 The Knowledge Discovery Process
Diagram: Internal and External Data Sources -> [Data Warehousing] -> Consolidated Data (Warehouse) -> [Selection and Preprocessing] -> Prepared Data -> [Data Mining] -> Patterns & Models (p(x)=0.02) -> [Interpretation and Evaluation] -> Knowledge

25 Data Warehousing
From data sources to consolidated data repository
Diagram labels: RDBMS; Legacy DBMS; Flat Files; External; Data Consolidation and Cleansing; Warehouse or Datamart; Object/Relation DBMS; Multidimensional DBMS; Analysis and Info Sharing

26 Data Warehousing
Operational DB           Data Warehouse
Application oriented     Subject Oriented
Current                  Current + historical
Details                  Details + Summaries
Changes continually      Stable
Major DW Framework suppliers / consultants: DMR, IBM, SHL, NCR; SAS, Oracle, Sybase

27
Relationship between DW and DM?
Diagram labels: Strategic; Tactical; Rationale for data consolidation; Analysis; Data Warehousing; Query/Reporting; OLAP; Data Mining; Source of consolidated data

28 Data Warehousing
Must be business benefits driven
It’s not a project .. It’s a way of life
Keys to success are top-down strategy with bottom-up tactical deployment:
– communicate vision of Data Warehouse
– construct departmental Data Marts
– evolve to enterprise Data Warehouse
Rapid change in technology and business requirements -> demands short deployment cycles

29 Data Warehousing
HYPE
Corporate data stored within a DW will solve all your business problems
REALITY
The identification of business problems is the first step - DW, DM are solutions
Analysis and DW will necessarily mature in parallel

30 Data Warehousing
Access to Recent Information
Text Books:
– W.H. Inmon, Claudia Imhoff
Web Pages:
– DWI - The Data Warehouse Institute www.dw-institute.com
– DW Information Centre pwp.starnetic.com/larryg

31 The Knowledge Discovery Process
Diagram: Internal and External Data Sources -> [Data Warehousing] -> Consolidated Data (Warehouse) -> [Selection and Preprocessing] -> Prepared Data -> [Data Mining] -> Patterns & Models (p(x)=0.02) -> [Interpretation and Evaluation] -> Knowledge

32 Knowledge Discovery Process
Core Problems & Approaches
Problems:
– identification of relevant data
– representation of data
– search for valid pattern or model
Approaches:
– top-down verification by expert
– interactive visualization of data/models
– * bottom-up induction from data *
Figure labels: Probability of sale; Age; Income; On-Line Analytical Processing; Data Mining

33 OLAP: On-Line Analytical Processing
OLAP Functionality
Figure labels: Profit Values; Sales Region
Dimension selection – slice & dice
Rotation – allows change in perspective
Figure labels: OLAP cube; Year by Month; Product Class by Product Name
Filtration – value range selection
Hierarchies
– drill-downs to lower levels
– roll-ups to higher levels

34 Top-down Verification Technology
DEMO
Cognos - PowerPlay
An On-line Analytical Processing (OLAP) System

35 Overview of Data Mining Methods
Discovery of patterns
– clustering systems, e.g. customer segmentation
Predictive modeling
– regression, neural networks, e.g. target marketing, risk assessment
Descriptive modeling
– inductive decision trees, e.g. client characterization
Figure labels: Marital Status; Age; Prob. of Sale; Age; if age > 45 and income < $32k then ...

36 Data Mining Technology
DEMO
Angoss - KnowledgeSEEKER
An inductive decision tree/rule system

37 Data Mining Example
Health Care
Situation: Lifestyle data on 360 persons
Problem: Characterize those most likely to have high/low blood pressure.
Solution: Inductive Decision Tree

38 Application Areas and Opportunities
Finance: investment support, portfolio management
Banking & Insurance: credit approval, risk assessment
Marketing: segmentation, customer targeting, ...
Science and medicine: hypothesis discovery, prediction, classification, diagnosis
Security: bomb, iceberg, and fraud detection
Manufacturing: process modeling, quality control, resource allocation
Engineering: simulation and analysis, pattern recognition, signal processing
Internet: smart search engines, web marketing

39 The Current Status and Trends
Standards and methodology lag technology
Many products:
– micro DM packages (Cognos, Angoss)
– macro - integrated suites (SAS, IBM)
Software costs have risen 1000% over 2 years
Beware - major players yet to be determined
KDD experts fear the hype being generated
Legal and ethical issues on the horizon
Internet - “the” sink and source of data

40 Integrated Knowledge Discovery Suites
Diagram labels: Graphical User Interface; Data Sources; Data Consolidation; Warehouse; Selection and Preprocessing; Data Mining; Interpretation and Evaluation; Knowledge

41 Benefits of KDD
Maximum utility from corporate data
– discovery of new knowledge
– generation of models
Important feedback to data warehousing effort
– identification and justification of essential data
Reduction of application dev’t backlog
– model development vs. software development
Effect on bottom line of organization
– cost reduction, increased productivity, risk avoidance … competitive advantage

42 Requirements and Costs of KDD
Hardware - computationally intensive
Software - micro < $20k, integrated suites < $300k
Data - internal collection, surveys, external sources
Human resources
– DB/DP/DC expertise to consolidate and preprocess data
– Machine learning and stats competence
– Application knowledge & project mgmt
70% of the effort is expended on the data consolidation and preprocessing activities

43 KDD and Data Mining
HYPE
Expensive hardware and software is always required
DM is now turn-key “just give it the data”
REALITY
Micro $2k-$10k DM packages can produce results
DM is data analysis - requires business sense plus statistics and AI skills

44 Access to Recent Information
Book: Data Mining Techniques for Marketing, Sales and Customer Support, by M. Berry & G. Linoff, Wiley & Sons
Journal: Data Mining and Knowledge Discovery, Kluwer Publishing
Conference: KDD’99
Web-pages:
– Bus. Informatics KDD page http://www.mgmt.dal.ca/ChrBusInf/knowdis
– Knowledge Discovery Mine http://www.kdnuggets.com

45 THE END
daniel.silver@dal.ca
www3.ns.sympatico.ca/~dsilver