Survey
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Chapter 9 Business Intelligence and Information Systems for Decision Making Study Questions Q1: How big is an exabyte, and why does it matter? Q2: How do business intelligence (BI) systems provide competitive advantages? Q3: What problems do operational data pose for BI systems? Q4: What are the purpose and components of a data warehouse? Q5: What is a data mart, and how does it differ from a data warehouse? Q6: What are the characteristics of data-mining systems? Q7: What are OLAP reports? Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-2 Q1: How Big Is an Exabyte? Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-3 Q1: Why Does an Exabyte Matter? • Businesses collect massive amounts of data – Storage capacity is increasing as cost is decreasing – Storage capacity is becoming almost unlimited, so businesses collect more at little extra cost • Buried in that data are important patterns of relationships that can yield valuable information to help businesses make better decisions Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-4 Q2: How Do Business Intelligence (BI) Systems Provide Competitive Advantages? • Primary BI systems: 1. Reporting systems • Integrate data from multiple systems • Sorting, grouping, summing, averaging, comparing data 2. Data-mining systems • Use sophisticated statistical techniques, regression analysis, and decision tree analysis • Used to discover hidden patterns and relationships • Market-basket analysis –purchasing patterns Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-5 Q2: How Do Business Intelligence (BI) Systems Provide Competitive Advantages? (cont’d) 3. Knowledge management systems (KMs) • Create value by collecting and sharing human knowledge about products, products uses, best practices, other critical knowledge • Used by employees, managers, customers, suppliers, others who need access to company knowledge 4. Expert systems (ES) • Encapsulates knowledge in form of “If/Then” rules – If Patient_Temp > 103, Then start High_Fever_Procedure • ES can improve diagnostic and decision quality of nonexperts Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-6 Q3: What Problems Do Operational Data Pose for BI Systems? • Dirty data. – – – – Values may be missing Inconsistent data Non-integrated data Wrong granularity (Coarse vs. Fine) • Too much data causes: 1. Curse of dimensionality • 2. Problem caused by the exponential increase in volume associated with adding extra dimensions to a (mathematical) space. Too many rows or data points Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-7 Q4: What are the Purpose and Components of a Data Warehouse? • Purpose: – To extract and clean data from various operational systems and other sources – To store and catalog data for BI processing Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-8 Q4: What are the Purpose and Components of a Data Warehouse?(cont.) Data warehouse architecture consists of the following layers: • Operational database layer - The source data for the data warehouse - An organization's Enterprise Resource Planning systems fall into this layer. • Data access layer - The interface between the operational and informational access layer - Tools to extract, transform, load (ETL) data into the warehouse fall into this layer. • Metadata layer - The data directory - This is usually more detailed than an operational system data directory. • Informational access layer - The reporting and analyzing tools. Business intelligence tools fall into this layer. Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-9 Components of a Data Warehouse Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-10 Q5: What Is a Data Mart, and How Does It Differ from a Data Warehouse? • Created to address particular needs – Business function – Problem – Opportunity • Smaller than data warehouse • Data extracted from data warehouse for a functional area Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-11 Components of a Data Mart Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-12 Q6: What Are the Characteristics of Data-Mining Systems? • Data mining—application of statistical techniques to find patterns and relationships in body of data for purpose of classifying and predicting Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-13 Unsupervised Data Mining • Analysts do not create model before running analysis • Apply data-mining technique and observe results • Hypotheses created after analysis as explanation for results • Common statistical technique used: – Cluster analysis to identify groups with similar characteristics Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-14 Supervised Data Mining • Model developed before analysis • Statistical techniques used to estimate parameters • Examples: – Regression analysis—measures impact of set of variables on one another – Used for making predictions Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-15 Supervised Data Mining (cont’d) Neural networks • Used for predicting values and making classifications • Complicated set of nonlinear equations Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-16 Q7: What are OLAP reports? • OnLine Analytical Processing – is an approach to quickly answer multidimensional analytical queries – Dynamic online view based on • Measures – Data item to be manipulated – total sales, average cost • Dimensions – Characteristic of measure – purchase date, customer type, location, sales region Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE17-17 Q7: What are OLAP reports? (cont.) • OLAP cube – Presentation of measure with associated dimensions (a.k.a. OLAP report) • Users can alter format • Users can drill down into data – Divide data into more detail Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE17-18 Role of OLAP Server and OLAP Database • Figure CE17-12 Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE17-19 Active Review Q1: How big is an exabyte, and why does it matter? Q2: How do business intelligence (BI) systems provide competitive advantages? Q3: What problems do operational data pose for BI systems? Q4: What are the purpose and components of a data warehouse? Q5: What is a data mart, and how does it differ from a data warehouse? Q6: What are the characteristics of data-mining systems? Q7: What are OLAP reports? Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall 9-20 Chapter Extension 16 Database Marketing Study Questions Q1: What is a database marketing opportunity? Q2: How does RFM analysis classify customers? Q3: How does market-basket analysis identify cross-selling opportunities? Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-22 Q1: What Is a Database Marketing Opportunity? • Database marketing – Application of business intelligence systems for planning and executing marketing programs – Databases are a key component – Data-mining techniques also important • Process of sorting through large amounts of data and picking out relevant information Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-23 Q2: How Does RFM Analysis Classify Customers? RFM – – – – RFM program analyzes and ranks customers according to their purchase patterns How recently (R) a customer has ordered? How frequently (F) a customer has ordered? How much money (M) a customer has spent per order? Divides customers into five groups and assigns a score • R score 1 = top 20% in most recent orders • R score 5 = bottom 20% (longest since last order) • F score 1 = top 20% in most frequent orders • F score 5 = bottom 20% least frequent orders • M score 1 = top 20% in most money spent • M score 5 = bottom 20% in amount of money spent Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-24 Example of RFM Score Data • Figure CE16-1 Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-25 Q3: How Does Market-Basket Analysis Identify Cross-Selling Opportunities? • Market-basket analysis is a data-mining technique for determining sales patterns – Uses statistical methods to identify sales patterns in large volumes of data – Shows which products customers tend to buy together – Helps identify cross-selling opportunities • "Customers who bought book X also bought book Y” Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-26 Market-Basket Example • Figure CE16-2 Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-27 Market-Basket Terminology • Support Probability that two items will be bought together – Fins and masks purchases together 150 times, thus support for fins and a mask is 150/1,000, or 15% – Support for fins and weights is 60/1,000, or 6% – Support for fins along with a second pair of fins is 10/1,000, or 1% Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-28 Market-Basket Terminology (cont’d) • Confidence What proportion of the customers who bought a mask also bought fins? – Conditional probability estimate – Example: » Probability of buying fins = 28% (280/1000) » Probability of buying swim mask = 27% (270/1000) – After buying Mask, » Probability of buying Fins = 150/270 or 55.56% Likelihood that a customer will also buy fins almost doubles, from 28% to 55.56%. Thus, all sales personnel should try to sell fins to anyone buying a mask Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-29 Market-Basket Terminology (cont’d) • Lift Ratio of confidence to base probability of buying item – Shows how much base probability increases or decreases when other products are purchased • Example: – Lift of fins and a mask is confidence of fins given a mask, divided by the base probability of fins. – Lift of fins and a mask is .5556/.28 = 1.98 Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-30 Decision Tree for Loan Evaluation • Common business application – – – – – Classify loan applications by likelihood of default Rules identify loans for bank approval Identify market segment Structure marketing campaign Predict problems • Too bad the Banks didn’t use decision trees!! Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-31 Active Review Q1: What is a database marketing opportunity? Q2: How does RFM analysis classify customers? Q3: How does market-basket analysis identify cross-selling opportunities? Copyright © 2010 Pearson Education, Inc. Publishing as Prentice Hall CE16-32 Chapter 9 Ch Ext.16 Business Intelligence and Information Systems for Decision Making