1
Business Intelligence through Data Mining
with Daniel L. Silver
Dalhousie University
CogNova Technologies
Copyright (c), 1999
All Rights Reserved
2
About myself ...
• Ph.D. in Comp. Sci./Machine Learning, UWO
• Chair-Associate, Business Informatics, Faculty of Management, Dalhousie University
• Founder of CogNova Technologies (London, 1993)
• London Health Science Center, 3M, London Life, MT&T, NSPI, QEII Health Science Center
My Objective ...
• To discuss data warehousing and data mining within the context of knowledge management and business intelligence.
3
CogNova Technologies Offers
• Consultation - situation analysis and requirements definition, selection of third-party systems, project management, and troubleshooting
• Services - installation and application of third-party software, data analysis and model generation using CogNova proprietary systems, summary and analysis of results
• Education - courses and seminars on the theory and application of data mining technologies and the knowledge discovery process
• Research - investigation and development of advanced machine learning systems and the application of KDD practices
4
Outline
• Introduction
• Knowledge Management and Business Intelligence
• Knowledge Discovery Process
• Data Warehousing and Data Mining
• Opportunities, Benefits, Costs
5
Introduction - The Buzz Words
Hype vs. Reality
• Knowledge Management
• Business Intelligence
• Data Warehouse, Corp. Repository, Data Mart
• Knowledge Creation or Discovery
• Data Mining
6
Introduction - Motivation
[Diagram: the Organization, surrounded by the labels below]
• Competition
• Global Opportunities
• Employee Turn-over
• Customer Demands
• Technological Change
• Regulatory Change
7
Introduction - Rationale
[Diagram: Management of Organizational Knowledge, surrounded by the labels below]
• Employees
• Gov’t Reg.
• Products
• Services
• Customers
• Channels
• Competitors
• Partners
• Suppliers
8
The Knowledge Management Cycle
[Diagram: a cycle labelled “Business Intelligence”. Stage labels: Observation and Analysis; Knowledge Consolidation; Theory Generation; Testing and Application. Flow labels: Environmental data; Problems, Opportunities; Information; Approach, Methods; Results. Centre: INFORMATION (Storage, Processing, Communication)]
9
KM and Business Intelligence
Why should it matter to you?
• Knowledge becoming substantial asset
• Maximum sharing of information
• Employees leave, business value remains
• Betterment of internal and external structures, personal competencies
• Competitive advantage - leading organizations now adopting
10
KM and Business Intelligence
Key Solution Components:
• Internet / Intranet & Groupware
• Document management systems
• EDI - Electronic Data Interchange
• E-Commerce methods
• Data Warehousing
• Data Mining
11
Knowledge Management
information => <= people
Technology Centred
• Info. Technologists
• info. and comp. sciences, database, telecomm., analysis
• KM = objects
• explicit knowledge easily encoded
People Centred
• Org. Theorists
• org. behavior, group dynamics, HCI, psychology
• KM = process
• tacit knowledge difficult to encode
12
Knowledge Management
Intellectual Capital
Human Capital = Knowledge + Capabilities + Skill
Structural Capital = Everything that remains after the employees go home
Intellectual Capital = Human Capital + Structural Capital
Intellectual Capital = Market Value - Book Value
(e.g. Microsoft’s MV = 15 * BV)
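The market-value identity can be checked with a little arithmetic. The following is a minimal sketch, assuming book value is normalised to 1.0 (an illustrative unit, not a figure from the slides) and using the slide's 15x multiple; the function name is hypothetical.

```python
def intellectual_capital(market_value, book_value):
    """Intellectual Capital = Market Value - Book Value (identity from the slide)."""
    return market_value - book_value

# Assumption: book value normalised to 1.0 so results read as multiples of BV.
book_value = 1.0
market_value = 15 * book_value  # the slide's example: MV = 15 * BV

ic = intellectual_capital(market_value, book_value)
ic_share_of_market_value = ic / market_value

print(f"Intellectual capital: {ic:.1f} x book value")
print(f"Share of market value that is intangible: {ic_share_of_market_value:.1%}")
```

Under that multiple, intellectual capital is 14 times book value, so roughly 93% of market value would be intangible.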
13
Knowledge Management
The Invisible Balance Sheet
[Figure: balance sheet with columns "Assets" and "Liability & S.H. Equity". The tangible part corresponds to Book Value; tangible plus intangible corresponds to Market Value.
Tangible assets: Cash, Accounts Receivable, Equipment, Property
Tangible liabilities and equity: Short-term Loans, Long-term Debt, S.H. Equity
Intangible assets: External Structure, Internal Structure, Competence
Intangible liabilities and equity: Invisible Share Holder Equity, Obligation]
14
KM and Business Intelligence
Gardner says ....
• Leaders - will move on intangible benefits
• Followers - will move only on tangible savings/profits
• Others - will wait and try to catch up
15
KM and Business Intelligence
HYPE
• KM is primarily technology centred:
– Data Warehousing
– Data Mining
– Intranets
– Groupware
REALITY
• KM is primarily a people centred philosophy which necessarily involves and will promote the use of such technologies
16
Knowledge Management
Access to Recent Information
• Books: “Working Knowledge: How Organizations Manage What They Know”, T. Davenport & L. Prusak (http://www.amazon.com/exec/obidos/ASI)
• The Web:
– http://www.brint.com/km/
– www.sveiby.com.au
– knowledge management mail-list: km@MCCMEDIA.COM
17
“We are drowning in information, but starving for knowledge.” - John Naisbitt, author of Megatrends
Knowledge Discovery through Data Warehousing and Data Mining
18
Knowledge Discovery and Data Mining
What is KDD?
A Process
• The selection and processing of data for:
– the identification of novel, accurate, and useful patterns, and
– the modeling of real-world phenomena.
• Data Warehousing and Data Mining are major components of the KDD process
19
The Knowledge Discovery Process
[Diagram: Internal and External Data Sources -> Data Warehousing -> Consolidated Data (Warehouse) -> Selection and Preprocessing -> Prepared Data -> Data Mining -> Patterns & Models (e.g. p(x)=0.02) -> Interpretation and Evaluation -> Knowledge]
20
Knowledge Discovery in Context
[Diagram: “The Virtuous Cycle”, with the KDD process diagram embedded (Problem -> Data Consolidation -> Selection and Preprocessing -> Data Mining -> Interpretation and Evaluation -> Knowledge). Cycle labels: Identify Problem or Opportunity; Knowledge; Act on Knowledge; Results; Measure Effect of Action; New Insight]
21
Marketing Embraces KM, DW, DM
Why? …
[Diagram labels: Marketing; MIS; Data Warehousing; Data Mining; Traditional Marketing; Relationship Marketing, a.k.a. Customer Relationship Management]
22
What is Relationship Marketing all about?
• Knowing your customers on an individual basis
• Maximizing life-time value, not individual sales
• Developing and maintaining a mutually beneficial relationship
• Acquire, retain, win-back desirable customers
[Image: Arbuckle’s Market, “The Corner Store”]
23
Knowledge Discovery
What can KDD do for an organization?
Impact on Marketing
• Target marketing at a credit card company
• Consumer usage analysis at a telecomm provider
• Loyalty assessment at a service bureau
• Quality of service analysis at an appliance chain
24
The Knowledge Discovery Process
[Diagram repeated from slide 19: Data Sources -> Data Warehousing -> Selection and Preprocessing -> Data Mining -> Interpretation and Evaluation -> Knowledge]
25
Data Warehousing
From data sources to consolidated data repository
[Diagram: sources (RDBMS, Legacy DBMS, Flat Files, External) -> Data Consolidation and Cleansing -> Warehouse or Datamart (Object/Relational DBMS, Multidimensional DBMS) -> Analysis and Info Sharing]
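The consolidation and cleansing step can be illustrated with a small sketch. It assumes two hypothetical sources (rows from an RDBMS and rows from a flat file) that describe the same customers with inconsistent formatting; the field names, records, and the "later source wins" merge rule are all assumptions made for illustration, not something the slides specify.

```python
def clean(record):
    """Cleansing: normalise one raw record (trim whitespace, unify case, parse numbers)."""
    return {
        "customer_id": str(record["customer_id"]).strip(),
        "name": record["name"].strip().title(),
        "income": float(record["income"]),
    }

def consolidate(*sources):
    """Consolidation: merge several sources into one table keyed by customer_id.

    Assumption: when the same key appears twice, the later source overwrites
    the earlier one. A real warehouse load would need an explicit rule here.
    """
    warehouse = {}
    for source in sources:
        for raw in source:
            row = clean(raw)
            warehouse[row["customer_id"]] = row
    return warehouse

# Hypothetical source extracts with inconsistent types and formatting.
rdbms_rows = [{"customer_id": 101, "name": "  ann LEE ", "income": "41000"}]
flat_file_rows = [
    {"customer_id": "101 ", "name": "Ann Lee", "income": 41000.0},
    {"customer_id": "102", "name": "bob ross", "income": "28500"},
]

warehouse = consolidate(rdbms_rows, flat_file_rows)
for customer_id in sorted(warehouse):
    print(warehouse[customer_id])
```

The two records for customer 101 collapse into one row once the key and name formats are normalised, which is the point of cleansing before analysis.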
26
Data Warehousing
Operational DB           Data Warehouse
Application oriented     Subject oriented
Current                  Current + historical
Details                  Details + Summaries
Changes continually      Stable
Major DW Framework suppliers / consultants: DMR, IBM, SHL, NCR; SAS, Oracle, Sybase
27
Relationship between DW and DM?
[Diagram: Data Warehousing is the source of consolidated data for Analysis (Query/Reporting, OLAP, Data Mining); Analysis is the rationale for data consolidation. Additional labels: Strategic, Tactical]
28
Data Warehousing
• Must be business benefits driven
• It’s not a project .. It’s a way of life
• Keys to success are top-down strategy with bottom-up tactical deployment:
– communicate vision of Data Warehouse
– construct departmental Data Marts
– evolve to enterprise Data Warehouse
• Rapid change in technology and business requirements demands short deployment cycles
29
Data Warehousing
HYPE
• Corporate data stored within a DW will solve all your business problems
REALITY
• The identification of business problems is the first step - DW, DM are solutions
• Analysis and DW will necessarily mature in parallel
30
Data Warehousing
Access to Recent Information
• Text Books:
– W.H. Inmon, Claudia Imhoff
• Web Pages:
– DWI - The Data Warehouse Institute: www.dw-institute.com
– DW Information Centre: pwp.starnetic.com/larryg
31
The Knowledge Discovery Process
[Diagram repeated from slide 19: Data Sources -> Data Warehousing -> Selection and Preprocessing -> Data Mining -> Interpretation and Evaluation -> Knowledge]
32
Knowledge Discovery Process
Core Problems & Approaches
• Problems:
– identification of relevant data
– representation of data
– search for valid pattern or model
• Approaches:
– top-down verification by expert
– interactive visualization of data/models
– * bottom-up induction from data *
[Figure: Probability of sale plotted against Age and Income. Side labels next to the approaches: On-Line Analytical Processing; Data Mining]
33
OLAP: On-Line Analytical Processing
OLAP Functionality
• Dimension selection
– slice & dice
• Rotation
– allows change in perspective
• Filtration
– value range selection
• Hierarchies
– drill-downs to lower levels
– roll-ups to higher levels
[Figure: OLAP cube of Profit Values with dimensions Sales Region, Year by Month, and Product Class by Product Name]
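The four OLAP operations can be sketched over a tiny in-memory fact table. This is an illustration only: the table, its values, and the helper names (`aggregate`, `slice_rows`, `filter_range`) are assumptions, and the code does not reflect the internals of any OLAP product.

```python
from collections import defaultdict

# Hypothetical fact table: (region, year, month, product_class, profit).
facts = [
    ("East", 1998, 1, "Appliances", 120.0),
    ("East", 1998, 2, "Appliances", 80.0),
    ("East", 1999, 1, "Electronics", 200.0),
    ("West", 1998, 1, "Appliances", 50.0),
    ("West", 1999, 2, "Electronics", 150.0),
]
DIMS = {"region": 0, "year": 1, "month": 2, "product_class": 3}
PROFIT = 4

def aggregate(rows, dims):
    """Sum profit grouped by the listed dimensions (all others are rolled up)."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[DIMS[d]] for d in dims)
        totals[key] += row[PROFIT]
    return dict(totals)

def slice_rows(rows, dim, value):
    """Dimension selection: keep rows where one dimension has a fixed value."""
    return [r for r in rows if r[DIMS[dim]] == value]

def filter_range(rows, low, high):
    """Filtration: keep rows whose profit lies in the range [low, high]."""
    return [r for r in rows if low <= r[PROFIT] <= high]

by_year = aggregate(facts, ["year"])                     # roll-up to year
by_year_month = aggregate(facts, ["year", "month"])      # drill-down to month
east_by_class = aggregate(slice_rows(facts, "region", "East"), ["product_class"])
rotated = aggregate(facts, ["product_class", "region"])  # rotation: new perspective
mid_profit = filter_range(facts, 100, 200)

print(by_year)
print(by_year_month)
print(east_by_class)
print(rotated)
print(len(mid_profit))
```

Roll-up and drill-down differ only in how many levels of the Year by Month hierarchy appear in the grouping key, which is why one `aggregate` function covers both.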
34
Top-down Verification Technology
DEMO: Cognos - PowerPlay
An On-line Analytical Processing (OLAP) System
35
Overview of Data Mining Methods
• Discovery of patterns
– clustering systems
e.g. customer segmentation
• Predictive modeling
– regression, neural networks
e.g. target marketing, risk assessment
• Descriptive modeling
– inductive decision trees
e.g. client characterization
[Figure labels: Marital Status vs. Age; Prob. of Sale vs. Age; rule "if age > 45 and income < $32k then ..."]
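A rule of the kind shown on the slide can be evaluated against client records to see how many clients it covers and how well it describes them. The sketch below encodes only the rule's condition (age > 45 and income < $32k); the slide leaves the conclusion open, so the `bought` outcome column and all records here are invented purely for illustration.

```python
def antecedent(record):
    """Condition of the slide's example rule: age > 45 and income < $32k."""
    return record["age"] > 45 and record["income"] < 32_000

# Hypothetical client records; `bought` is an assumed outcome, not from the slides.
clients = [
    {"age": 52, "income": 28_000, "bought": True},
    {"age": 61, "income": 30_500, "bought": True},
    {"age": 47, "income": 31_000, "bought": False},
    {"age": 38, "income": 25_000, "bought": False},
    {"age": 55, "income": 60_000, "bought": True},
    {"age": 29, "income": 45_000, "bought": False},
]

covered = [c for c in clients if antecedent(c)]

# Support: fraction of all clients the rule's condition applies to.
support = len(covered) / len(clients)
# Confidence: among covered clients, fraction with the assumed outcome.
confidence = sum(c["bought"] for c in covered) / len(covered)

print(f"covered={len(covered)} support={support:.2f} confidence={confidence:.2f}")
```

Support and confidence are common ways to judge whether a descriptive rule is both widely applicable and reliable; the slides do not say which measures the demonstrated tools report.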
36
Data Mining Technology
DEMO: Angoss - KnowledgeSEEKER
An inductive decision tree/rule system
37
Data Mining Example
Health Care
Situation: Life-style data on 360 persons
Problem: Characterize those most likely to have high/low blood pressure.
Solution: Inductive Decision Tree
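To show how an inductive decision tree picks its first split, here is a minimal sketch that uses information gain, one common split criterion. It is not necessarily the criterion used by the tool in the demo, and the eight records and the attribute names (`smoker`, `exercise`) are hypothetical stand-ins, not the 360-person data set.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` obtained by splitting `rows` on `attribute`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r[target] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

def best_split(rows, attributes, target):
    """Choose the attribute with the highest information gain for the root node."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))

# Hypothetical life-style records; `bp` is the blood pressure class.
people = [
    {"smoker": "yes", "exercise": "low",  "bp": "high"},
    {"smoker": "yes", "exercise": "high", "bp": "high"},
    {"smoker": "yes", "exercise": "low",  "bp": "high"},
    {"smoker": "no",  "exercise": "low",  "bp": "high"},
    {"smoker": "no",  "exercise": "high", "bp": "low"},
    {"smoker": "no",  "exercise": "high", "bp": "low"},
    {"smoker": "no",  "exercise": "low",  "bp": "low"},
    {"smoker": "no",  "exercise": "high", "bp": "low"},
]

root = best_split(people, ["smoker", "exercise"], "bp")
print("root split:", root)
```

A full tree builder would apply `best_split` recursively to each resulting subset until the subsets are pure or too small to split further.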
38
Application Areas and Opportunities
• Finance: investment support, portfolio management
• Banking & Insurance: credit approval, risk assessment
• Marketing: segmentation, customer targeting, ...
• Science and medicine: hypothesis discovery, prediction, classification, diagnosis
• Security: bomb, iceberg, and fraud detection
• Manufacturing: process modeling, quality control, resource allocation
• Engineering: simulation and analysis, pattern recognition, signal processing
• Internet: smart search engines, web marketing
39
The Current Status and Trends
• Standards and methodology lag technology
• Many products:
– micro DM packages (Cognos, Angoss)
– macro - integrated suites (SAS, IBM)
• Software costs have risen 1000% over 2 years
• Beware - major players yet to be determined
• KDD experts fear the hype being generated
• Legal and ethical issues on the horizon
• Internet - “the” sink and source of data
40
Integrated Knowledge Discovery Suites
[Diagram: a Graphical User Interface spanning the pipeline Data Sources -> Data Consolidation -> Warehouse -> Selection and Preprocessing -> Data Mining -> Interpretation and Evaluation -> Knowledge]
41
Benefits of KDD
• Maximum utility from corporate data
– discovery of new knowledge
– generation of models
• Important feedback to data warehousing effort
– identification and justification of essential data
• Reduction of application dev’t backlog
– model development vs. software development
• Effect on bottom line of organization
– cost reduction, increased productivity, risk avoidance … competitive advantage
42
Requirements and Costs of KDD
• Hardware - computationally intensive
• Software - micro < $20k, integrated suites < $300k
• Data - internal collection, surveys, external sources
• Human resources
– DB/DP/DC expertise to consolidate and preprocess data
– Machine learning and stats competence
– Application knowledge & project mgmt
• 70% of the effort is expended on the data consolidation and preprocessing activities
43
KDD and Data Mining
HYPE
• Expensive hardware and software is always required
• DM is now turn-key “just give it the data”
REALITY
• Micro $2k-$10k DM packages can produce results
• DM is data analysis - requires business sense plus statistics and AI skills
44
Access to Recent Information
• Book: Data Mining Techniques for Marketing, Sales and Customer Support, by M. Berry & G. Linoff, Wiley & Sons
• Journal: Data Mining and Knowledge Discovery, Kluwer Publishing
• Conference: KDD’99
• Web-pages:
– Bus. Informatics KDD page: http://www.mgmt.dal.ca/ChrBusInf/knowdis
– Knowledge Discovery Mine: http://www.kdnuggets.com
45
THE END
daniel.silver@dal.ca
www3.ns.sympatico.ca/~dsilver