Survey							
                            
		                
		                * Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
* Your assessment is very important for improving the workof artificial intelligence, which forms the content of this project
Post-processing Decision Trees to Extract Actionable Knowledge Qiang Yang and Jie Yin HKUST, Hong Kong China and Charles X. Ling and Tielin Chen Department of Computer Science University of Western Ontario, Canada CRM Customer Relationship Management: focus on customer satisfaction to improve profit Two kinds of CRM  Enabling CRM: Infrastructure, multiple touch point management, data integration and management, …   Oracle, IBM, PeopleSoft, Siebel Systems, … Intelligent CRM: data mining and analysis, database marketing, customization  Vendors/products (see later) From Data Mining to Actions  What to do to help Sammy to get loan approval? Applicants Customer Database Sammy Beatrice Dylan Mathew Larry Basil Income 50K 50K 80K 30K 40K 80K Married n y n n n n Cars 1 1 2 1 0 1 Approved? ? Yes Yes No No Yes •Action 1 (Dylan): get higher income to 80K • Action 2 (Beatrice): get Married!!  Actionable vs. Passive Data Mining Improve customer relationship   What actions to take to change customers from an undesired status to a desired one        Actions (promotion, communication)  changes From churn to loyal From inactive to active From low spending to high spending From non-buyers to buyers … and still make a profit (the ultimate goal) Approach: Post-processing Decision Trees  Mining actions from decision trees    Bounded action problem Bounded segment problem Our solutions Post-processing Decision Trees 1. 2. 3. 4. Get Customer Data (marketing DB) Build Customer Profiles Search Actions for Maximal Profit Action Delivery Step 1: Get Customer Data Marketing DB: Segmentation, data preparation, pre-processing… Define a “target”: undesired status and desired status ID Name Age Sex Service Rate Prof … Retained (Target) 1001 John 50 M H L A … Yes 3010 Sue 25 F M H D … No … … … … … … … … 40 M M L B … ??? … 1112 Jack Step 2: Build Customer Profile on target Automatically by Proactive Solution with probabilities on the target Service M H L Sex F Rate M L H Prob=0.1 Prob=0.9 Prob = 0.2 Prob=0.8 Prob=0.5 Step 3: Search Actions for Maximal Profit Proactive Solution searches more desired nodes in the profile… ID Name Age Sex Service Rate Prof … Retained … … … … … … … … … 40 M M L B … ??? 1112 Jack Jack: …, Service = M, Sex = M, … Profit =$4000 Service M H L Sex F Rate M L Prob=0.1 H Prob gain = -0.1 Serv: MH E.Profit= -400 Rate: ?L Cost= $500 E.Net Profit= -900 Prob=0.9 Prob gain = 0.7 E Profit= $2800 Cost =  E Net Profit= -  Prob = 0.2 Prob=0.8 Prob gain = 0.6 E Profit=$2400 E.Profit=$2400 Cost=$800 E NetProfit=$1600 E.NetProfit=$1600 Prob=0.5 Prob gain = 0.3 E Profit=$1200 Cost=$400 E NetProfit=$800 Step 4: Action Deployment ID Name Prob Actions diff 1112 Jack … 0.6 3010 Sue 0.5 3421 Bill … Action costs Service: M  H $800 Rate: L  M SigAcc: 0  1 $500 Service: L  M N/A NetProfit … $1600 … $700 $0 • Selective deployment: human intelligence, … • Customer segmentation by actions Practical Issue: Resource is Bounded  Limited number of account managers   Thus, the number of customer segments is bounded Research: how to generate no more than K customer segments, such that    We call this the bounded segmentation problem (BSP) Limited number of marketing actions    for each segment, find a set of common actions to apply Thus, types of actions are limited We call this the Bounded Attribute Set Problem (BASP) Both problems are NP-hard. The Bounded Segmentation Problem    Resources are bounded! Group (potential) negative-class customers into prespecified k customer segments. Recommend “near optimal” actions to help each of the k customer segments switch to a more profitable positive class.    Each segment is applied by the same actions (same manager) The expected net profit is to be maximized  Each action may have a different cost and bring different profits The Bounded Segmentation Problem is NPComplete  Equivalent to maximum coverage problem.   NP-hard problem! We seek approximate solutions! The Bounded Segmentation Problem: Greedy Algorithm 1. Discover who are negative-class customers.  2. build decision tree as the classifier Group negative-class leaf nodes into k customer segments using greedy algorithm.    Each customer segment  one action set The total profit gain by applying such k action sets can be maximized. Algorithm is based on finding the current largest coverage in linear time An Example: K=2 Service L H Status A  Rate B C D L1 L2 L3 L4 0.9 0.2 0.8 0.5 If we want to find two customer segments (k=2)   It is more profitable to transform L2L1 and L4L3 than others Profit gain = (0.9-0.2)*1-0.2 + (0.8-0.5)*1-0.1=0.7. cost Experiment on Mutual Fund Data   GreedyBSP can find k customer segments with maximal profit. Result is very close to those found by OptimalBSP. GreedyBSP is more scalable than OptimalBSP. Summary From decision-tree model building to extracting actions for profit  Goal: maximal net profit  Resource is bounded  Design optimization solutions for action extraction  BASP and BSP   Future: explore more efficient solutions