Collaborative Data Analysis
and Multi-Agent Systems
Robert W. Thomas
CSCE 824
15 APR 2013
Agenda
• Problem Description
• Existing Research Overview
• Limitations of Existing Solutions
• Future Research Suggestions
Problem Description
• Information Overload
• Divide and Conquer; Reconcile
• Recommender Systems and Social Media
– Content Filtering
– Collaborative Filtering
– Collaborative Data Analysis through Agents
Content Filtering
• Recommendations based on items similar to those the user has preferred previously
Collaborative Filtering (CF)
• Recommendations based on what others in a network prefer
• Different Techniques
– Memory-Based
– Model-Based
– Hybrid
Memory-Based CF
• Similarity Computation
• Prediction and Recommendation Computation
• Top-N Recommendations
Similarity Computation
• Compares Users or Items
• Correlation-Based (Pearson correlation)
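– Between two users u and v: $w_{u,v} = \frac{\sum_{i \in I}(r_{u,i} - \bar{r}_u)(r_{v,i} - \bar{r}_v)}{\sqrt{\sum_{i \in I}(r_{u,i} - \bar{r}_u)^2}\,\sqrt{\sum_{i \in I}(r_{v,i} - \bar{r}_v)^2}}$
– $i \in I$ = items both u and v have rated; $\bar{r}_u$ = avg rating of the co-rated items of the u-th user
– Between two items i and j: $w_{i,j} = \frac{\sum_{u \in U}(r_{u,i} - \bar{r}_i)(r_{u,j} - \bar{r}_j)}{\sqrt{\sum_{u \in U}(r_{u,i} - \bar{r}_i)^2}\,\sqrt{\sum_{u \in U}(r_{u,j} - \bar{r}_j)^2}}$
– $u \in U$ = users who rated both i and j; $\bar{r}_i$ = avg rating of the i-th item by those users
• Vector Cosine-Based
– $w_{i,j} = \cos(\vec{i}, \vec{j}) = \frac{\vec{i} \cdot \vec{j}}{\|\vec{i}\| \, \|\vec{j}\|}$
– R = m × n user-item matrix; $\vec{i}$, $\vec{j}$ are the m-dimensional vectors corresponding to columns i and j of R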
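The two similarity measures on this slide can be sketched in a few lines of Python. This is only an illustration, assuming ratings are stored as plain dicts (key to rating); the names `pearson`, `cosine`, `alice`, and `bob` are hypothetical and not from the original deck. The same `pearson` function covers both cases: pass two users' item ratings for the user-based weight, or two items' user ratings for the item-based weight.

```python
from math import sqrt


def pearson(ratings_a, ratings_b):
    """Pearson correlation over the keys rated in both dicts (0.0 if undefined)."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    # Means are taken over the co-rated keys only, as on the slide.
    mean_a = sum(ratings_a[k] for k in common) / len(common)
    mean_b = sum(ratings_b[k] for k in common) / len(common)
    num = sum((ratings_a[k] - mean_a) * (ratings_b[k] - mean_b) for k in common)
    den = sqrt(sum((ratings_a[k] - mean_a) ** 2 for k in common)) * sqrt(
        sum((ratings_b[k] - mean_b) ** 2 for k in common)
    )
    return num / den if den else 0.0


def cosine(vec_i, vec_j):
    """Cosine of the angle between two equal-length rating vectors."""
    dot = sum(x * y for x, y in zip(vec_i, vec_j))
    norm = sqrt(sum(x * x for x in vec_i)) * sqrt(sum(y * y for y in vec_j))
    return dot / norm if norm else 0.0


# Hypothetical example data: item -> rating for two users.
alice = {"A": 5, "B": 3, "C": 4}
bob = {"A": 4, "B": 2, "C": 3, "D": 5}
print(pearson(alice, bob))
print(cosine([5, 3, 4], [4, 2, 3]))
```

Returning 0.0 when the denominator vanishes is a design choice made for this sketch; other treatments of undefined correlations are possible.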
Prediction and Recommendation Computation
• Weighted Sum of Others’ Ratings
– $P_{a,i} = \bar{r}_a + \frac{\sum_{u \in U}(r_{u,i} - \bar{r}_u)\, w_{a,u}}{\sum_{u \in U} |w_{a,u}|}$
– Prediction P for active user a on item i
– $\bar{r}_u$ = avg rating of user u
– $w_{a,u}$ = weight between user a and user u
– $u \in U$ = users who rated item i
• Simple Weighted Average
– $P_{u,i} = \frac{\sum_{n \in N} r_{u,n}\, w_{i,n}}{\sum_{n \in N} |w_{i,n}|}$
– Prediction P for user u on item i
– $n \in N$ = all other rated items for user u
– $w_{i,n}$ = weight between items i and n
– $r_{u,n}$ = rating for user u on item n
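A minimal sketch of both prediction rules follows, assuming ratings are nested dicts (user to item to rating) and that similarity weights have already been computed and stored in dicts keyed by pairs. The function names, the sample data, and the choice to normalize by the sum of absolute weights are assumptions made for illustration.

```python
def predict_weighted_sum(active, item, ratings, user_weights):
    """User-based prediction: active user's mean plus weighted deviations of others."""
    mean_a = sum(ratings[active].values()) / len(ratings[active])
    num = den = 0.0
    for user, user_ratings in ratings.items():
        if user == active or item not in user_ratings:
            continue  # only other users who rated the item contribute
        w = user_weights.get((active, user), 0.0)
        mean_u = sum(user_ratings.values()) / len(user_ratings)
        num += (user_ratings[item] - mean_u) * w
        den += abs(w)
    return mean_a + num / den if den else mean_a


def predict_item_based(user_ratings, item, item_weights):
    """Item-based prediction: weighted average of the user's other ratings."""
    num = den = 0.0
    for other, rating in user_ratings.items():
        if other == item:
            continue
        w = item_weights.get((item, other), 0.0)
        num += rating * w
        den += abs(w)
    return num / den if den else None


# Hypothetical data: three users, items "x", "y", "z"; user "a" has not rated "z".
ratings = {
    "a": {"x": 4, "y": 2},
    "u1": {"x": 3, "y": 1, "z": 5},
    "u2": {"x": 1, "y": 5, "z": 3},
}
user_weights = {("a", "u1"): 0.5, ("a", "u2"): -0.5}
item_weights = {("z", "x"): 0.75, ("z", "y"): 0.25}

print(predict_weighted_sum("a", "z", ratings, user_weights))
print(predict_item_based(ratings["a"], "z", item_weights))
```

Falling back to the active user's mean (or `None`) when no weighted neighbors exist is one possible convention, not something the slide specifies.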
Top-N Recommendations
• Item-Based
• User-Based
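One common way to build an item-based top-N list is sketched below, assuming item-to-item similarities are already available as a nested dict. The function name `top_n_item_based` and the sample similarities are hypothetical; the idea is to score each candidate by its summed similarity to items the user already has, drop items the user already has, and keep the N best.

```python
from heapq import nlargest


def top_n_item_based(user_items, item_sims, n):
    """Return up to n unseen items ranked by summed similarity to the user's items."""
    scores = {}
    for owned in user_items:
        for candidate, sim in item_sims.get(owned, {}).items():
            if candidate not in user_items:  # never recommend what the user has
                scores[candidate] = scores.get(candidate, 0.0) + sim
    return nlargest(n, scores, key=scores.get)


# Hypothetical similarities: item -> {similar item -> similarity}.
item_sims = {
    "A": {"B": 0.5, "C": 0.25, "D": 0.125},
    "B": {"A": 0.5, "C": 0.5},
}
print(top_n_item_based({"A", "B"}, item_sims, 2))
```

A user-based variant would first find the most similar users and then aggregate their items in the same way.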
Model-Based CF
• Bayesian Belief Net
• Clustering
• Regression-Based
• Markov Decision Process (MDP)-Based
• Latent Semantic
Bayesian Belief Net
• Bayesian logic – decision making and inferential statistics
• Simple Bayesian
– Memory-Based
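– $\text{class} = \arg\max_{j \in \text{classSet}} \; p(\text{class}_j) \prod_{o} P(X_o = x_o \mid \text{class}_j)$
– Laplace Estimator to avoid a conditional probability of 0
– $P(X_i = x_i \mid Y = y) = \frac{\#(X_i = x_i, Y = y) + 1}{\#(Y = y) + |X_i|}$, where $|X_i|$ is the number of distinct values $X_i$ can take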
• Tree Augmented naïve Bayes and naïve Bayes optimized by Extended Logistic Regression (ELR)
– Require longer training times to produce results better than simple Bayesian and Pearson correlation
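The simple Bayesian classifier with the Laplace estimator can be sketched as follows. This is an illustrative implementation under stated assumptions: each sample is a tuple of categorical feature values (for example, other users' ratings of an item), the label is the class to predict, and the names `train`, `classify`, `samples`, and `labels` are hypothetical.

```python
from collections import Counter, defaultdict


def train(samples, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    feature_values = defaultdict(set)      # feature index -> distinct values seen
    for x, y in zip(samples, labels):
        for i, value in enumerate(x):
            feature_counts[(i, y)][value] += 1
            feature_values[i].add(value)
    return class_counts, feature_counts, feature_values


def classify(x, model):
    """Pick the class maximizing p(class) * prod P(X_o = x_o | class)."""
    class_counts, feature_counts, feature_values = model
    total = sum(class_counts.values())
    best_label, best_score = None, -1.0
    for label, count in class_counts.items():
        score = count / total
        for i, value in enumerate(x):
            # Laplace estimator: add 1 to the count and |X_i| to the denominator.
            score *= (feature_counts[(i, label)][value] + 1) / (
                count + len(feature_values[i])
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: two features per sample, binary class label.
samples = [("like", "like"), ("like", "dislike"),
           ("dislike", "dislike"), ("dislike", "like")]
labels = ["like", "like", "dislike", "dislike"]
model = train(samples, labels)
print(classify(("like", "dislike"), model))
```

Here $|X_i|$ is approximated by the number of distinct values observed in training, which is an assumption of this sketch.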
Clustering
• Cluster: collection of similar objects, dissimilar
to objects in other clusters
– Pearson correlation can be used
• Three Categories
– Partitioning
– Density-based
– Hierarchical
• Often an Intermediate Step
Regression-Based
• Use approximation of ratings to make
predictions against a regression model
• Suited to situations where rating vectors have large Euclidean distances but very high similarity scores (e.g., Pearson correlation or vector cosine)
MDP-Based
• Sequential Optimization Problem
• <S,A,R,Pr>
– S = {states}
– A = {actions}
– R = {rewards} for r(s,a,s’)
– Pr = {transition probabilities} for pr(s,a,s’)
• Partially Observable MDP (POMDP)
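To make the <S,A,R,Pr> tuple concrete, here is a small value-iteration sketch for a fully observable MDP. Value iteration is one standard way to solve such a problem, not necessarily the method used in the cited work; the discount factor `gamma`, the tolerance `tol`, and the two-state toy problem are assumptions introduced for this example.

```python
def value_iteration(states, actions, reward, prob, gamma=0.9, tol=1e-6):
    """Return the optimal value of each state for the MDP <S, A, R, Pr>.

    reward(s, a, s2) and prob(s, a, s2) are callables; gamma is an assumed
    discount factor that is not part of the tuple on the slide.
    """
    values = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            # Best expected return over all actions available in state s.
            best = max(
                sum(
                    prob(s, a, s2) * (reward(s, a, s2) + gamma * values[s2])
                    for s2 in states
                )
                for a in actions
            )
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            return values


# Hypothetical toy problem: two states, "go" flips the state, "stay" keeps it,
# and arriving in state 1 earns a reward of 1.
def prob(s, a, s2):
    target = 1 - s if a == "go" else s
    return 1.0 if s2 == target else 0.0


def reward(s, a, s2):
    return 1.0 if s2 == 1 else 0.0


values = value_iteration([0, 1], ["stay", "go"], reward, prob)
print(values)
```

A POMDP would replace the known state with a belief distribution over states, which this sketch does not attempt.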
Latent Semantic
• Uses statistical modeling to discover
additional communities or profiles
Network Trust
• We’re all mad here; I’m mad; you’re mad.
• Opinions of some contacts are valued more than those of others under certain conditions
• Accounting for this can increase CF accuracy
• Semantic Knowledge
• Social Tie-Strength
Hybrid CF
• CF + Content-Based
• CF + CF
• CF + CF and/or Content-Based
Limitations of Existing Solutions
• Time/Accuracy Trade-Offs
• Noisy Data
• Data Sparsity (New User)
• Scalability
• Synonymy
• Gray Sheep
• Shilling Attacks
• Privacy
Future Research Suggestions
• Hybrids
• Semantics
• Trust
• Parallel Processing
– Multi-Agent Systems
BACKUP
References
• Su, Xiaoyuan, and Taghi M. Khoshgoftaar. "A survey of
collaborative filtering techniques." Advances in
Artificial Intelligence 2009 (2009): 4.
• Chen, Wei, and Simon Fong. "Social network
collaborative filtering framework and online trust
factors: a case study on Facebook." Digital Information
Management (ICDIM), 2010 Fifth International
Conference on. IEEE, 2010.
• O'Donovan, John, and Barry Smyth. "Trust in
recommender systems." Proceedings of the 10th
international conference on Intelligent user interfaces.
ACM, 2005.