Ensemble Methods with Data Streams
Jungbeom Lee
CS240B
Outline
Intro
Ensemble in Machine learning
Online ensemble algorithms
Future work
Intro
Previous class: Data Streams Classifiers
Ensemble methods
Online algorithm
Classifiers
• The batch classification problem:
– Given a finite training set D = {(x, y)}, where y ∈ {y1, y2, …, yk} and |D| = n, find a function y = f(x) that can predict the y value for an unseen instance x
• The data stream classification problem:
– Given an infinite sequence of pairs of the form (x, y), where y ∈ {y1, y2, …, yk}, find a function y = f(x) that can predict the y value for an unseen instance x
• Example applications:
– Fraud detection in credit card transactions
– Topic classification in a news aggregation site, e.g. Google News
– Translator for foreign languages
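The difference between the two problem settings can be sketched in code. The example below is only an illustration, not something shown in the slides: the class name OnlineCentroidClassifier and the helper fit_batch are hypothetical, and a nearest-class-mean rule is assumed as the learner because it can be updated one (x, y) pair at a time with constant memory per class.

```python
from collections import defaultdict


class OnlineCentroidClassifier:
    """Hypothetical nearest-class-mean learner, updated one (x, y) pair at a time."""

    def __init__(self):
        self.counts = defaultdict(int)  # number of examples seen per class
        self.means = {}                 # running mean vector per class

    def update(self, x, y):
        # Incremental mean: memory use does not grow with the stream length.
        self.counts[y] += 1
        n = self.counts[y]
        if y not in self.means:
            self.means[y] = list(x)
        else:
            mean = self.means[y]
            for i, xi in enumerate(x):
                mean[i] += (xi - mean[i]) / n

    def predict(self, x):
        # Return the class whose running mean is closest to x.
        def sq_dist(mean):
            return sum((a - b) ** 2 for a, b in zip(x, mean))
        return min(self.means, key=lambda label: sq_dist(self.means[label]))


def fit_batch(dataset):
    """Batch setting: the finite training set D is available up front."""
    model = OnlineCentroidClassifier()
    for x, y in dataset:
        model.update(x, y)
    return model


# Stream setting: the same update is applied as each pair arrives,
# and the model can be asked for a prediction at any point.
stream_model = OnlineCentroidClassifier()
for x, y in [((0.0, 0.0), "a"), ((5.0, 5.0), "b"), ((0.0, 1.0), "a")]:
    stream_model.update(x, y)
```

In the stream setting there is no point at which training is finished, so the learner has to be usable after every update.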
Motivations
• Online mining differs from static mining
Data volume
◦ impossible to mine the entire data set at one time
◦ can only afford constant memory per data sample
Changing data characteristics
◦ previously learned models become invalid
Cost of learning
◦ model updates can be costly
◦ can only afford constant time per data sample
Ensemble
A set of classifiers whose individual decisions are combined in some way to classify new examples
An ensemble of classifiers can be more accurate than any of its individual members
One key to a successful ensemble is to use individual classifiers with error rates below 0.5
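Why the 0.5 threshold matters can be checked with a short calculation. The sketch below assumes a two-class problem, simple majority voting, and classifiers whose errors are independent; that independence assumption is an idealization and is not stated in the slides. The function name majority_vote_error is hypothetical.

```python
from math import comb


def majority_vote_error(n_classifiers, p_error):
    """Probability that more than half of n independent classifiers are wrong.

    Assumes a two-class problem and that every classifier errs independently
    with the same probability p_error (an idealization).
    """
    k_min = n_classifiers // 2 + 1  # smallest number of wrong votes that wins
    return sum(
        comb(n_classifiers, k)
        * p_error ** k
        * (1 - p_error) ** (n_classifiers - k)
        for k in range(k_min, n_classifiers + 1)
    )


# With 21 voters, an individual error rate below 0.5 shrinks under voting,
# while an individual error rate above 0.5 grows.
good = majority_vote_error(21, 0.3)
bad = majority_vote_error(21, 0.6)
```

Under these assumptions the ensemble error for p_error = 0.3 falls far below 0.3, and for p_error = 0.6 it rises above 0.6, which is why weak members are only useful when they beat random guessing.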
Reasons
Ensemble methods
Manipulating the Training Examples
◦ Bagging
◦ Adaboost
Injecting Randomness
◦ C4.5 decision tree algorithm
Bagging algorithm
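The slide content for this algorithm did not survive in the transcript, so the following is a minimal sketch of standard batch bagging rather than a reproduction of the slide. It assumes one-dimensional inputs, labels in {-1, +1}, and a threshold rule (decision stump) as the base learner; train_stump, bagging, and vote are hypothetical names.

```python
import random
from collections import Counter


def train_stump(data):
    """Fit a one-feature threshold rule on (x, y) pairs with y in {-1, +1}."""
    best = None
    for threshold in sorted({x for x, _ in data}):
        for sign in (1, -1):
            errors = sum(
                1 for x, y in data
                if (sign if x >= threshold else -sign) != y
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, sign)
    _, threshold, sign = best
    return lambda x: sign if x >= threshold else -sign


def bagging(data, n_models, rng):
    """Train each model on a bootstrap sample: len(data) draws with replacement."""
    models = []
    for _ in range(n_models):
        sample = [rng.choice(data) for _ in data]
        models.append(train_stump(sample))
    return models


def vote(models, x):
    """Combine the individual decisions by unweighted majority vote."""
    return Counter(model(x) for model in models).most_common(1)[0][0]


train = [(float(x), -1) for x in range(5)] + [(float(x), 1) for x in range(5, 10)]
ensemble = bagging(train, n_models=25, rng=random.Random(0))
```

Each bootstrap sample omits some training points and repeats others, so the stumps differ slightly; the vote smooths out those differences.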
Online bagging algorithm
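The transcript does not show which online bagging variant was presented. The sketch below assumes the widely used scheme usually credited to Oza and Russell, in which every base model is updated k times on each arriving example, with k drawn from a Poisson(1) distribution in place of bootstrap resampling. The base learner (a running class mean on a single feature) and all class and function names are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def poisson(lam, rng):
    """Draw from Poisson(lam) using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


class RunningMeanClassifier:
    """Hypothetical online base learner: nearest running class mean (1-D)."""

    def __init__(self):
        self.n = {}
        self.mean = {}

    def update(self, x, y):
        self.n[y] = self.n.get(y, 0) + 1
        m = self.mean.get(y, 0.0)
        self.mean[y] = m + (x - m) / self.n[y]

    def predict(self, x):
        if not self.mean:
            return None  # nothing seen yet
        return min(self.mean, key=lambda label: abs(x - self.mean[label]))


class OnlineBagging:
    def __init__(self, n_models, rng):
        self.models = [RunningMeanClassifier() for _ in range(n_models)]
        self.rng = rng

    def update(self, x, y):
        # Each model sees the example k ~ Poisson(1) times (possibly zero).
        for model in self.models:
            for _ in range(poisson(1.0, self.rng)):
                model.update(x, y)

    def predict(self, x):
        votes = Counter(model.predict(x) for model in self.models)
        votes.pop(None, None)  # ignore models that have seen no data
        return votes.most_common(1)[0][0] if votes else None
```

Each update touches every model a constant expected number of times and stores no past examples, which matches the constant time and constant memory per sample constraints listed under Motivations.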
Online weighted bagging algorithm
AdaBoost algorithm
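The slide body is missing from the transcript, so this is a minimal sketch of the standard batch AdaBoost procedure, not a reproduction of the slide. It assumes one-dimensional inputs, labels in {-1, +1}, and decision stumps as weak learners; best_stump, adaboost, and predict are hypothetical names.

```python
import math


def best_stump(data, weights):
    """Return (weighted_error, threshold, sign) with the lowest weighted error."""
    best = None
    for threshold in sorted({x for x, _ in data}):
        for sign in (1, -1):
            err = sum(
                w for (x, y), w in zip(data, weights)
                if (sign if x >= threshold else -sign) != y
            )
            if best is None or err < best[0]:
                best = (err, threshold, sign)
    return best


def adaboost(data, n_rounds):
    n = len(data)
    weights = [1.0 / n] * n          # start with uniform example weights
    ensemble = []                    # list of (alpha, threshold, sign)
    for _ in range(n_rounds):
        err, threshold, sign = best_stump(data, weights)
        if err >= 0.5:               # weak learner no better than chance
            break
        err = max(err, 1e-10)        # avoid division by zero on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, sign))
        # Increase the weight of misclassified examples, decrease the rest.
        weights = [
            w * math.exp(-alpha * y * (sign if x >= threshold else -sign))
            for (x, y), w in zip(data, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Weighted vote of the stumps; ties are broken toward +1."""
    score = sum(
        alpha * (sign if x >= threshold else -sign)
        for alpha, threshold, sign in ensemble
    )
    return 1 if score >= 0 else -1


# Labels follow a + / - / + pattern that no single stump can fit.
train = [(x, 1) for x in (0, 1, 2)] + [(x, -1) for x in (3, 4, 5)] + [(x, 1) for x in (6, 7, 8)]
boosted = adaboost(train, n_rounds=3)
```

On this toy data a single stump misclassifies a third of the examples, while three boosted stumps fit all nine, because each round concentrates weight on the examples the previous stumps got wrong.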
Adaptive boosting algorithm
Experimental Results
Type of Data
Future work
Better online algorithm for Bagging
Dealing with multiple data types
References
http://web.engr.oregonstate.edu/~tgd/publications/mcs-ensembles.pdf
http://pages.bangor.ac.uk/~mas00a/papers/lkSUEMA2008.pdf
http://web.cs.ucla.edu/~zaniolo/papers/NBCAJMW77MW0J8CP.pdf
https://ti.arc.nasa.gov/m/pubarchive/archive/0962.pdf
https://engineering.purdue.edu/~givan/papers/bp.pdf
http://hanj.cs.illinois.edu/pdf/kdd03_emsemble.pdf