Discovering Evolutionary Theme Pattern from Text
Qiaozhu Mei
University of Illinois at Urbana-Champaign

• Many text collections carry time stamps, which suggests that they contain temporal patterns
• Existing work in text mining has focused on a single flat collection of text and is therefore inadequate for temporal text mining
• We aim to develop methods for discovering evolutionary theme patterns (ETPs) from text
• Applications: news, literature, email, customer reviews, etc.
• What ETPs can help with:
    • Discovering implicit temporal patterns
    • Summarizing topics with temporal structure
    • Supporting navigation and inference in time order
    • e.g. revealing research trends

Evolutionary Theme Patterns
• Theme Evolution Graph: models the content evolution of themes
• Theme Life Cycles: model the strength change of themes

Methodology
• Modeling content evolution of themes (Theme Evolution Graph):
    • Pipeline: collection with time stamps -> partitioning -> theme extraction -> theme transitions -> Theme Evolution Graph and theme threads
    • Theme extraction: use a probabilistic mixture model (each theme is a probability distribution, or unigram language model)
    • Assume that each document is generated by multiple themes
    • Parameters can be estimated with an EM algorithm
    • Evaluating theme transitions: use KL divergence
• Strength change of themes (Theme Life Cycles):
    • Pipeline: collection with time stamps -> extracting global significant themes -> modeling theme shifts with an HMM -> decoding the collection -> computing theme strength -> theme life cycles
    • HMM: fix the states and output probabilities
    • Train the transition probabilities with the Baum-Welch algorithm, with the whole collection as the example sequence

Experiments
• Tsunami news data from 10 sources
• KDD abstracts from 1999 to 2004
• SIGIR full texts from 1978 to 2004

Experiment Results
• Example: Theme Representations
    [Figure (I): Global significant themes in the KDD abstract data set]
• Example: Theme Evolution Graphs
    [Figure (II): Theme evolution graph and theme threads of the Tsunami data set. Time axis: Dec 24th, Jan 05th, Jan 15th, Jan 23rd, Jan 31st, Feb 08th. Nodes are marked as new or ending themes.]
• Example: Theme Life Cycles
    [Figure (III): Global theme life cycles of Tsunami reports from CNN]
    [Figure (IV): Global theme life cycles of Tsunami reports from XINHUA. Axes: time offsets (days) vs. theme strength.]
    [Figure (V): Global theme life cycles of KDD abstracts. Axes: time (year), 1999 to 2004, vs. normalized strength of theme. Themes: Biology Data, Web Information, Time Series, Classification, Association Rule, Clustering, Business.]
    [Figure (VI): Global theme life cycles of SIGIR full texts]
• Theme labels recoverable from the Tsunami figures (II) to (IV): Research in US, Research in Japan, Research in China, Aid from UN, Aid from U.S., Aid from UK, Aid from the world, Aid and supports from the world, Aid for Children, Donation Match, Donation Events, Donation Concerts (DC) in UK, DC in Japan, DC in Hong Kong, Personalized Experiences, Personal Experience from Survivors, Statistics, Statistics of death, loss and damage, Lessons and Research inspired, Political Criticism
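The poster says theme transitions are evaluated with KL divergence but gives no detail. The following is a minimal sketch of one way this could work, not the authors' implementation. It treats each theme as a unigram language model (a word-to-probability mapping) and links a theme in an earlier time span to a theme in a later span when the divergence between them falls below a threshold. The direction of the divergence, the additive smoothing constant `eps`, the threshold value, and all theme names and word probabilities are assumptions chosen for illustration.

```python
import math


def kl_divergence(p, q, eps=0.01):
    """Return D(p || q) for two word distributions (dict: word -> probability).

    Both distributions are extended to the union vocabulary, smoothed with
    an additive constant eps (an assumption), and renormalized so that the
    divergence stays finite when a word is missing from one theme.
    """
    vocab = set(p) | set(q)
    p_s = {w: p.get(w, 0.0) + eps for w in vocab}
    q_s = {w: q.get(w, 0.0) + eps for w in vocab}
    z_p = sum(p_s.values())
    z_q = sum(q_s.values())
    total = 0.0
    for w in vocab:
        pw = p_s[w] / z_p
        qw = q_s[w] / z_q
        total += pw * math.log(pw / qw)
    return total


def find_transitions(earlier, later, threshold):
    """Return (earlier_name, later_name, divergence) for every theme pair
    whose divergence D(later || earlier) is below the threshold."""
    edges = []
    for name_a, theme_a in earlier.items():
        for name_b, theme_b in later.items():
            d = kl_divergence(theme_b, theme_a)
            if d < threshold:
                edges.append((name_a, name_b, d))
    return edges


# Hypothetical themes from two consecutive time spans (toy numbers).
earlier = {
    "aid_a": {"aid": 0.5, "donation": 0.3, "relief": 0.2},
    "research_a": {"warning": 0.5, "system": 0.3, "earthquake": 0.2},
}
later = {
    "aid_b": {"aid": 0.4, "donation": 0.4, "concert": 0.2},
    "research_b": {"warning": 0.4, "system": 0.4, "sensor": 0.2},
}

for src, dst, d in find_transitions(earlier, later, threshold=1.0):
    print(f"{src} -> {dst} (divergence {d:.3f})")
```

On this toy data only the two same-topic pairs fall under the threshold, so the sketch yields two edges of a theme evolution graph. In practice the threshold would have to be tuned per collection.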