Graph-RAT Overview
By Daniel McEnnis

What is Graph-RAT
• Relational Analysis Toolkit
• Database abstraction layer
• Evaluation platform
• Robustly evaluate all the different ways of performing recommendation

Kinds of Analysis
• Recommendation Systems
• Data Mining
• Relational Machine Learning
• MIR
• Document retrieval

Talk Outline
• Base Components
• Queries
• Algorithms
• Schedulers
• Graph-RAT Language
• Conclusion and Examples

Base Components
• Graphs
  • Actors
  • Links
  • Properties
[Figure: example graph with actors A, B, C, D, and E; sample properties shown include Name (John), Age (22), Hobbies (Hiking, Biking), and Library [Vector]]

Properties
• Variables of Graph-RAT
• Can be arbitrary Java types
• Can be attached to anything
• Unique ID string for each object
• Accessed only as sets, not as objects

Data View
• Hyper-graph structure defined by the set of actors and links in a graph
• Accessible from the enclosing graph
• Can be cyclic
[Figure: example graph over actors A, B, C, D, and E]

Metadata View
• Not constructed by default
• Implicit graph described by modes and the relations between them
• Needed for relational machine learning
[Figure: mode "User" with relation "Friend"]

Query Language
• Constructs sets retrieved from a graph
• Functional structure
• Similar to SQL
• 4 types
  • Graph Queries
  • Actor Queries
  • Link Queries
  • Property Queries

Query Structure
• Cascading queries in a LISP-style syntax
• Each child query is of a different type
• Restrictions can be added at runtime

Query Examples
LinkByActor(
    false,
    ActorByMode(false, "Target", ".*"),
    ActorByMode(false, "Source", ".*"),
    SetOperation.XOR)

Query Comparisons
• Similar to the JENA interface
• Construction is similar to the Jung system
• Implements all SQL queries that do not require temporary tables

0.4.3 Query
• Uses graph primitives instead of Queries
• Algorithms use hard-coded GraphByID

Algorithms
• Functions that execute over a given graph
• Metadata is a part of the algorithm
• Excepting output algorithms, no side effects are permitted.
  execute(Graph graph)
• Properties utilized or created are declared up front.
  IODescriptor getInput()
  IODescriptor getOutput()

Propositional Algorithms
• Utilizes an aggregator function as a parameter
• Crosses all ways of shifting data
  • Aggregate By Link
  • Aggregate By Link Property
  • Aggregate On Graph
  • Graph To Actor
  • Link To Graph
  • Graph To Graph

Aggregator Functions
• Map one or more elements to an equal or smaller number of elements
• Examples
  • Statistical Moments
  • Arithmetic Operations
  • Null Aggregation
  • Concatenation

Social Network Analysis Algorithms
• Prestige Algorithms
  • Degree
  • Betweenness
  • Closeness
  • Page Rank
  • HITS
• Graph Triples

Classification Algorithms
• Machine Learning Primitives
• Uses Weka
• Separate algorithms for training and classifying

Clustering Algorithms
• Several graph-based algorithms
  • Weak Component Clustering
  • Strong Component Clustering
  • Edge Betweenness Clustering
  • Girvan-Newman Edge Betweenness
• Also has primitives calling Weka on vector data

Similarity Algorithms
• Comparisons between modes
• Types of Similarity
  • Similarity By Link
  • Similarity By Property
  • Graph Similarity
• Distance Functions
  • All Weka distance functions
  • KLDistance
  • Exponential Distance

Collaborative Filtering Algorithms
• Traditional recommendation algorithms
  • Item to Item
  • User to User
  • Associative Mining

Array-Based Algorithms
• Transform To Array
• Principal Component Analysis

Evaluation
• All forms of evaluating results
  • Set Based (precision and recall)
  • Weighted Set (Correlations)
  • Ordered Lists (Kendall Tau, Half Life)
• Cross-Validation algorithms
  • By Actor
  • By Link
  • By Graph

Data Acquisition
• Components for acquiring source data
• File Reader Types
  • Reading different file formats
• Web Crawling Types
  • LiveJournal or LastFM
• Connection Types
  • Links different sets together

Web Crawler
• Custom multi-threaded web crawler
• Dynamic parsers
• Properties are passed between both crawls and parser execution
• Stop and filter conditions are parameterized

Existing Parsers
• Base HTML parsing
• XML Parsing (SAX)
  • LiveJournal FOAF
  • LastFM REST services
  • Graph-RAT documents
  • Yahoo search queries

Comparisons
• SQL
• LINQ
• Matlab
• Other graph packages
• Prolog?

Embedded Use
• Dynamic Loading
• AbstractFactory abstract superclass
• Example - Retrieving links to YouTube videos from GData

Graph-RAT Language
• Base Graph-RAT:
  • Data Acquisition components executed
  • For each algorithm entry:
    • Graph Query selects a set of graphs
    • Algorithm is executed over each graph
• Cross-Validation Graph-RAT
  • Mode, relation, or graph chosen in advance
  • Data Acquisition components run once
  • Algorithm entries rerun for each fold
• Statistical Graph-RAT
  • List of cross-validation schedulers
  • Statistical metrics of which performed better

User To User Collaborative Filtering Example
• Aggregate By Link (Artist->User)
• Similarity By Link (User->User)
• Aggregate By Link (User->User)
• Property to Link (User->Artist)

Setup Example
<Scheduler class="BasicScheduler">
  <Graph>
    <MemGraph/>
  </Graph>
  …
</Scheduler>

Data Acquisition
<DataAcquisition>
  <Class>Crawl LastFM</Class>
  <Name>Crawl LastFM</Name>
  <MemGraph/>
  <Property><Name>Proxy</Name>
    <Value>proxy.waikato.ac.nz</Value>
  </Property>
  …
</DataAcquisition>

Query Entry
<Algorithm>
  <Query>
    <GraphByID>
      <Pattern>.*</Pattern>
    </GraphByID>
  </Query>
</Algorithm>

Algorithm Entry
<Algorithm>
  <Query>…</Query>
  <Class>GraphTriples</Class>
  <Name>Graph Triples</Name>
  <Property><Name>Relation</Name>
    <Value>Friends</Value>
  </Property>
  <Property><Name>Destination</Name>
    <Value>TriplesVector</Value>
  </Property>
  …
</Algorithm>

Future Work
• Stabilization - 0.5.1 to beta
• Statistical testing on result sets
• Upgrading the GUI
• Memory performance upgrades
• Octave Integration

Questions?
• http://graph-rat.sourceforge.net
• Stable (beta) release is 0.4.3
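
Appendix sketch: User To User Collaborative Filtering

As a rough illustration of the User To User Collaborative Filtering example from the talk, the four steps can be sketched with plain Java collections. This is not Graph-RAT code and uses none of its API. The class name UserToUserSketch, the methods cosine and recommend, the choice of cosine similarity, and the sample listeners are all assumptions made for illustration, and the mapping of each slide step to a piece of code is only one possible reading.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; none of these names come from Graph-RAT.
class UserToUserSketch {

    // Assumed similarity measure: cosine between two sparse artist vectors.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            double v = e.getValue();
            normA += v * v;
            Double other = b.get(e.getKey());
            if (other != null) {
                dot += v * other;
            }
        }
        for (double v : b.values()) {
            normB += v * v;
        }
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / Math.sqrt(normA * normB);
    }

    // listens: user -> (artist -> weight). This map plays the role of step 1,
    // where each user's artist links have been aggregated into one vector.
    static Map<String, Double> recommend(String user,
                                         Map<String, Map<String, Double>> listens) {
        Map<String, Double> mine = listens.get(user);
        Map<String, Double> scores = new TreeMap<>();
        for (Map.Entry<String, Map<String, Double>> other : listens.entrySet()) {
            if (other.getKey().equals(user)) {
                continue;
            }
            // Step 2: a user-to-user similarity "link".
            double sim = cosine(mine, other.getValue());
            if (sim <= 0.0) {
                continue;
            }
            // Step 3: aggregate neighbours' vectors, weighted by similarity.
            for (Map.Entry<String, Double> e : other.getValue().entrySet()) {
                if (!mine.containsKey(e.getKey())) {
                    scores.merge(e.getKey(), sim * e.getValue(), Double::sum);
                }
            }
        }
        // Step 4: each entry stands for a recommended user-to-artist link.
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> listens = new HashMap<>();
        listens.put("alice", Map.of("A", 1.0, "B", 1.0));
        listens.put("bob", Map.of("A", 1.0, "B", 1.0, "C", 1.0));
        listens.put("carol", Map.of("D", 1.0));
        System.out.println(recommend("alice", listens));
    }
}
```

In this toy data set only bob overlaps with alice, so the single recommendation for alice is artist C, scored by bob's similarity to her. In Graph-RAT itself each step would be a separate algorithm entry in the scheduler configuration rather than one method.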