Knowledge Discovery in the Digital Library
Access tools for mining science
ICSTI Public Workshop
Presented by: Bernard Dumouchel, Director-General
February 3, 2006

Overview
• Knowledge Discovery
  – Linked-Literature Analysis
  – Main Path Analysis
• Digital Libraries
• Integrating access into research

Knowledge Discovery
The process of transforming data into previously unknown or unsuspected relationships. (Trybula 1997)
• Process for discovering and extracting new information:
  – Statistics
  – Pattern recognition
  – Machine learning
  – Visualization
• Goal of knowledge discovery is to identify higher-level, more abstract relationships between texts.

Knowledge Discovery
              Data Mining            Knowledge Discovery
Measure       Discrete quantities    Relationships
Expression    Probabilities          Interpretations
• Computationally intensive
• Augments human expertise:
  – Interactive, mental process

Linked Literature Analysis
• Don Swanson
  – Specialization → Balkanization of science
  – “Undiscovered Public Knowledge”
  – Transitory links between disjoint concepts
    » A ∩ C = Null
    » ABC

Linked Literature Analysis
[Figure labels: Raynaud’s Disease, Migraine, Blood Viscosity, Calcium, Fish Oils, Magnesium, Somatomedin C, HGH, Arginine]

Linked Literature Analysis
• ARROWSMITH
  – Neil Smalheiser, MD, PhD
  – Interactive software that extends the power of the MEDLINE search
  – http://arrowsmith.psych.uic.edu/arrowsmith_uic/index.html
• CISTI research to generalize Linked Literature Analysis to other scientific domains

Main Path Analysis
• A type of social network analysis
• Citation = formal record of intellectual link
• Citation network is a social network of science
• Study of webs of relationships between seemingly disorganized items

Main Path Analysis
• Norman Hummon & Patrick Doreian (1989)
• Sequence of articles that best represent the development of a research field
• Condenses web of relationships into a concise pathway

Main Path Analysis
[Figure; axis label: Time →]

Knowledge Discovery
• Analyze relationships, interpret structure of science
• Information is plentiful; knowing how it fits together is knowledge
• Main Path Analysis and Linked Literature Analysis uncover meaningful relationships that suggest new knowledge

Roles of the Digital Library
• Institutional repositories
• Preservation of research data
• Systems that make information useful
Digital libraries are systems that make digital collections come alive, that make them usefully accessible, and that make them useful for accomplishing work. (Lynch, 2002)

Access for e-science
• Access makes research easier
• Access tools with analysis make research faster, more powerful
• Digital Library’s challenge: develop and offer tools where research and access can be combined

Summary
• Seamless access is not just about convenience
• Knowledge Discovery tools enable e-science to be a more integral part of research
• Research libraries are the labs that make information useful
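
Illustration: Linked Literature Analysis
The A, B, C idea from the Linked Literature Analysis slides can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the ARROWSMITH implementation: the four records are hypothetical toy data rather than MEDLINE content, and the helper names cooccurrences and hidden_links are invented for this example. Starting from a concept A, it collects every concept C that never appears in the same article as A (A ∩ C = Null) but is reachable through a shared intermediate concept B.

```python
from collections import defaultdict

# Hypothetical toy corpus: each record is the set of concepts one article mentions.
# These are illustrative placeholders, not real MEDLINE records.
articles = [
    {"Raynaud's disease", "blood viscosity"},
    {"blood viscosity", "fish oils"},
    {"migraine", "calcium"},
    {"calcium", "magnesium"},
]


def cooccurrences(records):
    """Map each concept to the set of concepts it shares an article with."""
    linked = defaultdict(set)
    for concepts in records:
        for concept in concepts:
            linked[concept] |= concepts - {concept}
    return linked


def hidden_links(records, a):
    """Return {C: set of B} where A-B and B-C co-occur but A and C never do."""
    linked = cooccurrences(records)
    candidates = defaultdict(set)
    for b in linked[a]:                      # B: concepts seen together with A
        for c in linked[b]:                  # C: concepts seen together with B
            if c != a and c not in linked[a]:
                candidates[c].add(b)         # A and C are disjoint, bridged by B
    return dict(candidates)


raynaud_links = hidden_links(articles, "Raynaud's disease")
migraine_links = hidden_links(articles, "migraine")
print(raynaud_links)
print(migraine_links)
```

On this toy corpus the only candidate for Raynaud's disease is fish oils, bridged by blood viscosity. A real tool would also need to rank candidates and filter out very common intermediate terms, which this sketch does not attempt.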