Metadata-based Discovery: Experience in Crystallography
Monica Duke, m.duke@ukoln.ac.uk
UKOLN, University of Bath, UK
http://www.ukoln.ac.uk/
A centre of expertise in digital information management
UKOLN is supported by:

Discovery in the curation life-cycle
“Digital Curation itself is the active management of data over the life-cycle of scholarly and scientific interest; it is the key to reproducibility and re-use. Metadata for resource discovery and retrieval are important, with mark-up on time/place referencing as well as subject description and linkage to discipline based ontologies providing key research foci.”
Chris Rusbridge et al.
http://www.dcc.ac.uk/docs/publications/DCC_Sardinia_paper_final.pdf

Digital Library Infrastructures
• Historically, cross-search and discovery protocols an area of interest and research
• Z39.50 perceived to have barriers/limitations
• OAI-PMH developed using a harvesting model
• http://www.openarchives.org/

The OAI-PMH
[Diagram: Data providers; Harvesting based on OAI-PMH; Service providers]

The OAI-PMH
• OAI Protocol for Metadata Harvesting
• simple protocol for sharing metadata records between applications
• currently at version 2.0
• based on HTTP, XML, XML Schema and XML namespaces
• allows a harvester to ask a remote repository for some or all of its metadata records
  • where ‘some’ is based on date-stamps, sets, metadata formats

Metadata in the eBank UK project
• Simple Dublin Core www.dublincore.org
• Intended for resource discovery
• Compatible with OAI-PMH
• Qualified to specify ‘vocabularies’
• Refinements: aid interpretation of element value
• E.g. <dc:subject xml:lang="en">seafood</dc:subject>
• “Dumbing-down” principle applies

Metadata terms
• Creator
• Rights
• Date
• Type
• Identifier
• Subject
InChI
ChemicalFormula
<dc:subject xsi:type="ebankterms:CompoundClass">Organic</dc:subject>
Specified using XML schema and documented using an Application Profile
http://www.rdn.ac.uk/oai/ebank/20060310/ebank_dc.xsd
http://www.ukoln.ac.uk/projects/ebank-uk/schemas/profile/

Information sources for Crystallography
• Cross-discipline sources
  • OAIster
  • DAREnet
• Discipline-specific
  • ChemRefer
  • Chemistry Central
  • Crystallography Open Database
  • Reciprocal Net
Texts/publications, chemistry general
Data, crystallography

The discovery landscape
• Some within OAI-PMH infrastructure (metadata-based)
• Variety of (human) search interfaces (simple to advanced)
• Well established sources
  • Cambridge Structural Database
  • Protein Data Bank

OAIster
• An OAI-PMH aggregator
• Wide-ranging and inclusive: Any repository, all content types
• Metadata from 675 institutions
• Limit by resource type inc. datasets (5 results)
• Pointers to collections of data
• 2000+ records for ‘crystallography’
• Results spread across several sources
[Screenshot: OAIster, http://www.oaister.org/]

DAREnet
• www.darenet.nl
• Worldwide access to Dutch academic research results
• Simple search: “crystallography” (40 results)
• General advanced search (author, year)
[Screenshots: DAREnet]

ChemRefer
• http://www.chemrefer.com
• Access to full text chemical, pharmaceutical literature
• Index
• Simple search interface
[Screenshots: ChemRefer; ChemRefer display of results]

Chemistry Central
• No search feature (through BioMed Central)

Crystallography Open Database (COD)
• www.crystallography.net
• Promotes open data
• Allows submission
• ‘REF’ format also used
• 40K entries
[Screenshot: COD]

Reciprocal Net
• A distributed crystallography network for researchers, students and the general public
• Search engine http://www.reciprocalnet.org/recipnet/search.jsp
• Crystallography-specific search interface
[Screenshots: Reciprocal Net Search Interface; Dataset result in Reciprocal Net]

Joining up the landscape
• Technical infrastructure differences can be overcome
  • Agreement on common APIs, metadata sets
  • Hide API differences from user
• Survey in one application area – how similar are other disciplines?

Issues with cross-search
• Audiences
  • Who are the user groups?
  • What are their information needs?
• Selection
  • Identifying subsets of interest
• Human Interface design
  • Search options
  • Presentation of heterogeneous information
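
Illustration: a common API and metadata set
As a rough illustration of the "common APIs, metadata sets" point, the sketch below shows what a selective OAI-PMH harvest request and a qualified Dublin Core subject might look like to a service provider. The verb and argument names (ListRecords, metadataPrefix, from, until, set) come from OAI-PMH 2.0, and the two dc:subject elements are the examples from the slides. The repository address, the function names and the ebankterms namespace URI are placeholders assumed for illustration, not values taken from the eBank UK project.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def list_records_url(base_url, metadata_prefix="oai_dc",
                     from_date=None, until_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords URL; the optional arguments select 'some' records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date      # date-stamp lower bound
    if until_date:
        params["until"] = until_date    # date-stamp upper bound
    if set_spec:
        params["set"] = set_spec        # repository-defined set
    return base_url + "?" + urlencode(params)


def subjects(record_xml):
    """Return (scheme, value) pairs for every dc:subject in one record."""
    root = ET.fromstring(record_xml)
    pairs = []
    for element in root.iter("{%s}subject" % DC_NS):
        scheme = element.get("{%s}type" % XSI_NS)  # None when unqualified
        pairs.append((scheme, (element.text or "").strip()))
    return pairs


# Sample record assembled from the two dc:subject examples on the slides.
# The ebankterms namespace URI is a hypothetical placeholder.
SAMPLE_RECORD = """<oai_dc:dc
    xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:ebankterms="http://example.org/ebankterms/">
  <dc:subject xsi:type="ebankterms:CompoundClass">Organic</dc:subject>
  <dc:subject xml:lang="en">seafood</dc:subject>
</oai_dc:dc>"""

if __name__ == "__main__":
    # Hypothetical repository address; a real harvester would fetch this URL.
    print(list_records_url("http://repository.example.org/oai",
                           from_date="2006-01-01"))
    for scheme, value in subjects(SAMPLE_RECORD):
        print(scheme, value)
```

A harvester that does not recognise the ebankterms scheme can ignore the xsi:type attribute and still index "Organic" as a plain subject, which is the "dumbing-down" behaviour mentioned in the metadata slides.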