Evaluation in IR in the Context of the Web
Evaluating IR (Web) Systems
• Study of Information Seeking & IR
• Pragmatics of IR experimentation
• The dynamic Web
• Cataloging & understanding Web docs
• Web site characteristics
Study of Info seeking & retrieval
- Well known authors (useful for research papers)
• Real life studies (not TREC)
- User context of questions
- Questions (structure & classification)
- Searcher (cognitive traits & decision making)
- Information Items
• Different searches with the same question
• Relevant items
• “models, measures, methods, procedures and
statistical analyses” p 175
• Beyond common sense and anecdotes
Study 2
• Is there ever enough user research?
• A good set of elements to include in an IR
system evaluation
• How do you test for real life situations?
- Questions the users actually have
- Expertise in subject (or not)
- Intent
- User’s computers, desks & materials
• What’s a search strategy?
- Tactics, habits, previous knowledge
• How do you collect search data?
Study 3
• How do you ask questions?
- General knowledge test
- Specific search terms
• Learning Style Inventory
- NOT the best way to understand users
- Better than nothing
- Choose your questions like your users
• Let users choose their questions?
• Let users work together on searches
• Effectiveness Measures
- Recall, precision, relevance
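These effectiveness measures have standard set-based definitions. A minimal Python sketch, assuming retrieved and relevant results are represented as sets of document IDs (the numbers below are made up for illustration):

    # Set-based effectiveness measures: precision and recall.
    def precision_recall(retrieved, relevant):
        hits = len(retrieved & relevant)
        precision = hits / len(retrieved) if retrieved else 0.0
        recall = hits / len(relevant) if relevant else 0.0
        return precision, recall

    # Example: 3 of 5 retrieved documents are relevant, and 3 of 6 relevant found.
    p, r = precision_recall({1, 2, 3, 4, 5}, {2, 3, 5, 8, 9, 10})
    # p == 0.6, r == 0.5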
Study 4
• Measuring efficiency
- Time on tasks
- Task completion
• Correct answer
• Any answer?
- Worthwhile?
• Counting correct answers
• Statistics
- Clicks, commands, pages, results
- Not just computer time, but the overall process
- Start with the basics, then get advanced
- Regression analysis (dependencies for large studies)
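To illustrate "start with the basics, then get advanced": a small Python sketch that summarizes hypothetical per-session logs and then fits a simple regression. The session data and field meanings are invented for the example, and statistics.linear_regression requires Python 3.10 or later:

    import statistics

    # Hypothetical per-session measurements: (clicks, seconds on task).
    sessions = [(4, 95), (9, 180), (2, 60), (7, 150), (5, 110)]
    clicks = [c for c, _ in sessions]
    seconds = [s for _, s in sessions]

    # Basics first: descriptive statistics.
    print("mean time:", statistics.mean(seconds))
    print("stdev time:", statistics.stdev(seconds))

    # Then more advanced: does time on task depend on the number of clicks?
    slope, intercept = statistics.linear_regression(clicks, seconds)
    print(f"predicted seconds ~ {intercept:.1f} + {slope:.1f} * clicks")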
Let’s design an experiment
• User Selection
- Searcher (cognitive traits & decision making)
- User context of questions
• Environment
• Questions (structure & classification)
• Information Items
- Successful answers
- Successful/Worthwhile sessions
• Measurement
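One way to make this checklist concrete is to write the plan down as a structured record before collecting any data. A hypothetical sketch only; every field name and value below is invented:

    from dataclasses import dataclass, field

    @dataclass
    class ExperimentPlan:
        users: str            # who searches (cognitive traits, context)
        environment: str      # lab, home, searcher's own computer?
        questions: list       # question set with structure/classification
        answer_criteria: str  # what counts as a successful answer/session
        measures: list = field(default_factory=list)

    plan = ExperimentPlan(
        users="undergraduates with no subject expertise",
        environment="campus lab, searchers' own accounts",
        questions=["fact-finding", "open-ended"],
        answer_criteria="expert-judged correct answer within 10 minutes",
        measures=["precision", "recall", "time on task", "clicks"],
    )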
Pragmatics of IR experimentation
• The entire IR evaluation must be planned
• Controls are essential
• Working with what you can get
- Expert defined questions & answers
- Specific systems
• Fast, cheap, informal tests
- Not always, but could be pre-tests
- Quick results for broad findings
Pragmatic Decision 1
• Testing at all?
- Purpose of test
- Pull data from previous tests
• Repeat old test
- Old test with new system
- Old test with new database
• Same test, many users
- Same system
- Same questions (data)
Pragmatic Decision 2
• What kind of test?
• Everything at once?
- System (help, no help?)
- Users (types of)
- Questions (open-ended?)
• Facts
- Answers with numbers
- Words the user knows
• General knowledge
- Found more easily
- Ambiguity goes both ways
Pragmatic Decision 3
• Understanding the Data
• What are your variables? (p 207)
• Working with initial goals of study
• Study size determines measurement methods
- Lots of users
- Many questions
- All system features, competing system features
• What is acceptable/passable performance?
- Time, correct answers, clicks?
- Which are controlled?
Pragmatic Decision 4
• What database?
- The Web (no control)
- Smaller dataset (useful to user?)
• Very similar questions, small dataset
- Web site search vs. whole Web search
- Prior knowledge of subject
- Comprehensive survey of possible results
beforehand
• Differences other than content?
Pragmatic Decision 5
• Where do queries/questions come from?
- Content itself
- User pre-interview (pre-tests)
- Other studies
• What are the search terms (used or given)?
- Single terms
- Advanced searching
- Results quantity
Pragmatic Decisions 6, 7, 8
• Analyzing queries
- Scoring system
- Logging use
• What’s a winning query (treatment of units)
- User success, expert answer
- Time, performance
- Different queries with the same answer?
• Collect the data
- Logging and asking users
- Consistency (software, questionnaires, scripts)
Pragmatic Decisions 9 & 10
• Analyzing Data
- Dependent on the dataset
- Compare to other studies
- Basic statistics first
• Presenting Results
- Work from plan
- Purpose
- Measurement
- Models
- Users
- Matching other studies
Keeping Up with the Changing Web
• Building Indices is difficult enough in theory
• What about a continuously changing huge volume of
information?
• Is old information good?
• What does up-to-date mean anymore?
• Is Knowledge a depreciating commodity?
- Correctness + Value over time
• Different information changes at different rates
- Really it’s new information
• How do you update an index with constantly
changing information?
Changing Web Properties
• Known distributions for information change
• Sites and pages may have easily identifiable
patterns of update
- 4% change on every observation
- Some don’t ever change (links too)
• If you check and a page hasn’t changed, what
is the probability it will ever change?
• Rate of change is related to rate of attention
- Machines vs. Users
- Measures can be compared along with information
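The question above about a page that never seems to change is often approached by treating changes as a memoryless (Poisson-style) process and estimating the change rate from repeated visits. A rough Python sketch; the observation history and the Poisson assumption are illustrative, not from the reading:

    import math

    # Hypothetical crawl history: 1 = page changed since last visit, 0 = unchanged.
    observations = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
    days_between_visits = 7

    # Per-visit change probability, converted to a daily rate.
    p_change = sum(observations) / len(observations)
    daily_rate = -math.log(1 - p_change) / days_between_visits

    # Probability the page changes at least once in the next 30 days.
    p_30 = 1 - math.exp(-daily_rate * 30)
    print(f"per-visit change prob: {p_change:.2f}, 30-day change prob: {p_30:.2f}")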
Dynamic Maintenance of Indexes with Landmarks
• Web Crawlers do the work in gathering pages
• Incremental crawling means incrementally updated indices
- Rebuild the whole index more frequently
- Devise a scheme for updates (and deletions)
- Use supplementary indices (e.g. date)
• New documents
• Changed documents
• 404 documents
Landmarks for Indexing
• Difference-based method
• Documents that don’t change are landmarks
- Relative addressing
- Clarke: block-based
- Glimpse: chunking
• Only update pointers to pages
• Tags and document properties are
landmarked
• Broader pointers mean fewer updates
• Faster indexing – Faster access?
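A toy illustration of the landmark idea: postings store positions relative to the nearest landmark, so an edit that shifts later text only touches the landmark offset table, not every posting. This is a sketch of the general difference-based scheme, not the specific Clarke (block-based) or Glimpse (chunking) implementation:

    # Toy landmark index: term -> list of (landmark_id, offset relative to landmark).
    postings = {"retrieval": [("L1", 3), ("L2", 10)]}

    # Landmark table: landmark_id -> absolute position in the document.
    landmarks = {"L1": 0, "L2": 500}

    def absolute_positions(term):
        return [landmarks[lm] + off for lm, off in postings.get(term, [])]

    print(absolute_positions("retrieval"))   # [3, 510]

    # If 40 characters are inserted before the second landmark, only the
    # landmark table changes; the postings themselves stay untouched.
    landmarks["L2"] += 40
    print(absolute_positions("retrieval"))   # [3, 550]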
Yahoo! Cataloging the Web
• How do information professionals build an “index” of
the Web?
• Cataloging applies to the Web
• Indexing with synonyms
• Browsing indexes vs searching them
• Comprehensive index not the goal
- Quality
- Information Density
• Yahoo’s own ontology – points to site for full info
• Subject Trees with aliases (@) to other locations
• “More like this” comparisons as checksums
Yahoo uses tools for indexing
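A tiny sketch of what a subject tree with "@" aliases could look like as a data structure. The categories are invented and greatly simplified compared to Yahoo's actual ontology:

    # Hypothetical subject tree; a key ending in "@" is an alias to another path.
    tree = {
        "Science": {
            "Computer Science": {
                "Information Retrieval": {},
            },
        },
        "Computers & Internet": {
            "Information Retrieval@": "Science/Computer Science/Information Retrieval",
        },
    }

    def resolve(path):
        # Walk the tree, chasing an "@" alias if one is found along the way.
        node = tree
        for part in path.split("/"):
            if part + "@" in node:
                return resolve(node[part + "@"])
            node = node[part]
        return node

    print(resolve("Computers & Internet/Information Retrieval"))  # -> {}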
Investigation of Documents from the WWW
• What properties do Web documents have?
• What structure and formats do Web
documents use?
• What properties do Web documents have?
- Size – 4K avg.
- Tags – ratio and popular tags
- MIME types (file extensions)
- URL properties and formats
- Links – internal and external
- Graphics
- Readability
WWW Documents Investigation
• How do you collect data like this?
- Web Crawler
• URL identifier, link follower
- Index-like processing
• Markup parser, keyword identifier
• Domain name translation (and caching)
• How do these facts help with indexing?
• Have general characteristics changed?
• (This would be a great project to update.)
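A minimal sketch of gathering such statistics for a single page with only the Python standard library; a real crawler would add a URL frontier, politeness rules, and DNS caching. The URL is a placeholder:

    from collections import Counter
    from html.parser import HTMLParser
    from urllib.request import urlopen

    class TagCounter(HTMLParser):
        def __init__(self):
            super().__init__()
            self.tags = Counter()
            self.links = []
        def handle_starttag(self, tag, attrs):
            self.tags[tag] += 1
            if tag == "a":
                self.links += [v for k, v in attrs if k == "href"]

    # Placeholder URL; in practice this comes from the crawler's queue.
    with urlopen("http://example.com/") as resp:
        body = resp.read()
        mime = resp.headers.get_content_type()

    parser = TagCounter()
    parser.feed(body.decode("utf-8", errors="replace"))

    print("size (bytes):", len(body))
    print("MIME type:", mime)
    print("most common tags:", parser.tags.most_common(5))
    print("links found:", len(parser.links))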
Properties of Highly-Rated Web Sites
• What about whole Web sites?
• What is a Web site?
- Sub-sites?
- Specific contextual, subject-based parts of a Web
site?
- Links from other Web pages: on the site and off
- Web site navigation effects
• Will experts (like Yahoo catalogers) like a
site?
Properties
• Links & formatting
• Graphics – one, but not too many
• Text formatting – 9 pt. with normal style
• Page (layout) formatting – min. colors
• Page performance (size and access)
• Site architecture (pages, nav elements)
- More links within and external
- Interactive (search boxes, menus)
• Consistency within a site is key
• How would a user or index builder make use
of these?
Extra Discussion
• Little Words, Big Difference
- The difference that makes a difference
- Singular and plural noun identification can change indices and retrieval results (see the sketch after this list)
- Language use differences
• Decay and Failures
- Dead links
- Types of errors
- Huge number of dead links (PageRank effective)
• 28% in 1995-1999 Computer & CACM
• 41% in 2002 articles
• Better than the average Web page?
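To make the singular/plural point concrete, here is a deliberately naive Python sketch of how collapsing plurals changes what an index stores and what a query matches; a real system would use a proper stemmer rather than this crude suffix rule:

    def naive_singular(term):
        # Crude plural stripping, for illustration only.
        if term.endswith("ies"):
            return term[:-3] + "y"
        if term.endswith("s") and not term.endswith("ss"):
            return term[:-1]
        return term

    docs = {1: "query logs and user queries", 2: "a single query log"}

    # Two tiny indexes: raw terms vs. singular-normalized terms.
    raw, norm = {}, {}
    for doc_id, text in docs.items():
        for word in text.split():
            raw.setdefault(word, set()).add(doc_id)
            norm.setdefault(naive_singular(word), set()).add(doc_id)

    print(raw.get("queries"))                   # {1}     -- misses doc 2
    print(norm.get(naive_singular("queries")))  # {1, 2}  -- both documents match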
Break!
Topic Discussions Set
• Leading WIRED Topic Discussions
- About 20 minutes reviewing issues from the week’s
readings
• Key ideas from the readings
• Questions you have about the readings
• Concepts from readings to expand on
- PowerPoint slides
- Handouts
- Extra readings (at least a few days before class) –
send to wired listserv
Web IR Evaluation
- 5 page written evaluation of a Web IR System
- technology overview (how it works)
• Not an eval of a standard search engine
• Only main determinable difference is content
- a brief overview of the development of this type of
system (why it works better)
- intended uses for the system (who, when, why)
- (your) examples or case studies of the system in
use and its overall effectiveness
Projects and/or Papers Overview
• How can (Web) IR be better?
- Better IR models
- Better User Interfaces
• More to find vs. easier to find
• Web documents sampling
• Web cataloging work
- Metadata & IR
- Who watches the catalogers?
• Scriptable applications
- Using existing IR systems in new ways
- RSS & IR
Project Ideas
• Searchable Personal Digital Library
• Browser hacks for searching
• Mozilla keeps all the pages you surf so you
can search through them later
- Mozilla hack
- Local search engines
• Keeping track of searches
• Monitoring searches
Paper Ideas
• New datasets for IR
• Search on the Desktop – issues, previous
research and ideas
• Collaborative searching – advantages and
potential, but what about privacy?
• Collaborative Filtering literature review
• Open source and IR systems history &
discussion