Complimentary eBook

ARTIFICIAL INTELLIGENCE FOR YOUR DAILY BUSINESS

Understanding the Spectrum of AI-technologies for Review, Investigation and Contract Analytics in eDiscovery

“ARTIFICIAL INTELLIGENCE IS NO MATCH FOR NATURAL STUPIDITY” (Anonymous)

Techniques from the world of Artificial Intelligence (AI) are rapidly finding their way into today’s business practices, where they are used to make an organization’s internal processes faster and more efficient. The main reason for this success is that, after several decades of research, AI techniques not only allow us to process enormous amounts of data 24/7 at staggering speed, but also perform on par with, and often better and more consistently than, humans. This results in revolutionary productivity gains.

Although it is clear that Artificial Intelligence offers high value for review, investigation, contract analytics, eDiscovery and other legal fact-finding missions, organizations still struggle to understand the different techniques involved. In this eBook we explain the different techniques and illustrate them with practical business-related examples.

AI - WHAT’S IN IT FOR ME?

Machine Learning, Natural Language Processing (NLP) and related techniques such as data mining, text mining, big data analysis, predictive coding, Technology Assisted Review (TAR), concept search, topic modeling, clustering, audio search and machine translation are all Artificial Intelligence techniques. They can be used to identify specific document categories and to search for relevant information in documents. These techniques are being used to enhance the speed and efficiency of eDiscovery practices and can also be used to accelerate other legal processes.
Machine Learning

Machine Learning is the process by which software recognizes patterns and relationships within large datasets. A classification system first learns from “training data”. New pieces of data are then classified based on the (latent) patterns learned from the training data. After sufficient training, the behavior of new data can be predicted, and it is even possible to distill information from previously unknown patterns and semantic relationships. ZyLAB’s Machine Learning uses the most advanced machine learning algorithms in combination with advanced statistical and semantic methods to represent the content of a document.

Natural Language Processing

Natural Language Processing refers to the ability of a computer program to understand human language, whether written or spoken. NLP is also based on Machine Learning. Its text-processing techniques do not treat text as a mere sequence of symbols; they also take the hierarchical structure of language into account: words form a phrase, phrases make a sentence and sentences convey a message.

Text Mining

Text Mining, also known as Text Analysis, refers to the use of varied techniques to automatically enrich large volumes of data and then search them for hidden patterns and relationships. Once identified, this data can be filtered, sorted and visualized, and the topics and categories discovered can be prioritized.
Text mining identifies and highlights information from patterns and semantic relationships which were previously unknown.

Technology Assisted Review

Technology Assisted Review (TAR), also known as Computer Assisted Review (CAR) or Predictive Coding, uses a series of algorithms to search and sort documents relevant for data investigation or eDiscovery. TAR also utilizes Machine Learning.

ZyLAB uses a variety of methods for automatic document classification to support Technology Assisted Review (TAR). These patented methods range from straightforward search-based methods, regular expressions and gazetteers (dictionaries) to advanced methods using NLP and Machine Learning.

[Figure: techniques arranged along a RECALL scale from 100% (top) to 0% (bottom). From top to bottom: ZyLAB Machine Learning TAR; Machine Learning for Automatic Document Classification; OCR on Bitmaps, Visual Classification, Text-Mining, Audio Search & Machine Translation; Search on Extracted Metadata (document properties, file properties, forensics); Fuzzy, Wildcard, Quorum, Proximity, Relevance Ranking; Traditional Boolean Search; ZyLAB Rules-based TAR.]

Topic Modeling & Cluster Analysis

Topic modeling and cluster analysis are two other approaches to text mining. A topic model is used to statistically explore abstract concepts (topics) that occur within a set of documents. Cluster analysis uses perceived relationships between various groups of objects to create new sub-groups (clusters). The resulting groups of documents are ideal for use with Machine Learning.
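The train-then-classify workflow that the Machine Learning and Technology Assisted Review passages above describe can be made concrete with a small sketch. It is an illustration only, not ZyLAB's patented method: it assumes a plain multinomial Naive Bayes classifier with add-one smoothing, and the class name `NaiveBayesClassifier`, the `tokenize` helper, the two labels and the six sample documents are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a document and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing.

    Hypothetical illustration of learning from training data; it is
    not the algorithm used by any particular product.
    """

    def __init__(self):
        self.doc_counts = Counter()              # documents seen per label
        self.word_counts = defaultdict(Counter)  # word frequencies per label
        self.vocabulary = set()                  # every word seen in training

    def train(self, labelled_docs):
        """Learn word statistics from (text, label) pairs."""
        for text, label in labelled_docs:
            self.doc_counts[label] += 1
            for word in tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def classify(self, text):
        """Return the label with the highest log-probability for text."""
        total_docs = sum(self.doc_counts.values())
        words = tokenize(text)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # prior for this label
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + len(self.vocabulary)
            for word in words:
                count = self.word_counts[label][word]
                score += math.log((count + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: a reviewer has tagged six sample documents.
TRAINING_DATA = [
    ("termination clause of the supply agreement", "relevant"),
    ("indemnity and warranty terms in the contract", "relevant"),
    ("notice period under the license agreement", "relevant"),
    ("lunch menu for the office party", "not relevant"),
    ("parking garage closed on friday", "not relevant"),
    ("holiday photos from the team outing", "not relevant"),
]

classifier = NaiveBayesClassifier()
classifier.train(TRAINING_DATA)
print(classifier.classify("warranty terms of the new agreement"))
```

The sketch also shows the limitation discussed in this eBook: the classifier only knows the vocabulary of its training documents, so a document that differs too much from them is scored almost entirely on smoothing and the result is hard to explain to a reviewer.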
COMBINING DIFFERENT APPROACHES

The advantage of full-text search and text-mining techniques is that they are transparent, and that every contract lawyer knows how to use full-text search and how to combine different search techniques. An incorrectly classified document can be fixed by the lawyer simply changing the query. Queries, once written, can be collected in libraries of full-text queries, which can be shared and re-used. The queries can also be easily translated into other languages.

This is not always the case with Machine Learning, which is more of a black box that either works or does not and, in the latter case, is hard to fix. Furthermore, Machine Learning is not transparent enough for users to directly understand why a document is classified into a specific category. Because Machine Learning uses specific document sets as “training data”, the learned patterns are not always relevant for documents that differ too much from that training data.

As each technique clearly has its own advantages and disadvantages, it is best to allow the user to combine the different methods to achieve the highest possible recall and precision. This is exactly what ZyLAB does: it starts with simple, straightforward and transparent techniques and expands into more advanced methods when needed.

PRACTICAL USE CASE - AI IN LITIGATION & ARBITRATION (EDISCOVERY)

ZyLAB eDiscovery is a complete end-to-end solution for all your discovery and regulatory needs. AI techniques are used for:
• Automatic identification of relevant documents for litigation and arbitration (eDiscovery) using sample documents;
• Automatic clustering and classification of documents into relevant groups and sub-groups;
• Searching the content of images and videos without the need to add textual descriptions;
• Automated machine translation technology to quickly translate all information up front: this can then be tagged and reviewed in ZyLAB’s highly intuitive review platform.
This way relevant data is quickly uncovered and critical information can be routed for specialized human translation if needed.

PRACTICAL USE CASE - MERGERS & ACQUISITIONS (M&A) AND LARGE CORPORATE TRANSACTIONS: AI FOR CONTRACT DISCOVERY, REVIEW AND ANALYSIS

Many organizations keep track of their agreements and other relevant documents in a contract management system. In addition to monitoring deadlines, notice periods, warranties and guarantees, these systems are also used to generate the documents that fill a data room.

ZyLAB’s eDiscovery technology helps to identify contracts in live data locations such as email boxes, SharePoint or file shares. During processing, all documents are analyzed for additional metadata, specific content, email threads, duplicates, privileged information and much more. The outcome of this process can be used to generate the documents that fill a data room.

Get better insight into your data without having to search and review the actual data itself. Text analysis helps you find entities such as organizations, persons and more. Code words and other patterns such as sentiments, requests and travel activities can be extracted and can guide you straight to the relevant information.

PRACTICAL USE CASE - AI FOR LEGAL FACT FINDING, FRAUD AND INTERNAL INVESTIGATIONS

Legal fact finding is key in all data investigations, whether conducted in relation to a crime, an internal fraud case or a request for disclosure of government documents. ZyLAB’s own indexing engine can index terabytes of data per day and supports access to over 750 different file formats.
ZyLAB has been a leader in legal and investigative full-text search since 1983, offering not only industry-standard search functionality, but also unique operations such as our fast and world-famous fuzzy, quorum, wildcard, proximity, phrase and regular expression searches. In addition, ZyLAB allows users to search numeric ranges, dates and file names, and to use text delimiters to define key fields and text ranges on the fly. These extensive search capabilities, combined with our fast multi-threaded and distributed indexes, help in finding relevant information faster than any other tool on the market. Hits from your search are highlighted in every document, even if the document was originally image based.

PRACTICAL USE CASE - AI FOR REDACTION FOR DATA PROTECTION

Identification of any personal data which must be deleted, redacted or anonymized.

[Figure: The Automated Redaction Process, shown in three numbered steps.]

Unique pseudonyms

Identified names can also be replaced by a unique pseudonym. This way the Personally Identifiable Information (PII) is redacted and protected, but the relationships between the persons or companies are preserved by the pseudonyms. Reviewers can check or adjust the automatic redactions by using sampling or manual review.

PRACTICAL USE CASE - AI FOR FOIA AND PUBLIC RECORDS DISCLOSURES

As the number of information requests has increased exponentially over the past years, organizations worldwide can no longer process all information requests in time. When handling public records requests, there are many possible levels of automation which can optimize the process, making it possible to use resources more effectively and to deal with ever-increasing data volumes.
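The pseudonym replacement described in the redaction use case above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not ZyLAB's implementation: the function name `pseudonymize` and the `PERSON_n` label format are hypothetical, and the list of names is assumed to come from an earlier entity-identification step rather than being detected here.

```python
import re


def pseudonymize(text, names):
    """Replace each known name with a stable pseudonym such as PERSON_1.

    `names` is assumed to be the output of a prior name-identification
    step (hypothetical here). The same name always maps to the same
    pseudonym, so relationships between parties remain readable.
    """
    if not names:
        return text, {}

    pseudonyms = {}
    # Try longer names first so a full name wins over a shorter overlap.
    ordered = sorted(names, key=len, reverse=True)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(name) + r"\b" for name in ordered)
    )

    def replace(match):
        name = match.group(0)
        if name not in pseudonyms:
            # Number pseudonyms in order of first appearance in the text.
            pseudonyms[name] = f"PERSON_{len(pseudonyms) + 1}"
        return pseudonyms[name]

    return pattern.sub(replace, text), pseudonyms


# Hypothetical sample text and name list.
sample = "Alice Smith emailed Bob Jones. Bob Jones replied to Alice Smith."
redacted, mapping = pseudonymize(sample, ["Alice Smith", "Bob Jones"])
print(redacted)
```

Returning the mapping alongside the redacted text is a design choice for the sketch: it lets a reviewer sample or reverse individual replacements, in line with the manual review step mentioned above, but the mapping itself is sensitive and would need to be stored securely.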
ZyLAB implements automation for collection, processing, deduplication, data enrichment, translation, categorization, data visualization, disclosure cost reporting, keyword hit highlighting, search and tagging, audio and video search, Vaughn Index creation and bulk redaction.

ZyLAB is positioned as a “leader” in Gartner’s “2015 Magic Quadrant for eDiscovery Software”, ranked #1 for complete EDRM eDiscovery in the analysts’ “Critical Capabilities for E-Discovery Software 2015” report and has received numerous other industry accolades over the last 3 decades. For over 30 years, ZyLAB has worked with professionals in the litigation, auditing, security and intelligence communities to develop the most advanced solutions for investigating and managing large sets of information. Our solution is used by Fortune 1000 companies, government agencies, courts and law firms.