Adaptive Optimization of the Number of Clusters in Fuzzy Clustering

Association Rule Generation using Attribute Information Gain and

... Existing classification and rule learning algorithms in machine learning [16] mainly use heuristic/greedy search to find a subset of regularities (e.g., a decision tree or a set of rules) in data for classification [4][5]. In the past few years, extensive research was done in the database community o ...

Model-based Clustering With Probabilistic Constraints

Data analysis: an introduction

... • Selection may involve choosing a subset of attributes – Dimensionality reduction is often used to reduce the number of dimensions to two or three – Alternatively, pairs of attributes can be considered ...

An Error Detecting and Tagging Framework for Reducing Data Entry

... The Usher system was developed to detect errors in form entry fields using a Bayesian network and a graphical model with explicit error modeling [17]. The system includes a probabilistic error model based on a Bayesian network for estimating contextualized error likelihood for each field on a form. T ...

IBM Research Report A Condensation Approach to Privacy

... data problem such as classification, clustering, or association rule mining, a new distribution based data mining algorithm needs to be developed. For example, the work in [1] develops a new distribution based data mining algorithm for the classification problem, whereas the techniques in [9], and [1 ...

Data Mining

... Tan, Steinbach & Kumar (2005), "Introduction to data mining". Addison Wesley. Theodoridis & Koutroumbas (2006), "Pattern recognition, 3rd ed". Academic Press. Therrien (1989), "Decision, estimation and classification". Wiley & Sons. ...

The Evolution of Business Intelligence: From Historical Data Mining

Using spatial data mining to discover the hidden rules in the crime

... and therefore it is not possible to use only the methods of classical data mining. It is about using both data mining and spatial data mining methods. There are currently several data mining methodologies that can be used in many application fields. As an example we can mention the CRISP-DM meth ...

ppt - IDA.LiU.se

Data Mining: Concepts and Techniques

Full Text - International Journal of Computer Science and Network

... algorithm and hybrid AMPSO algorithm are applied to different benchmark datasets, and the AMPSO hybrid algorithm is found to always give a better result than the standard PSO. It was also able to improve the results of the k-Nearest Neighbor algorithm [7]. ...

A Study on Parametric Performance Evaluation of Machine Learning and Statistical Classifiers (ijitcs-v5-n6-8)

... when the events are independent, and Bayes is used for the Bayes rule. This technique assumes that the attributes of a class are independent in real life. The performance of Naive Bayes is better when the data set has actual values. Kernel density estimators can be used to measure the probability in ...

Visualization What is visualization - UF CISE

7class - College of Computing & Informatics

c - Data Science Lab

... s times. Scientific Data Analysis - Jarek Szlichta ...

Domain Adaptation for Machine Translation by Mining Unseen Words Jagadeesh Jagarlamudi Abstract

... use context and orthographic features. In the second stage, using the dictionary probabilities of seen words, we identify pairs of words whose feature vectors are used to learn the CCA projection directions. In the final stage, we project all the words into the sub-space identified by CCA and mine t ...

Where the Rubber Meets the Sky

... You will recognize these people when you meet them – they are the ones with the jobs that take weeks or months to run their Python scripts. Their delay from question to answer is days or weeks. They are the ones who are doing batch processing on their data. They envy people who have interactive acce ...

State 2016 - West Virginia GIS Technical Center

High-performance Data Mining System

Exploratory Medical Knowledge Discovery : Experiences and Issues

QDrill: Query-Based Distributed Consumable

... Models require the full dataset available beforehand to do the training. Updatable Models are incremental models that can be trained using one instance (record) at a time. QDrill’s Analytics Adaptor uses two training approaches, one for each model type.  For Non-Updateable Models, Drill fetches the ...

Ontological Assistance for Knowledge Discovery in Databases

... together with variations of their representations in XML (allowing information interchange with PMML DM models). It means that a concept described by an OWL class can have one or more related XML schemas that define its concrete representation in XML. In the DMO, for simplicity reasons, there are tw ...

Information Visualisation and Machine Learning

... representations, and information visualisation practitioners generally resort to processing or filtering the original data by hand. Generally speaking, scalability of visualisation techniques has been a long-standing issue in the field. Regarding the Visually enhanced Mining category, Section 3 show ...

Nonlinear dimensionality reduction

High-dimensional data, meaning data that requires more than two or three dimensions to represent, can be difficult to interpret. One approach to simplification is to assume that the data of interest lie on an embedded non-linear manifold within the higher-dimensional space. If the manifold is of low enough dimension, the data can be visualised in the low-dimensional space.

Below is a summary of some of the important algorithms from the history of manifold learning and nonlinear dimensionality reduction (NLDR). Many of these non-linear dimensionality reduction methods are related to the linear methods listed below. Non-linear methods can be broadly classified into two groups: those that provide a mapping (either from the high-dimensional space to the low-dimensional embedding or vice versa), and those that just give a visualisation. In the context of machine learning, mapping methods may be viewed as a preliminary feature extraction step, after which pattern recognition algorithms are applied. Typically those that just give a visualisation are based on proximity data – that is, distance measurements.
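
As a rough illustration of a method driven purely by proximity data, the sketch below implements a minimal version of Isomap, one well-known NLDR algorithm that the summary above does not name. It builds a nearest-neighbour graph, approximates distances along the manifold by shortest paths in that graph, and then applies classical multidimensional scaling. The function name `isomap` and the parameters `n_neighbors` and `n_components` are illustrative assumptions, and the code favours readability over efficiency.

```python
import numpy as np


def isomap(X, n_neighbors=6, n_components=2):
    """Minimal Isomap sketch: kNN graph -> graph geodesics -> classical MDS.

    X is an (n_samples, n_features) array. Names and defaults here are
    illustrative assumptions, not taken from any particular library.
    """
    n = X.shape[0]

    # Pairwise Euclidean distances in the high-dimensional space.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=-1))

    # Keep only edges to each point's nearest neighbours, then symmetrise.
    G = np.full((n, n), np.inf)
    idx = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, idx.ravel()] = D[rows, idx.ravel()]
    G = np.minimum(G, G.T)
    np.fill_diagonal(G, 0.0)

    # Floyd-Warshall shortest paths approximate distances along the manifold.
    for k in range(n):
        G = np.minimum(G, G[:, k:k + 1] + G[k:k + 1, :])
    if np.isinf(G).any():
        raise ValueError("neighbourhood graph is disconnected; increase n_neighbors")

    # Classical MDS: double-centre the squared distances and take the
    # eigenvectors with the largest eigenvalues as embedding coordinates.
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:n_components]
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))
```

For example, points sampled along a helix in three dimensions lie on a one-dimensional manifold, so calling `isomap(X, n_neighbors=4, n_components=1)` on such data should return a single coordinate that varies almost linearly with position along the curve (up to an arbitrary sign). Because this sketch only embeds the points it is given, it behaves like a visualisation method; a mapping method would also have to handle points that were not in the original sample.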
