
Cross-domain Text Classification using Wikipedia
... Cross-domain classification is related to transfer learning, where the knowledge acquired to accomplish a given task is used to tackle another learning task. In [28], the authors built a term covariance matrix using the auxiliary problem, to measure the co-occurrence between terms. The resulting ter ...
Life-and-Death Problem Solver in Go
... dead or unsettled. Alive means that the surrounded group does not need to be defended because it cannot be killed, i.e. it is unnecessary (indeed pointless) to play a stone in the surrounded area to secure (or to attack) the surrounded group. Unsettled is a situation where, if the owner of the surro ...
Credit Card Fraud Detection Based on Behavior Mining
... K-means clustering algorithm. Algorithm: k-means. The k-means algorithm for partitioning, where each cluster center is represented by the mean value of the objects in the cluster. ...
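The partitioning idea in this snippet can be illustrated with a minimal sketch. It is not the cited paper's implementation: the function names, the caller-supplied starting centers, and the stopping rule are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def mean(points: Sequence[Point]) -> Point:
    """Coordinate-wise mean of a non-empty group of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: Sequence[Point], centers: Sequence[Point],
           max_iter: int = 100) -> Tuple[List[Point], List[int]]:
    """Hypothetical helper: alternate assignment and mean-update steps."""
    centers = list(centers)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(mean(members) if members else old)
        if new_centers == centers:  # assumed stopping rule: centers stopped moving
            break
        centers = new_centers
    return centers, labels
```

Starting centers are passed in explicitly so the sketch stays deterministic; a real run would usually pick them at random or with a seeding heuristic.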
KNN Classification and Regression using SAS
... 1. It is simple to implement. In theory, the kNN algorithm is very simple, and the naive version is easy to implement: for every data point in the test sample, directly compute the desired distances to all stored vectors, and choose the k nearest examples among stored ...
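The naive procedure described in this snippet can be sketched in a few lines. The listed document uses SAS; the version below is in Python purely for illustration. The function name, the Euclidean distance, and the majority-vote rule are assumptions, not details taken from the source.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence, Tuple

Example = Tuple[Sequence[float], Hashable]  # (feature vector, class label)


def knn_predict(train: List[Example], query: Sequence[float], k: int = 3) -> Hashable:
    """Hypothetical helper: classify `query` by majority vote of its k nearest examples."""
    # Directly compute the distance from the query to every stored vector.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Keep the labels of the k closest examples and return the most common one.
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]
```

Sorting every stored vector costs O(n log n) per query. That is acceptable for a sketch, but it is the part a non-naive implementation would replace with an index structure.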
Predicting Workers' Compensation Insurance Fraud Using SAS Enterprise Miner 5.1 and SAS Text Miner
... each case in a cluster. The smallest clusters should be examined first. Ideally, your organization will have domain experts who have some experience in fraud cases. Because the task of examining each case in each cluster can be overwhelming, you should have the domain experts describe cases that imp ...
Inferring taxonomic hierarchies from 0
... Complex structures in nature and in society are frequently modeled and managed with hierarchies. For engineers and scientists hierarchies are a tool used for abstraction and classification. For example, a software engineer uses hierarchical abstraction to build and manage complex computer programs. ...
OutRank: A GRAPH-BASED OUTLIER DETECTION FRAMEWORK
... According to Hawkins [6], outliers can be defined as follows: Definition 1. (Outlier) An outlier is an observation that deviates so much from other observations as to arouse suspicion that it was generated by a different mechanism. Most outlier detection schemes adopt Hawkins's definition of outliers an ...
A Survey on Frequent Pattern Mining Methods
... a) Sharding: The database is divided into successive parts that are stored on different computers. This division and distribution of data is called sharding. b) Parallel Counting: This step counts the support values of all the items that appear in the database. The input is one shard of the database. In thi ...
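The two steps named in this snippet can be sketched as follows. The shards are processed one after another in a single process, so this only simulates the distributed setting. All function names are hypothetical, and the counts are absolute support counts rather than fractions.

```python
from collections import Counter
from typing import List, Sequence

Transaction = Sequence[str]


def shard(transactions: List[Transaction], n_shards: int) -> List[List[Transaction]]:
    """Split the database into successive parts (one per simulated machine)."""
    size = max(1, -(-len(transactions) // n_shards))  # ceiling division
    return [transactions[i:i + size] for i in range(0, len(transactions), size)]


def count_shard(part: List[Transaction]) -> Counter:
    """Count, within one shard, how many transactions contain each item."""
    counts: Counter = Counter()
    for transaction in part:
        counts.update(set(transaction))  # an item counts once per transaction
    return counts


def parallel_count(transactions: List[Transaction], n_shards: int) -> Counter:
    """Sum the per-shard counts; here the shards are simply processed in a loop."""
    total: Counter = Counter()
    for part in shard(transactions, n_shards):
        total += count_shard(part)
    return total
```

Because each shard is counted independently and the partial counts are only added together at the end, the per-shard step could be handed to separate workers without changing the result.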
Massimo Poesio: Text Categorization and
... representation has aimed to describe knowledge independently of the personal and social context in which it is used, with the advantage that we can automate reasoning with such knowledge using mechanisms that also are context independent. This sounds good until you try it on a large scale and find o ...
A Three-Scan Mining Algorithm for High On
... Besides, many studies [1, 3, 5, 9] have proposed ways to dynamically mine association rules. An example of dynamic mining is finding the patterns for on-shelf products. However, a product may be put on the shelf and taken off the shelf multiple times in a store. If the entire database is considered for mining ...
isda.softcomputing.net
... when this item appears in the transaction database to the partition when this item no longer exists [2]. That is, the exhibition period is the time duration when the item is available to be purchased. Hence, these works cannot be effectively applied to a temporal transaction database, such as a publ ...
A Rule-Based Classification Algorithm for Uncertain Data
... missing attribute values. However, the problem studied in this paper is different from before. Instead of assuming part of the data has missing or noisy values, we allow the whole dataset to be uncertain. Furthermore, the uncertainty is not shown as missing or erroneous values but represented as unc ...
A comparative study on principal component analysis and
... From the definition, the support of an item measures the statistical significance of an association rule. If the support of an item is 0.1%, only 0.1 percent of the transactions contain a purchase of this item. The retailer will not pay much attention to such kind of it ...
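The notion of support in this snippet can be made concrete with a small sketch. The function name and the toy transactions are assumptions. Support is computed as the fraction of transactions that contain every item of the given itemset.

```python
from typing import Iterable, List, Sequence


def support(transactions: List[Sequence[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain all items in `itemset`."""
    wanted = set(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for transaction in transactions if wanted <= set(transaction))
    return hits / len(transactions)


# Toy data, invented for illustration only.
baskets = [["milk", "bread"], ["milk"], ["bread", "eggs"], ["eggs"]]
milk_support = support(baskets, ["milk"])  # 2 of 4 baskets contain milk
```

Under this definition, a support of 0.1% corresponds to a return value of 0.001.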
IOSR Journal of Mathematics (IOSR-JM)
... The data to be compressed consist of N data vectors in k dimensions. Principal Component Analysis (PCA) searches for c k-dimensional orthogonal vectors that can best be used to represent the data, where c ≤ k. The original data set is projected onto a much smaller space, resulting in data comp ...
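The projection described in this snippet can be sketched with NumPy, assuming NumPy is available. The function name is hypothetical, and using an eigendecomposition of the covariance matrix is one common way to obtain the orthogonal vectors, not necessarily the one used in the listed article.

```python
import numpy as np


def pca_project(X, c):
    """Hypothetical helper: project N k-dimensional vectors onto c <= k principal directions."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)            # remove the per-dimension mean
    cov = np.cov(centered, rowvar=False)     # k x k covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1][:c]    # indices of the c largest eigenvalues
    components = eigvecs[:, order]           # k x c matrix with orthonormal columns
    return centered @ components, components
```

The first return value is the N x c compressed representation. Multiplying it by the transpose of the components gives an approximation of the centered data.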
w - UTK-EECS
... • Select the set of variables such that the current iteration will make progress towards the minimum of W(α) – Use first order approximation, i.e., steepest direction d of descent which has only q non-zero elements ...
A Survey of Outlier Detection Methodologies.
... 1. Type 1 - Determine the outliers with no prior knowledge of the data. This is essentially a learning approach analogous to unsupervised clustering. The approach processes the data as a static distribution, pinpoints the most remote points, and flags them as potential outliers. Type 1 assumes that e ...
Workload-Aware Anonymization Techniques for Large
... on the quality of the resulting data. While much of the previous literature has measured quality through simple one-size-fits-all measures, we argue that quality is best judged with respect to the workload for which the data will ultimately be used. This article provides a suite of anonymization alg ...