Analytical Customer Relationship Management in Retailing

Solving Complex Machine Learning Problems with Ensemble Methods

... methods before learning component classifiers or embedding the cost-sensitive framework in the ensemble learning process, see their review in [5]. Although several specialized ensembles have been presented as adequate for class imbalance, there is still a lack of general comparison or discussion of them ...

Third-Generation Data Mining: Towards Service

doc - Dr. Richard Frost

... properties such as frequencies and time stamps. It also shows whether sequences belong to the same session. The last phase, the posteriori phase, filters based on the navigation template and network topology. This is then followed by pruning hits that, though they meet the required support threshold, are not maximal. ...

Machine Learning Based Data Pre-processing for

... communication with many people, who have provided valuable input to my studies. First of all, I would like to thank my supervisor Dr. Darryl N Davis for giving valuable feedback and advice during my work. His engagement and knowledge have inspired me a lot. I would also like to thank Dr. Chandras ...

Intelligent Miner for Data Applications Guide

Here - Advanced Computing Group home page

Supervised Local Pattern Discovery

... recruiting practices (Bay and Pazzani, 2001). Rather than trying to build a model which could be used to predict the behavior of a certain example, local patterns have been used to describe examples and build knowledge with the experts that work in student recruiting or marketing or other branches, ...

ICDM06.metaclust.caruana.pdf

... is described by a vector of features, and each dimension in the vector is a feature that will be used when calculating the similarity of points for clustering. By weighting features before distances are calculated (i.e. multiplying feature values by particular scalars), we can control the importance ...
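The feature weighting described in this excerpt can be illustrated with a minimal sketch. It assumes a Euclidean distance and a hypothetical helper named `weighted_distance`; neither the function name nor the sample points come from the cited paper.

```python
import math

def weighted_distance(a, b, weights):
    """Euclidean distance after multiplying each feature by its scalar weight."""
    return math.sqrt(sum((w * (x - y)) ** 2 for x, y, w in zip(a, b, weights)))

# Two illustrative points whose second feature has a much larger scale.
p = (1.0, 10.0)
q = (2.0, 30.0)

# Equal weights: the second feature dominates the distance.
unweighted = weighted_distance(p, q, (1.0, 1.0))

# Weight 0 on the second feature: only the first feature matters.
downweighted = weighted_distance(p, q, (1.0, 0.0))
```

Changing the weight vector changes which points count as similar, and therefore which clustering a distance-based method would produce.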

Mining frequent item sets without candidate generation using

... search space were proposed. The experimental results reported in showed that CLOSET is faster than CHARM and A-close. CLOSET was extended to CLOSET+ by Wang et al. in to find the best strategies for mining frequent closed item sets. CLOSET+ uses data structures and data traversal strategies that dep ...

A Improved Incremental and Interactive Frequent Pattern Mining

... Objectives: To develop a memory efficient, incremental and interactive distributed FPM having less communication and synchronization overhead with good load balancing capability, to analyze the dynamic transactional data in a distributed database. Methods/Analysis: This technique adopts prefix based ...

Extracting Temporal Patterns from Interval-Based Sequences

... intervals associated to events in patterns. Experiments on simulated data show that our algorithm is efficient for extracting precise patterns even in noisy contexts and that it improves the performance of a former algorithm which used a clustering method based on the EM algorithm. ...

1.5. Frequent sequence mining in data streams

Data Mining using Genetic Programming

Multi-threaded Implementation of Association Rule Mining with

... Most Associated Sequential Pattern (MASP) is a name given to the variant of the association rule mining algorithm [7]. This approach is used to find the most associated sequential patterns and also generate the datasets that contain the transactions. These transactions can be further mined to find t ...

EHRs - Medical informatics at Mayo Clinic

... • Mission: To enable the use of EHR data for secondary purposes, such as clinical research and public health. Leveraging clinical and health informatics to: • generate new knowledge • improve care ...

Boris Mirkin Clustering: A Data Recovery Approach

... invented with virtually no support to a non-specialist user for choosing among them. The trouble with this is that different similarity measures and/or clustering techniques may, and frequently do, lead to different results. Moreover, the same technique may also lead to different cluster solutions d ...

Privacy Protection Methods for Documents and Risk Evaluation for Microdata Daniel Abril Castellano

data stream mining - Department of Computer Science

K - Department of Computer Science

Spatial Data Mining

Parallel Itemset Mining in Massively Distributed Environments

... We propose ODPR (Optimal Data-Process Relationship), our solution for fast mining of frequent itemsets in MapReduce. Our method allows discovering itemsets from massive data sets, where standard solutions from the literature do not scale. Indeed, in a massively distributed environment, the arrangemen ...

Here - IEEE SSCI 2015

... Tuesday, December 8, 15:20–17:20: CICA’15 Session 2; CIVTS’15 Session 2; CICS’15 Session 2. Wednesday, December 9, 09:50–12:20: CIBD’15 Session 1; ADPRL’15 Session 1; IntECS’15 Session 1; CIASG’15 Session 1 ...

evolving biologically inspired trading algorithms - BADA

... One group of information systems that have attracted a lot of attention during the past decade are financial information systems, especially systems pertaining to financial markets and electronic trading. Delivering accurate and timely information to traders substantially increases their chances of ...

Frequent pattern analysis for decision making in big data


K-means clustering

k-means clustering is a method of vector quantization, originally from signal processing, that is popular for cluster analysis in data mining. k-means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, which serves as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells.

The problem is computationally difficult (NP-hard); however, there are efficient heuristic algorithms that are commonly employed and converge quickly to a local optimum. These are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions, since both algorithms take an iterative refinement approach. Additionally, they both use cluster centers to model the data; however, k-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to have different shapes.

The algorithm has a loose relationship to the k-nearest neighbor classifier, a popular machine learning technique for classification that is often confused with k-means because of the k in the name. One can apply the 1-nearest neighbor classifier to the cluster centers obtained by k-means to classify new data into the existing clusters. This is known as the nearest centroid classifier or Rocchio algorithm.
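The iterative refinement heuristic and the nearest-centroid step described above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: it assumes squared Euclidean distance, initial centers drawn at random from the data, and a stop when the centers no longer change. The function names (`kmeans`, `nearest_center`) and the sample points are illustrative choices.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_center(point, centers):
    """Index of the closest center (1-nearest neighbor on the centers)."""
    return min(range(len(centers)),
               key=lambda i: squared_distance(point, centers[i]))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def kmeans(points, k, max_iter=100, seed=0):
    """Heuristic k-means: alternate assignment and mean update.

    Converges to a local optimum, which may depend on the initial centers.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # assumed init: k data points at random
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        # Update step: move each center to its cluster mean
        # (an empty cluster keeps its previous center).
        new_centers = [mean(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:
            break
        centers = new_centers
    return centers

# Two well-separated groups of illustrative 2-D points.
data = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5),
        (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centers = kmeans(data, k=2)

# Nearest centroid classification of a new observation.
label_new = nearest_center((0.5, 1.0), centers)
```

The assignment step implicitly partitions the space into the Voronoi cells of the current centers, and `nearest_center` is reused unchanged to classify new data into the existing clusters.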
