
Data Mining on Empty Result Queries
... detect such a query from the beginning in the DBMS, before any real query evaluation is executed. This will not only provide a quick answer, but it also reduces the load on a busy DBMS. Many data mining approaches deal with mining high density regions (e.g., discovering clusters), or frequent data valu ...
Visual Exploration of High-Dimensional Data: Subspace Analysis
... than the ambient dimensions. For example, the number of pixels in an image may be large. However, we typically use only a few parameters such as the geometry or the dynamics to describe the appearance. Data models inferred with such assumptions are often simple, in the number of parameters, and inte ...
distributed incremental data stream mining for wireless sensor
... Declaration “I hereby declare that this submission is my own work and that, to the best of my knowledge and belief, it contains no material previously published or written by another person (except where explicitly defined in the acknowledgements), nor material which to a substantial extent has bee ...
TOWARD ACCURATE AND EFFICIENT OUTLIER DETECTION IN
... large amounts of data every day. Data mining is the process of discovering relationships within data. The identified relationships can be used for scientific discovery, business decision making, or data profiling. Among data mining techniques, outlier detection plays an important role. Outlier detec ...
ROUGH SETS METHODS IN FEATURE REDUCTION AND
... where µ represents the total data mean and the determinant |Sb| denotes a scalar representation of the between-class scatter matrix, and similarly, the determinant |Sw| denotes a scalar representation of the within-class scatter matrix. Criteria based on minimum concept description. Based on the m ...
K - Department of Computer Science
... address the types of these algorithms, the way neighborhoods are calculated and the number of calculations involved. K-Means ...
Distributed and Stream Data Mining Algorithms for
... In several interesting application frameworks, such as wireless network analysis and fraud detection, data are naturally distributed among several entities and/or evolve continuously. In all of the above-indicated data mining tasks, dealing with either of these peculiarities provides additional chal ...
Proceedings of the ECMLPKDD 2015 Doctoral Consortium
... The impetus for this work came from EMSAT Corporation, which specializes in real-time environment monitoring. With the aggregation and visualization components of their software already present, they were interested in further preprocessing and knowledge discovery in these data streams, in particula ...
Mining Health Data for Breast Cancer Diagnosis Using Machine
... based on iterative k nearest neighbours and the distance functions. The approach is an iterative approach until finding the most suitable features values that satisfy classification accuracy. The proposed approach showed improvement of 0.005 of classification accuracy on the constructed dataset than ...
Automatic Document Topic Identification Using Hierarchical
... around the world has led to a greatly increased need for machine understanding of their topics, as well as for automatic grouping of related documents. This constitutes one of the main current challenges in text mining. We introduce in this thesis a novel approach for identifying document topics. In ...
Mining Moving Object Data for Discovery of Animal Movement Patterns
... collected. Moving object data could be related to human, objects (e.g., airplanes, vehicles and ships), animals, and/or natural forces (e.g., hurricanes and tornadoes). Although most human and man-made object movements are closely associated with social and economic behaviors of people and society, ...
Construction of Deterministic, Consistent, and Stable Explanations from Numerical Data and Prior Domain Knowledge
... two training sets A and B taken randomly from two populations A and B, respectively, are given. The attributes of the records may be numerical or nominal, and some entries may be missing and presumably cannot be obtained for various reasons. Possibly, partial prior domain knowledge is also given. We ...
Impact of Evaluation Methods on Decision Tree Accuracy Batuhan
... Receiving large amount of data has given companies, governments and private people an opportunity to use these raw data and turn them into valuable information. For instance, companies have started improving their businesses by the help of data. Business intelligence (BI) and business analytics (BA) ...
Comparative Analysis of Various Approaches Used in Frequent
... dataset, H-struct is not as efficient as FP-Tree because FP-Tree allows compression. E. Incremental Update with Apriori-based Algorithms Complete dataset is normally huge and the incremental portion is relatively small compared to the complete dataset. In many cases, it is not feasible to perform a ...
DISC: Data-Intensive Similarity Measure for Categorical Data
... factors like co-occurrence statistics that can be effectively used to define what should be considered more similar and vice-versa. This observation has motivated researchers to come up with data-driven similarity measures for categorical attributes. Such measures take into account the frequency dis ...
Data Mining with Structure Adapting Neural Networks
... the shape of the network. The new features result in reducing the possibility of twisted maps and achieves convergence with localised self organisation. The localised processing and the optimised shape helps in generating representative maps with smaller number of nodes. The GSOM is also flexible in ...
Rule extraction using Recursive-Rule extraction algorithm with
... diabetes, accounts for about 5% of all diagnosed adult cases of diabetes. Although it can occur at any age, the peak age for diagnosis of type 1 diabetes is in the mid-teens. The peak age of onset of type 2 diabetes mellitus (T2DM), which was previously known as non–insulin-dependent diabetes mellit ...