國立雲林科技大學 National Yunlin University of Science and Technology
Intelligent Database Systems Lab, N.Y.U.S.T. I. M.

Multiobjective Clustering with Automatic k-determination for Large-scale Data
Presenter: Shao-Wei Cheng
Authors: Nobukazu Matake, Tomoyuki Hiroyasu, Mitsunori Miki, Tomoharu Senda
GECCO 2007

Outline
Motivation
Objective
Methodology
    Original MOCK
    New scalable k-determination scheme
Experiments and Results
Conclusion
Personal Comments

Motivation
Web behavior mining attracts a great deal of attention today.
MOCK is powerful and strict, but its computational cost is too high when it is applied to clustering huge data sets.
Too much data!

Objectives
Apply MOCK to web data clustering with a scalable automatic k-determination scheme.
Determine the appropriate k at low cost.
It contains two complementary objectives:
    Determining the appropriate k.
    Finding the partitions between the k clusters.

Methodology: Original MOCK
[Diagram labels: First Step, Second Step, Third Step, Fourth Step, Gap statistic]

Methodology: New scalable k-determination scheme
[Diagram labels: First Step, Second Step]
First scheme: Calculate adjacent angles [plot axes: x, y]
Second scheme [plot axes: x, x]

Experiments

Conclusion
The new scheme is able to determine the appropriate k at low cost, although its performance is poorer than that of the original algorithm.
It reduces the Pareto size by about 50-70%.
It does not need random data clustering.

Personal Comments
Advantage
    MOCK can be applied to large-scale data.
Drawback
Application
    Web data.
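
The slides name the first scheme's "Calculate adjacent angles" step but do not give its formula. As a minimal sketch only, assume the Pareto front is a list of distinct 2-D points sorted along the x axis, and that the solution of interest sits at the sharpest bend of the front. The angle at each interior point can then be computed from the vectors to its two neighbours. The function names `adjacent_angles` and `knee_index`, the sample front, and the rule "smallest angle wins" are all hypothetical and are not taken from the paper.

```python
import math


def adjacent_angles(front):
    """Return the angle (in radians) at each interior point of a 2-D front.

    Assumption: `front` is a list of distinct (x, y) tuples ordered along
    the front. The angle at point i is measured between the vectors that
    point from i to its previous and next neighbours.
    """
    angles = []
    for i in range(1, len(front) - 1):
        (xa, ya), (xb, yb), (xc, yc) = front[i - 1], front[i], front[i + 1]
        v1 = (xa - xb, ya - yb)  # vector to the previous point
        v2 = (xc - xb, yc - yb)  # vector to the next point
        n1 = math.hypot(v1[0], v1[1])
        n2 = math.hypot(v2[0], v2[1])
        cos_t = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
        # Clamp to [-1, 1] to guard against floating-point overshoot.
        cos_t = max(-1.0, min(1.0, cos_t))
        angles.append(math.acos(cos_t))
    return angles


def knee_index(front):
    """Return the index in `front` of the interior point with the smallest angle.

    Hypothetical selection rule: the sharpest bend marks the candidate solution.
    """
    angles = adjacent_angles(front)
    best = min(range(len(angles)), key=angles.__getitem__)
    return best + 1  # shift by one because the endpoints have no angle


# Hypothetical front: the bend is at (2.0, 1.0).
sample_front = [(0.0, 10.0), (1.0, 4.0), (2.0, 1.0), (3.0, 0.8), (4.0, 0.7)]
print(sample_front[knee_index(sample_front)])
```

In practice the two objectives usually have different scales, so normalizing both axes before computing angles would likely matter. How the selected point maps to a value of k depends on the solution encoding, which the slides do not describe.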