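Synopsis
Introduction to Data Mining (English edition) gives a comprehensive introduction to data mining and aims to provide readers with the knowledge needed to apply data mining to real-world problems. The book covers five topics: data, classification, association analysis, clustering, and anomaly detection. Except for anomaly detection, each topic is covered in two chapters: the first presents basic concepts, representative algorithms, and evaluation techniques, while the second discusses advanced concepts and algorithms in greater depth. The goal is to give readers a thorough understanding of the foundations of data mining while also introducing the more important advanced topics. The book also includes numerous examples, figures, and exercises.
Table of Contents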
1 Introduction
1.1 What Is Data Mining?
1.2 Motivating Challenges
1.3 The Origins of Data Mining
1.4 Data Mining Tasks
1.5 Scope and Organization of the Book
1.6 Bibliographic Notes
1.7 Exercises
2 Data
2.1 Types of Data
2.1.1 Attributes and Measurement
2.1.2 Types of Data Sets
2.2 Data Quality
2.2.1 Measurement and Data Collection Issues
2.2.2 Issues Related to Applications
2.3 Data Preprocessing
2.3.1 Aggregation
2.3.2 Sampling
2.3.3 Dimensionality Reduction
2.3.4 Feature Subset Selection
2.3.5 Feature Creation
2.3.6 Discretization and Binarization
2.3.7 Variable Transformation
2.4 Measures of Similarity and Dissimilarity
2.4.1 Basics
2.4.2 Similarity and Dissimilarity between Simple Attributes
2.4.3 Dissimilarities between Data Objects
2.4.4 Similarities between Data Objects
2.4.5 Examples of Proximity Measures
2.4.6 Issues in Proximity Calculation
2.4.7 Selecting the Right Proximity Measure
2.5 Bibliographic Notes
2.6 Exercises
……………………………………………