data mining
table of contents
Preface
1. What’s it all about?
1.1 Data Mining and Machine Learning
1.2 Simple Examples: The Weather Problem and Others
1.3 Fielded Applications
1.4 The Data Mining Process
1.5 Machine Learning and Statistics
1.6 Generalization as Search
1.7 Data Mining and Ethics
1.8 Further Reading and Bibliographic Notes
2. Input: concepts, instances, attributes
2.1 What’s a Concept?
2.2 What’s in an Example?
2.3 What’s in an Attribute?
2.4 Preparing the Input
2.5 Further Reading and Bibliographic Notes
3. Output: Knowledge representation
3.1 Tables
3.2 Linear Models
3.3 Trees
3.4 Rules
3.5 Instance-Based Representation
3.6 Clusters
3.7 Further Reading and Bibliographic Notes
4. Algorithms: the basic methods
4.1 Inferring Rudimentary Rules
4.2 Simple Probabilistic Modeling
4.3 Divide-and-Conquer: Constructing Decision Trees
4.4 Covering Algorithms: Constructing Rules
4.5 Mining Association Rules
4.6 Linear Models
4.7 Instance-Based Learning
4.8 Clustering
4.9 Multi-Instance Learning
4.10 Further Reading and Bibliographic Notes
4.11 WEKA Implementations
5. Credibility: Evaluating what’s been learned
5.1 Training and Testing
5.2 Predicting Performance
5.3 Cross-Validation
5.4 Other Estimates
5.5 Hyperparameter Selection
5.6 Comparing Data Mining Schemes
5.7 Predicting Probabilities
5.8 Counting the Cost
5.9 Evaluating Numeric Prediction
5.10 The Minimum Description Length Principle
5.11 Applying MDL to Clustering
5.12 Using a Validation Set for Model Selection
5.13 Further Reading and Bibliographic Notes
6. Trees and rules
6.1 Decision Trees
6.2 Classification Rules
6.3 Association Rules
6.4 WEKA Implementations
7. Extending instance-based and linear models
7.1 Instance-Based Learning
7.2 Extending Linear Models
7.3 Numeric Prediction with Local Linear Models
7.4 WEKA Implementations
8. Data transformations
8.1 Attribute Selection
8.2 Discretizing Numeric Attributes
8.3 Projections
8.4 Sampling
8.5 Cleansing
8.6 Transforming Multiple Classes to Binary Ones
8.7 Calibrating Class Probabilities
8.8 Further Reading and Bibliographic Notes
8.9 WEKA Implementations
9. Probabilistic methods
9.1 Foundations
9.2 Bayesian Networks
9.3 Clustering and Probability Density Estimation
9.4 Hidden Variable Models
9.5 Bayesian Estimation and Prediction
9.6 Graphical Models and Factor Graphs
9.7 Conditional Probability Models
9.8 Sequential and Temporal Models
9.9 Further Reading and Bibliographic Notes
9.10 WEKA Implementations
10. Deep learning
10.1 Deep Feedforward Networks
10.2 Training and Evaluating Deep Networks
10.3 Convolutional Neural Networks
10.4 Autoencoders
10.5 Stochastic Deep Networks
10.6 Recurrent Neural Networks
10.7 Further Reading and Bibliographic Notes
10.8 Deep Learning Software and Network Implementations
10.9 WEKA Implementations
11. Beyond supervised and unsupervised learning
11.1 Semi-Supervised Learning
11.2 Multi-Instance Learning
11.3 Further Reading and Bibliographic Notes
11.4 WEKA Implementations
12. Ensemble learning
12.1 Combining Multiple Models
12.2 Bagging
12.3 Randomization
12.4 Boosting
12.5 Additive Regression
12.6 Interpretable Ensembles
12.7 Stacking
12.8 Further Reading and Bibliographic Notes
12.9 WEKA Implementations
13. Moving on: applications and beyond
13.1 Applying Data Mining
13.2 Learning from Massive Datasets
13.3 Data Stream Learning
13.4 Incorporating Domain Knowledge
13.5 Text Mining
13.6 Web Mining
13.7 Images and Speech
13.8 Adversarial Situations
13.9 Ubiquitous Data Mining
13.10 Further Reading and Bibliographic Notes
13.11 WEKA Implementations
Appendix A: Theoretical foundations
Appendix B: The WEKA workbench
References
Index
The following wiki, pages and posts are tagged with
Title | Type | Excerpt |
---|---|---|
basic setup using mac's new gpu | post | Wed, Oct 20, 21, initial setup on mac machine |
Resources for DS & ML & DL | post | Mon, Oct 25, 21, bugs and tuts lists books services & api frameworks |
Data combining using pandas | post | Tue, Oct 26, 21, Coupling multiple dataframes together using DataFrame and Series |
Dataset Collection for dl ml sources | post | Tue, Oct 26, 21, datasets for scikit-learn, public google and nlp projects with awesome-public-datasets, Open Images V6 |
Practical Machine Learning Tools and Techniques | post | Tue, Dec 28, 21, PowerPoint slides for Chapters 1-12. This is a very comprehensive teaching resource, with many PPT slides covering each chapter of the book |
meet-puppeteer.md | post | Browser automation with JavaScript |
Machine learning, deep learning, AI | page | DL/ML concept google search model Artificial Intelligence Project List |
webscraping | page | webscraping lessons, rapa, blackyak, 100 famous mountains, github actions and python install |