A Structured Approach to Data Mining

A Structured Approach to Data Mining offers a systematic introduction to data mining for students, researchers, and practitioners in computing and information systems. It emphasises conceptual understanding, methodological reasoning, and appropriate application of techniques rather than software-specific implementation.

Organised around the main phases of data mining (problem understanding, data understanding, modelling, and evaluation), the book explains each stage’s purpose, key decisions, and pitfalls, and how the stages connect logically.


Covering both core topics and advanced concepts such as model assumptions and modern learning architectures, the book serves as a primary text or a reference for principled data mining practice.

List of Figures
List of Tables
Foreword
Preface
Introduction

  1. Introduction
    The Roots of Data Mining
    Setting the Context: The Boundaries of Data Mining
    Distinguishing Artificial Intelligence, Data Mining, Machine Learning, and Deep Learning
    Data Mining Process and Its Core Phases
    The Concept of “Model” in Data Mining
    Structure of This Book
  2. Problem Understanding
    Overview
    Stakeholder Involvement and Domain Knowledge Integration
    Objective and Problem Formulation
    Critical Questions in Problem Understanding
    The ProFUMe Methodology
    Conclusion
  3. Data Understanding
    Overview
    How Data Understanding Impacts Subsequent Data Mining Phases
    Data Understanding Through EDA
    Data Structure
    Data Distribution
    Data Quality
    Conclusion
  4. Data Preprocessing
    Overview
    Importance of Data Preprocessing
    Core Tasks of Data Preprocessing
    Methods for Handling Missing Values
    Methods for Addressing Outliers
    Methods for Handling Inconsistent Data
    Methods for Solving Irrelevant Data
    Methods for Addressing Duplicate Data
    Methods for Addressing Imbalanced Data
    Encoding Methods for Addressing Machine Learning Algorithm Suitability
    Conclusion
  5. Data Modelling (Introduction)
    Overview
    Supervised Learning Approach
    Unsupervised Learning Approach
    Unsupervised Learning Tasks: Clustering and Association Rule Mining
    Semi-Supervised Learning
    Comparison of Supervised, Unsupervised, and Semi-Supervised Approaches
    Machine Learning Approaches, Techniques, and Algorithms
    Periodicity of Data Modelling
    Conclusion
  6. Data Modelling (Decision Trees)
    Overview
    Types of Decision Trees
    Fundamental Concepts of Decision Trees
    Decision Tree Construction Algorithms
    Decision Tree Algorithms and Their Considerations
    Strengths and Limitations of Decision Trees
    Conclusion
  7. Data Modelling (Regressions)
    Overview
    Fundamental Concepts of Regressions
    Types of Regression Techniques
    Examples with Datasets for Different Regression Techniques
    Variable Selection in Regression Models
    Regression Assumptions in Statistical and Machine Learning Data Modelling
    Considerations for Regression Algorithm Selection
    Conclusion
  8. Data Modelling (Neural Networks)
    Overview
    Neural Networks as the Foundation of Deep Learning
    Fundamental Architecture and Concepts
    Types of Neural Networks
    Neural Network Architectures in Supervised and Unsupervised Learning
    Conclusion
  9. Data Modelling (Clustering)
    Overview
    Fundamental Concepts of Clustering
    Data Points Assignments
    Types of Clustering Techniques
    Overview of Clustering Algorithms
    Determining the Number of Clusters
    Applying Clustering Concepts with Example Dataset
    Feature Scaling in Clustering
    Assessing Clustering Model Quality
    Considerations in Selecting Clustering Techniques
    Conclusion
  10. Data Modelling (Association Rules)
    Overview
    Fundamental Concepts of Association Rules
    Types of Association Rules
    Considerations in Selecting Association Rules Algorithms
    Conclusion
  11. Data Modelling (Ensemble Models)
    Overview
    Fundamental Concepts of Ensemble Models
    Architectures of Ensemble Models
    Existing Algorithms and Implementations for Ensemble Models
    Considerations in Selecting an Ensemble Architectural Strategy
    Conclusion
  12. Model Evaluation
    Overview
    Model Fit and Generalisation
    Model Performance
    Model Complexity and Interpretability
    Sampling Techniques for Model Validation
    Feature Importance in Model Evaluation
    Efficiency
    Robustness
    Additional Considerations for Model Deployment
    Conclusion
  13. Data Mining Ethics and Emerging Considerations
    Overview
    Ethical Foundations in Data Mining
    Ethical Data Mining Lifecycle
    Ethics Guidelines, Policies and Law
    Evolving Ethics in Data Mining
    Ethical Challenges in Data Mining for Generative AI
    Conclusion

Exercises
References
Index

Chua Hui Na is Professor at Sunway University, Malaysia, specialising in data mining, applied machine learning, and responsible AI. With extensive industry experience in data engineering and analytical systems development, she is involved in nationally funded and industry-driven research and development initiatives that have led to multiple intellectual property and copyright filings.

Basic Information

Author(s): Chua Hui Na
ISBN: 978-629-7646-57-2 (Paperback)
Edition: 1
Publication Year: 2026
Imprint: Sunway University Press
Pages: 416
Binding: Paperback
Dimensions: 153 mm x 229 mm