A Structured Approach to Data Mining offers a systematic introduction to data mining for students, researchers, and practitioners in computing and information systems. It emphasises conceptual understanding, methodological reasoning, and appropriate application of techniques rather than software-specific implementation.
Organised around the main phases of data mining (problem understanding, data understanding, modelling, and evaluation), the book explains each stage's purpose, key decisions, and pitfalls, and how the stages connect logically.
Covering core topics as well as advanced concepts such as model assumptions and modern learning architectures, the book serves as a primary text or reference for principled data mining practice.
List of Figures
List of Tables
Foreword
Preface
Introduction
- Introduction
The Roots of Data Mining
Setting the Context: The Boundaries of Data Mining
Distinguishing Artificial Intelligence, Data Mining, Machine Learning, and Deep Learning
Data Mining Process and Its Core Phases
The Concept of “Model” in Data Mining
Structure of This Book
- Problem Understanding
Overview
Stakeholder Involvement and Domain Knowledge Integration
Objective and Problem Formulation
Critical Questions in Problem Understanding
The ProFUMe Methodology
Conclusion
- Data Understanding
Overview
How Data Understanding Impacts Subsequent Data Mining Phases
Data Understanding Through EDA
Data Structure
Data Distribution
Data Quality
Conclusion
- Data Preprocessing
Overview
Importance of Data Preprocessing
Core Tasks of Data Preprocessing
Methods for Handling Missing Values
Methods for Addressing Outliers
Methods for Handling Inconsistent Data
Methods for Solving Irrelevant Data
Methods for Addressing Duplicate Data
Methods for Addressing Imbalanced Data
Encoding Methods for Addressing Machine Learning Algorithm Suitability
Conclusion
- Data Modelling (Introduction)
Overview
Supervised Learning Approach
Unsupervised Learning Approach
Unsupervised Learning Tasks: Clustering and Association Rule Mining
Semi-Supervised Learning
Comparison of Supervised, Unsupervised, and Semi-Supervised Approaches
Machine Learning Approaches, Techniques, and Algorithms
Periodicity of Data Modelling
Conclusion
- Data Modelling (Decision Trees)
Overview
Types of Decision Trees
Fundamental Concepts of Decision Trees
Decision Tree Construction Algorithms
Decision Tree Algorithms and Their Considerations
Strengths and Limitations of Decision Trees
Conclusion
- Data Modelling (Regressions)
Overview
Fundamental Concepts of Regressions
Types of Regression Techniques
Examples with Datasets for Different Regression Techniques
Variable Selection in Regression Models
Regression Assumptions in Statistical and Machine Learning Data Modelling
Considerations for Regression Algorithm Selection
Conclusion
- Data Modelling (Neural Networks)
Overview
Neural Networks as the Foundation of Deep Learning
Fundamental Architecture and Concepts
Types of Neural Networks
Neural Network Architectures in Supervised and Unsupervised Learning
Conclusion
- Data Modelling (Clustering)
Overview
Fundamental Concepts of Clustering
Data Points Assignments
Types of Clustering Techniques
Overview of Clustering Algorithms
Determining the Number of Clusters
Applying Clustering Concepts with Example Dataset
Feature Scaling in Clustering
Assessing Clustering Model Quality
Considerations in Selecting Clustering Techniques
Conclusion
- Data Modelling (Association Rules)
Overview
Fundamental Concepts of Association Rules
Types of Association Rules
Considerations in Selecting Association Rules Algorithms
Conclusion
- Data Modelling (Ensemble Models)
Overview
Fundamental Concepts of Ensemble Models
Architectures of Ensemble Models
Existing Algorithms and Implementations for Ensemble Models
Considerations in Selecting an Ensemble Architectural Strategy
Conclusion
- Model Evaluation
Overview
Model Fit and Generalisation
Model Performance
Model Complexity and Interpretability
Sampling Techniques for Model Validation
Feature Importance in Model Evaluation
Efficiency
Robustness
Additional Considerations for Model Deployment
Conclusion
- Data Mining Ethics and Emerging Considerations
Overview
Ethical Foundations in Data Mining
Ethical Data Mining Lifecycle
Ethics Guidelines, Policies and Law
Evolving Ethics in Data Mining
Ethical Challenges in Data Mining for Generative AI
Conclusion
Exercises
References
Index
Chua Hui Na is Professor at Sunway University, Malaysia, specialising in data mining, applied machine learning, and responsible AI. With extensive industry experience in data engineering and analytical systems development, she is involved in nationally funded and industry-driven research and development initiatives that have led to multiple intellectual property and copyright filings.
Basic Information
- 978-629-7646-57-2 (Paperback)

