May 1, 2024
Updated June 25, 2025
19 minute read
Understanding Missing Values: A Comprehensive Guide
Missing values, in the context of data, refer to the absence of information for one or more variables in an observation. Imagine a survey where some respondents skip questions, or a sensor that occasionally fails to record a measurement; these are instances of missing values. While seemingly innocuous, missing data can be a significant hurdle in any data-driven endeavor. Understanding how to identify, interpret, and handle these gaps is a fundamental skill in many analytical professions.
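The survey and sensor scenarios described above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard library: the `observations` data, the variable names, and the helper functions `is_missing`, `missing_counts`, and `complete_cases` are all hypothetical, and the choice to mark a skipped answer with `None` and a failed sensor reading with NaN is an assumption, not a convention taken from this guide.

```python
import math

# Hypothetical observations: None marks a skipped survey question,
# float("nan") marks a sensor reading that failed to record.
observations = [
    {"age": 34, "income": 52000, "temperature": 21.5},
    {"age": None, "income": 48000, "temperature": float("nan")},
    {"age": 29, "income": None, "temperature": 22.1},
    {"age": 41, "income": None, "temperature": 20.8},
]


def is_missing(value):
    """Treat None and NaN as missing; everything else counts as observed."""
    if value is None:
        return True
    return isinstance(value, float) and math.isnan(value)


def missing_counts(rows):
    """Count missing entries per variable across all observations."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + int(is_missing(value))
    return counts


def complete_cases(rows):
    """Keep only observations in which every variable is observed."""
    return [row for row in rows if not any(is_missing(v) for v in row.values())]


print(missing_counts(observations))
print(len(complete_cases(observations)))
```

Dropping incomplete observations, as `complete_cases` does, is only one simple way to handle the gaps, and it discards information; here it would leave a single observation out of four.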
Find a path to learning about Missing Values. Learn more at:
OpenCourser.com/topic/fchd1q/missing
Reading list
We've selected 26 books that we think will supplement your learning. Use these to develop background knowledge, enrich your coursework, and gain a deeper understanding of the topics covered in Missing Values.
Provides a comprehensive overview of multiple imputation, a powerful method for handling missing data.
Classic work on multiple imputation, a powerful method for handling missing data.
Provides a comprehensive overview of missing data, covering topics such as missing data mechanisms, imputation methods, and sensitivity analysis.
Is considered a foundational text in the field of missing data analysis. It provides a rigorous theoretical framework and covers a wide range of methods, including likelihood-based approaches and multiple imputation. It's an essential reference for anyone seeking a deep understanding of the statistical principles behind handling missing data and is often used as a textbook in graduate-level statistics programs.
Offers a practical guide to multiple imputation, a widely used technique for handling missing data. It emphasizes the application of methods using the R package mice, making it particularly useful for those working in R. The book balances theoretical concepts with detailed worked examples, making it accessible to both statisticians and applied researchers.
Provides a user-friendly and comprehensive overview of modern missing data procedures with a focus on applied research. It translates technical literature into accessible guidelines and covers topics like maximum likelihood estimation, Bayesian estimation, and multiple imputation. It's a valuable resource for graduate students and researchers in various disciplines.
Focuses specifically on multiple imputation and its applications. It provides a detailed account of the method, covering both theoretical and practical aspects. It is a good resource for those who want to deepen their understanding of multiple imputation beyond introductory texts.
Provides a comprehensive overview of pattern mixture models, a powerful method for handling missing data.
Provides a practical guide to missing data analysis and treatment.
Provides a comprehensive overview of missing data in longitudinal studies, covering topics such as missing data mechanisms, imputation methods, and sensitivity analysis.
Provides a comprehensive overview of missing data in epidemiologic studies, covering topics such as missing data mechanisms, imputation methods, and sensitivity analysis.
This classic text by the originator of multiple imputation provides a thorough introduction to the theory and application of this important technique, particularly in the context of survey data. While focused on surveys, the core concepts are broadly applicable to missing data problems in many fields. It's a foundational text for understanding the principles of multiple imputation.
Provides a readable account of likelihood-based and Bayesian approaches to analyzing datasets with missing multivariate data. It covers similar topics to Little and Rubin but with a slightly different emphasis. It's a valuable resource for understanding these foundational statistical methods for incomplete data.
Specifically addresses the challenges of missing data in clinical trials. It provides guidance and methods relevant to this specific domain, making it essential for statisticians and researchers working in clinical research. It covers various approaches tailored to clinical trial designs.
Delves into the theoretical underpinnings of handling missing data using semiparametric models. It is a more advanced text suitable for researchers and statisticians with a strong theoretical background. It covers topics like inverse probability weighting and generalized estimating equations.
This concise book provides a clear and sensible introduction to handling missing data, particularly for researchers in the social sciences. It covers key concepts and methods in an accessible manner, making it a good starting point for those new to the topic. It's a valuable resource for applied researchers.
This foundational text on Bayesian data analysis includes discussions on handling missing data from a Bayesian perspective. It covers how missing values can be incorporated into Bayesian models and inference. Essential for those interested in Bayesian approaches to missing data.
This cookbook provides practical recipes for cleaning and handling data using Python libraries like pandas. It includes techniques specifically for dealing with missing values within a Python environment. This is a valuable resource for those focusing on implementing missing data handling in Python.
While not exclusively focused on missing values, this book provides a comprehensive overview of the data cleaning process, which is where handling missing data often begins. It covers various error detection and repair methods, offering a practical perspective on preparing data for analysis. It is useful for gaining background knowledge in the broader context of data quality.
This comprehensive book on regression and multilevel modeling discusses handling missing data within these modeling frameworks. It offers practical advice and examples, particularly relevant for those applying these statistical models to data with missing values. A valuable reference for applied data analysts.
Similar to 'R for Data Science,' this handbook covers data science in Python and includes a section on handling missing data using pandas and NumPy. It explains how missing data is represented and provides practical tools for detection, removal, and imputation in Python. A good resource for those working with Python.
Offers a systematic overview of best practices in data cleaning, including addressing missing data. It's geared towards researchers in computational social sciences but provides valuable, easily-implemented suggestions for improving data quality. It's a good resource for understanding the practical steps involved in preparing data.
While a broader book on longitudinal data analysis, this text includes important considerations for handling missing data within longitudinal contexts. It provides practical guidance and examples for analyzing data collected over time, which often presents unique missing data challenges. Useful supplementary reading for those working with longitudinal datasets.
While a general book on data science in R, this text includes sections on handling missing values using tidyverse packages like tidyr and dplyr. It provides a practical introduction to identifying and addressing missing data within a modern R workflow. Useful for those learning data analysis in R.
For more information about how these books relate to this course, visit:
OpenCourser.com/topic/fchd1q/missing