# Exploratory Data Analysis for Complex Models

@article{Gelman2004ExploratoryDA, title={Exploratory Data Analysis for Complex Models}, author={Andrew Gelman}, journal={Journal of Computational and Graphical Statistics}, year={2004}, volume={13}, pages={755--779} }

“Exploratory” and “confirmatory” data analysis can both be viewed as methods for comparing observed data to what would be obtained under an implicit or explicit statistical model. For example, many of Tukey's methods can be interpreted as checks against hypothetical linear models and Poisson distributions. In more complex situations, Bayesian methods can be useful for constructing reference distributions for various plots that are useful in exploratory data analysis. This article proposes an…
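
As a minimal sketch of what comparing observed data to a model-based reference distribution can look like, the Python snippet below runs a simple posterior predictive check. The Poisson likelihood, the Gamma(1, 1) prior, the dispersion statistic, and the function names are all illustrative assumptions and are not taken from the article.

```python
import numpy as np


def posterior_predictive_check(y, stat, n_rep=2000, a=1.0, b=1.0, seed=0):
    """Compare stat(y) with its reference distribution under a Poisson model.

    Assumption (illustrative only): y_i ~ Poisson(lam) with a conjugate
    Gamma(a, b) prior on lam, where a is the shape and b is the rate.
    """
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    n = y.size
    # Conjugate update: lam | y ~ Gamma(shape = a + sum(y), rate = b + n).
    lam = rng.gamma(shape=a + y.sum(), scale=1.0 / (b + n), size=n_rep)
    # One replicated dataset of size n for each posterior draw of lam.
    y_rep = rng.poisson(lam[:, None], size=(n_rep, n))
    t_rep = np.apply_along_axis(stat, 1, y_rep)
    t_obs = stat(y)
    # Tail-area probability: share of replicates at least as extreme as the data.
    p_value = float(np.mean(t_rep >= t_obs))
    return t_obs, t_rep, p_value


def dispersion(v):
    """Variance-to-mean ratio; close to 1 for Poisson-distributed counts."""
    m = np.mean(v)
    return np.var(v) / m if m > 0 else 0.0


# Hypothetical counts with many zeros and a few large values.
y = np.array([0] * 30 + [10] * 10)
t_obs, t_rep, p_value = posterior_predictive_check(y, dispersion)
```

Plotting a histogram of `t_rep` with a vertical line at `t_obs` would give the graphical version of the same comparison: an observed statistic that sits far in the tail of the replicated values suggests the assumed model misses some feature of the data.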

#### 182 Citations

Exploratory Data Analysis

- Computer Science
- 2012

The philosophical justification for EDA is presented in terms of C.S. Peirce's concept of abduction and the recognition of a broad range of analytic needs that arise throughout the research process.

Designing for Interactive Exploratory Data Analysis Requires Theories of Graphical Inference

- Computer Science
- Harvard Data Science Review
- 2021

Without a grounding in theories of human statistical inference, research in exploratory visual analysis can lead to contradictory interface objectives and to representations of uncertainty that discourage users from drawing valid inferences.

Visualization in Bayesian Data Analysis

- Computer Science
- 2008

Modern Bayesian statistical science commonly proceeds without reference to statistical graphics; both involve computation, but they are rarely considered to be connected. Traditional views about the…

Statistical inference for exploratory data analysis and model diagnostics

- Medicine, Biology
- Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
- 2009

The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent, and teachers might find that incorporating these protocols into the curriculum improves their students’ statistical thinking.

Exploratory Data Analysis

- Computer Science
- 2010

Key components of EDA are the combination of statistical models and graphics and the incorporation of domain knowledge, and interactive graphical tools are particularly valuable for exploratory work.

Validation of Visual Statistical Inference, with Application to Linear Models

- 2012

Statistical graphics play a crucial role in exploratory data analysis, model checking and diagnosis. Until recently there were no formal visual methods in place for determining statistical…

Visual Statistical Inference for Regression Parameters

- 2010

Statistical graphics play a crucial role in exploratory data analysis, model checking and diagnosis. Until recently there were no formal visual methods in place for determining statistical…

Getting the most from your curves: Exploring and reporting data using informative graphical techniques

- Computer Science
- 2009

The role of exploratory data analysis in detecting Type I and Type II errors is considered, and it is proposed that essential summary statistics and information about the shape and variability of data be reported via graphical techniques.

Visualizing Count Data Regressions Using Rootograms

- Mathematics
- 2016

The rootogram is a graphical tool associated with the work of J. W. Tukey that was originally used for assessing goodness of fit of univariate distributions. Here, we extend the rootogram to…

Exploratory Data Analysis using Random Forests

- 2015

Although the rise of "big data" has made machine learning algorithms more visible and relevant for social scientists, they are still widely considered to be "black box" models that are not well…

#### References

Showing 1-10 of 107 references

Graphical Methods for Assessing Logistic Regression Models

- Mathematics
- 1984

In ordinary linear regression, graphical diagnostic displays can be very useful for detecting and examining anomalous features in the fit of a model to data. For logistic regression models,…

Two graphical displays for outlying and influential observations in regression

- Mathematics
- 1981

The paper describes two procedures for detecting observations with outlying values either in the response variable or in the explanatory variables in multiple regression. These procedures are…

Probability plotting methods for the analysis of data.

- Mathematics, Medicine
- Biometrika
- 1968

This paper describes and discusses graphical techniques, based on the primitive empirical cumulative distribution function and on quantile (Q-Q) plots, percent (P-P) plots and hybrids of…

Multiple imputation for model checking: completed-data plots with missing and latent data.

- Computer Science, Medicine
- Biometrics
- 2005

The methods of missing-data model checking can be interpreted as "predictive inference" in a non-Bayesian context and the graphical diagnostics within this framework are considered.

Diagnostic checks for discrete data regression models using posterior predictive simulations

- Mathematics
- 2000

Model checking with discrete data regressions can be difficult because the usual methods such as residual plots have complicated reference distributions that depend on the parameters in the model.…

Graphical Methods for Data Analysis

- Computer Science
- 1983

This paper presents a meta-modelling framework for developing and assessing regression models for multivariate and multi-dimensional data distributions and describes the distribution of a set of data.

Models, assumptions and model checking in ecological regressions

- Computer Science, Mathematics
- 2001

Ecological regression is based on assumptions that are untestable from aggregate data. However, these assumptions seem more questionable in some applications than in others. There has been some…

Let's Practice What We Preach: Turning Tables into Graphs

- Computer Science
- 2002

It is shown how the presentations can be improved using graphs that actually take up less space than the original tables, with multiple repeated line plots being a particularly effective tool.

Posterior Predictive $p$-Values

- Mathematics
- 1994

Extending work of Rubin, this paper explores a Bayesian counterpart of the classical $p$-value, namely, a tail-area probability of a "test statistic" under a null hypothesis. The Bayesian…

Bayesian Data Analysis

- Computer Science, Mathematics
- 1995

Detailed notes on Bayesian computation, the basics of Markov chain simulation, regression models, and asymptotic theorems are provided.