Subtopic Deep Dive
Model Selection Criteria
Research Guide
What Are Model Selection Criteria?
Model selection criteria are statistical methods for choosing the optimal model from a set of candidate models by balancing goodness-of-fit and model complexity to avoid overfitting.
Key criteria include the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and cross-validation. Recent advances refine these for high-dimensional data via penalized regression, notably the Lasso and stability selection (Efron et al., 2004; 9367 citations; Meinshausen and Bühlmann, 2006; 2410 citations). Over 50,000 papers cite foundational works such as least angle regression.
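The fit-complexity trade-off behind AIC and BIC is concrete enough to compute directly. As a minimal NumPy sketch on simulated data (Gaussian criteria up to additive constants; the design and coefficients are illustrative, not drawn from any cited paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + rng.normal(size=n)  # true model uses 2 predictors

def aic_bic(X_sub, y):
    """Gaussian AIC/BIC from the residual sum of squares of an OLS fit."""
    n, k = X_sub.shape
    beta, *_ = np.linalg.lstsq(X_sub, y, rcond=None)
    rss = np.sum((y - X_sub @ beta) ** 2)
    aic = n * np.log(rss / n) + 2 * (k + 1)          # +1 counts the noise variance
    bic = n * np.log(rss / n) + (k + 1) * np.log(n)  # heavier penalty for large n
    return aic, bic

# Score the nested candidate models that use the first j predictors.
results = [(j, *aic_bic(X[:, :j], y)) for j in range(1, p + 1)]
for j, aic, bic in results:
    print(f"predictors={j}  AIC={aic:.1f}  BIC={bic:.1f}")
```

Both criteria share the fit term; only the complexity penalty differs, which is why BIC tends to choose smaller models as n grows.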
Why It Matters
Model selection criteria ensure generalizable predictions in genomics, econometrics, and climate modeling by preventing overfitting in high-dimensional datasets (Molinaro et al., 2005; 1300 citations). Breiman's two cultures framework highlights their role in bridging parametric and algorithmic modeling for robust inference (Breiman, 2001; 4081 citations). In big data analysis, they address variable selection challenges, enabling scalable inference (Fan et al., 2014; 1405 citations).
Key Research Challenges
High-dimensional bias
The Lasso's L1 shrinkage biases large coefficient estimates, which hinders consistent variable selection in high dimensions (Zhang, 2010; 3844 citations). The minimax concave penalty (MC+) nearly removes this bias while retaining computational speed, though finite-sample performance still varies across sparsity and signal-strength regimes.
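The bias MC+ removes is easiest to see in the orthonormal-design case, where the Lasso reduces to soft-thresholding and MC+ to "firm" thresholding. A NumPy sketch (gamma = 3 is an arbitrary illustrative choice):

```python
import numpy as np

def soft_threshold(z, lam):
    """Lasso solution under an orthonormal design: every surviving signal is shrunk by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def mcp_threshold(z, lam, gamma=3.0):
    """MC+ (firm) thresholding: matches the Lasso near zero but leaves
    large signals (|z| > gamma * lam) untouched, hence nearly unbiased."""
    return np.where(np.abs(z) <= gamma * lam,
                    soft_threshold(z, lam) / (1.0 - 1.0 / gamma),
                    z)

z = np.array([0.5, 1.5, 5.0])  # small, moderate, and large effects
print(soft_threshold(z, 1.0))  # large effect shrunk from 5.0 to 4.0: biased
print(mcp_threshold(z, 1.0))   # large effect kept at 5.0: nearly unbiased
```

The constant shrinkage by lam is exactly the bias that prevents the Lasso from being selection-consistent without strong conditions.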
Smoothing parameter choice
Selecting smoothing parameters in semiparametric models involves a choice between REML and GCV, and the theoretically preferred criterion depends on the data structure (Wood, 2010; 7179 citations). Restricted maximum likelihood is more stable but computationally heavier, and its asymptotic advantages still need empirical validation in finite samples.
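GCV itself is cheap to compute for any linear smoother. A minimal sketch with a ridge-penalized polynomial basis standing in for a spline basis (illustrative only, not Wood's method; basis size and the lambda grid are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 80
x = np.linspace(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=n)

# Polynomial basis as a simple stand-in for a penalized spline basis.
X = np.vander(x, 8, increasing=True)

def gcv(lam):
    """Generalized cross-validation score for a ridge-penalized linear smoother."""
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T)
    rss = np.sum((y - H @ y) ** 2)
    edf = np.trace(H)  # effective degrees of freedom of the smoother
    return n * rss / (n - edf) ** 2

lams = 10.0 ** np.arange(-8.0, 2.0)
best = min(lams, key=gcv)
print(f"GCV-selected lambda: {best:.0e}")
```

REML replaces this score with a (restricted) likelihood in which lambda appears as a variance ratio, which is what buys the extra stability at extra cost.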
Model uncertainty quantification
Ignoring model uncertainty leads to overconfident inference; Bayesian model averaging (BMA) addresses this by weighting candidate models by their posterior probabilities (Hoeting et al., 1999; 4104 citations). Its computational cost scales poorly with the number of candidate models, and integration with algorithmic approaches remains unresolved (Breiman, 2001).
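A common cheap approximation weights models by exp(-BIC/2), since BIC approximates -2 log marginal likelihood. A minimal NumPy sketch on simulated data (three hand-picked candidate models; not the full BMA machinery of the tutorial):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 120
X = rng.normal(size=(n, 3))
y = 1.0 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)  # third predictor is pure noise

def bic(cols):
    """Gaussian BIC for an OLS fit on the chosen predictor columns."""
    Xs = X[:, cols]
    beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    rss = np.sum((y - Xs @ beta) ** 2)
    return n * np.log(rss / n) + (len(cols) + 1) * np.log(n)

# exp(-BIC/2) approximates each model's marginal likelihood (up to a constant),
# so normalizing gives approximate posterior model weights.
models = [[0], [0, 1], [0, 1, 2]]
bics = np.array([bic(m) for m in models])
w = np.exp(-(bics - bics.min()) / 2)
w /= w.sum()
for m, wi in zip(models, w):
    print(m, round(wi, 3))
```

Predictions averaged under these weights carry model uncertainty into the final intervals instead of conditioning on one selected model.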
Essential Papers
Least angle regression
Bradley Efron, Trevor Hastie, Iain M. Johnstone et al. · 2004 · The Annals of Statistics · 9.4K citations
The purpose of model selection algorithms such as All Subsets, Forward Selection and Backward Elimination is to choose a linear model on the basis of the same set of data to which the model will be...
Fast Stable Restricted Maximum Likelihood and Marginal Likelihood Estimation of Semiparametric Generalized Linear Models
Simon N. Wood · 2010 · Journal of the Royal Statistical Society Series B (Statistical Methodology) · 7.2K citations
Summary Recent work by Reiss and Ogden provides a theoretical basis for sometimes preferring restricted maximum likelihood (REML) to generalized cross-validation (GCV) for smoothing parameter selec...
Bayesian model averaging: a tutorial (with comments by M. Clyde, David Draper and E. I. George, and a rejoinder by the authors)
Jennifer A. Hoeting, David Madigan, Adrian E. Raftery et al. · 1999 · Statistical Science · 4.1K citations
Standard statistical practice ignores model uncertainty. Data analysts typically select a model from some class of models and then proceed as if the selected model had generated the data. This appr...
Statistical Modeling: The Two Cultures (with comments and a rejoinder by the author)
Leo Breiman · 2001 · Statistical Science · 4.1K citations
There are two cultures in the use of statistical modeling to reach conclusions from data. One assumes that the data are generated by a given stochastic data model. The other uses algorithmic mode...
Nearly unbiased variable selection under minimax concave penalty
Cun‐Hui Zhang · 2010 · The Annals of Statistics · 3.8K citations
We propose MC+, a fast, continuous, nearly unbiased and accurate method of penalized variable selection in high-dimensional linear regression. The LASSO is fast and continuous, but biased. The bias...
Flexible smoothing with B-splines and penalties
Paul H.C. Eilers, Brian D. Marx · 1996 · Statistical Science · 3.6K citations
B-splines are attractive for nonparametric modelling, but choosing the optimal number and positions of knots is a complex task. Equidistant knots can be used, but their small and discrete number al...
High-dimensional graphs and variable selection with the Lasso
Nicolai Meinshausen, Peter Bühlmann · 2006 · The Annals of Statistics · 2.4K citations
The pattern of zero entries in the inverse covariance matrix of a multivariate normal distribution corresponds to conditional independence restrictions between variables. Covariance selection aims ...
Reading Guide
Foundational Papers
Start with Efron et al. (2004) for Lasso foundations via least angle regression, then Hoeting et al. (1999) for Bayesian model averaging addressing uncertainty, followed by Breiman (2001) contrasting statistical cultures.
Recent Advances
Study Wood (2010) on REML for smoothing selection and Zhang (2010) on nearly unbiased MC+ penalty; extend to Fan et al. (2014) big data challenges and Molinaro et al. (2005) resampling comparisons.
Core Methods
Penalized regression (Lasso, MC+), information criteria (AIC/BIC), cross-validation, Bayesian averaging, smoothing penalties (REML/GCV), stability selection.
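Of these, K-fold cross-validation is the most direct to implement from scratch. A self-contained NumPy sketch comparing polynomial degrees on simulated data (the cubic signal and degree grid are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
x = rng.uniform(-1, 1, n)
y = 2.0 * x ** 3 + rng.normal(scale=0.2, size=n)  # true signal is cubic

def cv_mse(degree, k=5):
    """K-fold cross-validated MSE for a polynomial fit of the given degree."""
    idx = rng.permutation(n)
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)           # fit on everything outside the fold
        coef = np.polyfit(x[train], y[train], degree)
        errs.append(np.mean((np.polyval(coef, x[fold]) - y[fold]) ** 2))
    return float(np.mean(errs))

for d in (1, 3, 7):
    print(f"degree={d}  CV-MSE={cv_mse(d):.4f}")
```

The held-out error penalizes both underfitting (degree 1) and excess variance (degree 7), which is the same trade-off AIC/BIC encode analytically.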
How PapersFlow Helps You Research Model Selection Criteria
Discover & Search
Research Agent uses citationGraph on Efron et al. (2004) to map the Lasso's evolution from least angle regression to high-dimensional extensions such as Meinshausen and Bühlmann (2006). An exaSearch query for 'AIC BIC high-dimensional consistency' retrieves 200+ papers, with filters for The Annals of Statistics. findSimilarPapers expands from Wood (2010) REML-GCV comparisons to related smoothing-criteria work.
Analyze & Verify
Analysis Agent runs runPythonAnalysis to compare AIC and BIC on simulated high-dimensional datasets, checking finite-sample performance via bootstrap resampling. verifyResponse (CoVe) cross-checks claims against the Hoeting et al. (1999) BMA tutorial, with GRADE scoring for evidence strength. readPaperContent extracts the asymptotic arguments from Zhang's (2010) MC+ penalty derivations.
Synthesize & Write
Synthesis Agent detects gaps in high-dimensional BIC applications via contradiction flagging across Efron et al. (2004) and Fan et al. (2014). Writing Agent applies latexSyncCitations to compile model comparison tables and latexCompile for publication-ready appendices. exportMermaid generates flowcharts of selection workflows from Breiman (2001) two cultures.
Use Cases
"Compare AIC BIC finite-sample performance in high-dim regression via simulation"
Research Agent → searchPapers 'AIC BIC high-dimensional simulation' → Analysis Agent → runPythonAnalysis (NumPy bootstrap on 1000 datasets) → matplotlib plots of risk curves vs sample size.
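A stripped-down version of that simulation, in plain NumPy with no PapersFlow tooling (sample size, dimension, and replication count are arbitrary illustrative choices), might look like:

```python
import numpy as np

rng = np.random.default_rng(4)
n, p, true_k, reps = 60, 6, 2, 200

def hit_rate(criterion):
    """Fraction of replications in which the criterion picks the true 2-variable model."""
    hits = 0
    for _ in range(reps):
        X = rng.normal(size=(n, p))
        y = X[:, 0] + X[:, 1] + rng.normal(size=n)
        scores = []
        for j in range(1, p + 1):  # nested candidate models of growing size
            Xs = X[:, :j]
            beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
            rss = np.sum((y - Xs @ beta) ** 2)
            penalty = 2 * j if criterion == "aic" else j * np.log(n)
            scores.append(n * np.log(rss / n) + penalty)
        hits += int(np.argmin(scores) + 1 == true_k)
    return hits / reps

aic_rate, bic_rate = hit_rate("aic"), hit_rate("bic")
print(f"AIC hit rate: {aic_rate:.2f}  BIC hit rate: {bic_rate:.2f}")
```

In this sparse, strong-signal regime BIC's heavier penalty recovers the true model more often, while AIC's lighter penalty tends to overselect.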
"Draft LaTeX section comparing Lasso MC+ variable selection proofs"
Research Agent → citationGraph Efron 2004 → Synthesis Agent → gap detection → Writing Agent → latexEditText proofs + latexSyncCitations (Zhang 2010, Meinshausen 2006) → latexCompile PDF.
"Find GitHub code for stability selection in high-dim Lasso"
Research Agent → paperExtractUrls Meinshausen 2006 → Code Discovery → paperFindGithubRepo → githubRepoInspect (R/Python impls) → runPythonAnalysis verification on sample data.
Automated Workflows
Deep Research workflow conducts systematic review of 50+ Lasso papers: searchPapers → citationGraph → DeepScan 7-step verification → structured report with GRADE scores. Theorizer generates hypotheses on REML-GCV tradeoffs: readPaperContent Wood 2010 → contradiction flagging → theory synthesis from Hoeting BMA. DeepScan analyzes Efron LARS: exaSearch → runPythonAnalysis → CoVe chain on selection consistency.
Frequently Asked Questions
What defines model selection criteria?
Statistical methods that balance goodness-of-fit against complexity, including AIC, BIC, cross-validation, and penalized approaches such as the Lasso, used to select a parsimonious model from a set of candidates.
What are main methods in high-dimensional model selection?
Lasso with least angle regression (Efron et al., 2004), MC+ penalty (Zhang, 2010), stability selection (Meinshausen and Bühlmann, 2006), and REML for smoothing (Wood, 2010).
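The stability-selection idea from the answer above is easy to prototype: refit a sparse estimator on random half-samples and keep the variables that survive in a large fraction of refits. A minimal sketch with a hand-rolled coordinate-descent Lasso (the penalty lam = 20 and the 50 subsamples are arbitrary illustrative choices):

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=100):
    """Bare-bones coordinate-descent Lasso (illustrative, not production code)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with coordinate j removed, then soft-threshold.
            r = y - X @ beta + X[:, j] * beta[j]
            z = X[:, j] @ r
            beta[j] = np.sign(z) * max(abs(z) - lam, 0.0) / col_sq[j]
    return beta

rng = np.random.default_rng(5)
n, p = 100, 10
X = rng.normal(size=(n, p))
y = 2.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(size=n)  # only vars 0 and 1 matter

# Stability selection: refit on random half-samples and record how often
# each variable survives the penalty.
B = 50
freq = np.zeros(p)
for _ in range(B):
    idx = rng.choice(n, n // 2, replace=False)
    freq += lasso_cd(X[idx], y[idx], lam=20.0) != 0
freq /= B
print(np.round(freq, 2))
```

Variables with selection frequency above a chosen cutoff (e.g. 0.6) are kept; the frequencies are far less sensitive to the penalty level than any single Lasso fit.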
Which are key papers?
Efron et al. (2004; 9367 citations) on LARS; Hoeting et al. (1999; 4104 citations) on BMA; Breiman (2001; 4081 citations) on modeling cultures; Wood (2010; 7179 citations) on REML.
What open problems exist?
Consistent debiasing in ultra-high dimensions (p >> n); scalable BMA computation over large model spaces; bridging parametric and algorithmic selection (Breiman, 2001; Fan et al., 2014).
Research Statistical Methods and Inference with AI
PapersFlow provides specialized AI tools for Mathematics researchers. Here are the most relevant for this topic:
AI Literature Review
Automate paper discovery and synthesis across 474M+ papers
Paper Summarizer
Get structured summaries of any paper in seconds
AI Academic Writing
Write research papers with AI assistance and LaTeX support
See how researchers in Physics & Mathematics use PapersFlow
Field-specific workflows, example queries, and use cases.
Start Researching Model Selection Criteria with AI
Search 474M+ papers, run AI-powered literature reviews, and write with integrated citations — all in one workspace.
See how PapersFlow works for Mathematics researchers
Part of the Statistical Methods and Inference Research Guide