Subtopic Deep Dive

High-Dimensional Survival Analysis
Research Guide

What is High-Dimensional Survival Analysis?

High-Dimensional Survival Analysis extends Cox proportional hazards models with Lasso penalization and related regularization techniques to handle censored time-to-event data where the number of predictors greatly exceeds the sample size.

This subtopic adapts classical survival models for high-dimensional settings like genomics, incorporating variable selection via Lasso (Tibshirani, 1996; 50,077 citations) and coordinate descent for generalized linear models (Friedman et al., 2010; 16,182 citations). Methods address censoring, competing risks, and biomarker validation using bootstrap resampling (Efron, 1979; 17,076 citations). Over 50,000 papers cite foundational Lasso work applied to survival contexts.
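The core objective these methods optimize can be sketched concretely. The following is a minimal, illustrative NumPy implementation (not code from the cited papers) of the L1-penalized negative Cox log partial likelihood, assuming the Breslow form and no tied event times:

```python
import numpy as np

def penalized_neg_log_partial_likelihood(beta, X, time, event, lam):
    """L1-penalized negative Cox log partial likelihood.

    Minimal sketch: Breslow form, assumes no tied event times.
    beta  : (p,) coefficient vector
    X     : (n, p) covariate matrix
    time  : (n,) observed times
    event : (n,) 1 = event observed, 0 = right-censored
    lam   : Lasso penalty weight
    """
    order = np.argsort(-time)               # sort by descending time
    Xs, es = X[order], event[order]
    eta = Xs @ beta                         # linear predictors
    # cumulative log-sum-exp gives the log of each risk-set sum
    log_risk = np.logaddexp.accumulate(eta)
    log_lik = np.sum(es * (eta - log_risk))
    return -log_lik + lam * np.sum(np.abs(beta))
```

Minimizing this objective over beta is what penalized Cox solvers such as glmnet do at scale; the sketch only evaluates it for a given coefficient vector.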

15 Curated Papers · 3 Key Challenges

Why It Matters

High-Dimensional Survival Analysis identifies prognostic biomarkers from omics data in precision medicine, enabling risk prediction under censoring for cancer and cardiovascular studies. Tibshirani's Lasso (1996) powers variable selection in genomic survival models, reducing false positives when predictors far outnumber samples (p ≫ n). Friedman et al. (2010) provide scalable algorithms for clinical trial analysis, while Efron (1979) supports inference via bootstrapping for small-sample validation.

Key Research Challenges

Variable Selection Bias

Lasso shrinks coefficients to zero but introduces bias in high dimensions, complicating survival hazard estimation (Tibshirani, 1996). Balancing sparsity and prediction accuracy remains difficult under censoring. Bootstrap methods help assess stability (Efron, 1979).
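One way to operationalize the bootstrap stability check mentioned above is to record how often each feature's Lasso coefficient is nonzero across bootstrap resamples. This is an illustrative sketch using scikit-learn's Lasso on synthetic data, not a procedure prescribed by the cited papers:

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(42)
n, p = 120, 20
X = rng.standard_normal((n, p))
# two true signals, the remaining 18 features are noise
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.1 * rng.standard_normal(n)

def selection_frequency(X, y, alpha=0.1, n_boot=100):
    """Fraction of bootstrap resamples in which each coefficient is nonzero."""
    counts = np.zeros(X.shape[1])
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))   # resample rows
        coefs = Lasso(alpha=alpha).fit(X[idx], y[idx]).coef_
        counts += coefs != 0
    return counts / n_boot

freq = selection_frequency(X, y)
# stable signals are selected in (nearly) every resample;
# noise features should have much lower selection frequency
```

Features whose selection frequency stays near 1 across resamples are stable candidates; unstable selections are a symptom of the bias/variance trade-off described above.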

Censoring and Heteroskedasticity

Right-censoring violates standard regression assumptions, motivating robust variance estimators such as White's heteroskedasticity-consistent covariance (White, 1980; 25,793 citations). High-dimensional covariates amplify variance in partial likelihoods. Coordinate descent paths aid regularization (Friedman et al., 2010).
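White's estimator itself is compact; a dependency-light sketch (illustrative only, for an ordinary least-squares fit) of the HC0 "sandwich" covariance:

```python
import numpy as np

def hc0_covariance(X, y):
    """OLS fit plus White's (1980) HC0 heteroskedasticity-consistent covariance.

    Returns (beta_hat, cov), where cov is the sandwich
    (X'X)^-1 X' diag(e_i^2) X (X'X)^-1 built from squared residuals.
    """
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    bread = np.linalg.inv(X.T @ X)
    meat = X.T @ (X * resid[:, None] ** 2)   # X' diag(e_i^2) X
    return beta, bread @ meat @ bread
```

For production work, statsmodels exposes the same estimator via `cov_type='HC0'` in `OLS.fit`.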

Computational Scalability

Fitting penalized Cox models to millions of features demands efficient algorithms like coordinate descent (Friedman et al., 2010). Longitudinal censoring adds complexity (Liang and Zeger, 1986). Software like EZR facilitates medical applications (Kanda, 2012).
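The coordinate-descent update at the heart of these algorithms reduces to soft-thresholding. A minimal NumPy sketch for the plain Lasso objective ½‖y − Xβ‖² + λ‖β‖₁ (the Gaussian case; the full Cox case handled by glmnet adds an outer reweighting loop):

```python
import numpy as np

def soft_threshold(z, g):
    """Proximal operator of the absolute value: shrink z toward zero by g."""
    return np.sign(z) * np.maximum(np.abs(z) - g, 0.0)

def lasso_coordinate_descent(X, y, lam, n_sweeps=200):
    """Minimize 0.5 * ||y - X @ beta||**2 + lam * ||beta||_1 by cyclic CD."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)        # per-feature curvature x_j' x_j
    resid = y.copy()
    for _ in range(n_sweeps):
        for j in range(p):
            resid += X[:, j] * beta[j]   # remove feature j's contribution
            beta[j] = soft_threshold(X[:, j] @ resid, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]   # add the updated contribution back
    return beta
```

Each coordinate update costs O(n), and updates for features whose coefficients stay zero are cheap, which is why this scales to very wide design matrices.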

Essential Papers

1.

Regression Shrinkage and Selection Via the Lasso

Robert Tibshirani · 1996 · Journal of the Royal Statistical Society Series B (Statistical Methodology) · 50.1K citations

We propose a new method for estimation in linear models. The ‘lasso’ minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a con...

2.

A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity

Halbert White · 1980 · Econometrica · 25.8K citations

This paper presents a parameter covariance matrix estimator which is consistent even when the disturbances of a linear regression model are heteroskedastic. This estimator does not depend on a form...

3.

Longitudinal data analysis using generalized linear models

Kung‐Yee Liang, Scott L. Zeger · 1986 · Biometrika · 17.8K citations

This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression par...

4.

Investigation of the freely available easy-to-use software ‘EZR’ for medical statistics

Yoshinobu Kanda · 2012 · Bone Marrow Transplantation · 17.7K citations

5.

Bootstrap Methods: Another Look at the Jackknife

B. Efron · 1979 · The Annals of Statistics · 17.1K citations

We discuss the following problem: given a random sample $\mathbf{X} = (X_1, X_2, \cdots, X_n)$ from an unknown probability distribution $F$, estimate the sampling distribution of some prespecifie...

6.

Regularization Paths for Generalized Linear Models via Coordinate Descent

Jerome H. Friedman, Trevor Hastie, Robert Tibshirani · 2010 · Journal of Statistical Software · 16.2K citations

We develop fast algorithms for estimation of generalized linear models with convex penalties. The models include linear regression, two-class logistic regression, and multi- nomial regression probl...

Reading Guide

Foundational Papers

Start with Tibshirani (1996) for Lasso basics in regression, essential for the Cox adaptation; follow with Efron (1979) for bootstrap handling of censoring uncertainty; White (1980) covers heteroskedasticity-robust covariance estimation.

Recent Advances

Friedman et al. (2010) for coordinate descent scaling to survival GLMs; Kanda (2012) for EZR software implementing penalized models in medical stats.

Core Methods

Penalized partial likelihood via Lasso (Tibshirani, 1996); coordinate descent paths (Friedman et al., 2010); GEE for longitudinal censoring (Liang and Zeger, 1986); bootstrap resampling (Efron, 1979).
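The bootstrap resampling step listed above can be sketched in a few lines. This is an illustrative Efron-style percentile interval on simulated (uncensored) survival times, not code from the cited paper:

```python
import numpy as np

rng = np.random.default_rng(0)
times = rng.exponential(scale=2.0, size=300)   # toy uncensored survival times

def percentile_bootstrap_ci(x, stat, n_boot=2000, alpha=0.05):
    """Efron-style percentile bootstrap CI for a statistic of one sample."""
    boot = np.array([stat(rng.choice(x, size=x.size, replace=True))
                     for _ in range(n_boot)])
    return tuple(np.quantile(boot, [alpha / 2, 1 - alpha / 2]))

lo, hi = percentile_bootstrap_ci(times, np.median)
# (lo, hi) brackets the sample median with ~95% bootstrap coverage
```

Under censoring, the same resampling idea applies, but the statistic would be a Kaplan–Meier or Cox-model quantity rather than a plain median.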

How PapersFlow Helps You Research High-Dimensional Survival Analysis

Discover & Search

Research Agent uses searchPapers and citationGraph on Tibshirani (1996) to map Lasso extensions to Cox models, revealing 50,000+ citing papers in survival analysis. exaSearch queries 'high-dimensional Cox Lasso penalized' for omics applications, while findSimilarPapers links to Friedman et al. (2010) regularization paths.

Analyze & Verify

Analysis Agent applies readPaperContent to extract Lasso partial likelihood equations from Tibshirani (1996), then runPythonAnalysis simulates Cox-Lasso on censored data with NumPy/pandas for hazard ratio verification. verifyResponse (CoVe) with GRADE grading checks bootstrap CIs (Efron, 1979) against statistical claims, ensuring p-value accuracy.

Synthesize & Write

Synthesis Agent detects gaps in competing risks coverage across Lasso papers, flagging contradictions in variable selection. Writing Agent uses latexEditText and latexSyncCitations to draft Cox model equations citing Tibshirani (1996), with latexCompile generating biomarker diagrams via exportMermaid.

Use Cases

"Simulate high-dimensional Cox model with Lasso on synthetic censored survival data"

Research Agent → searchPapers('Cox Lasso survival') → Analysis Agent → runPythonAnalysis (lifelines CoxPHFitter with pandas data handling) → matplotlib survival curves output.
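The survival-curve step of this pipeline can also be sketched without the lifelines dependency, using a bare NumPy Kaplan–Meier estimator (illustrative only; lifelines' KaplanMeierFitter is the more robust choice in practice):

```python
import numpy as np

def kaplan_meier(time, event):
    """Kaplan-Meier survival estimate; returns (event_times, S(t)).

    Minimal sketch for right-censored data: event=1 observed, 0 censored.
    """
    t = np.asarray(time, dtype=float)
    d = np.asarray(event, dtype=int)
    event_times = np.unique(t[d == 1])
    surv, s = [], 1.0
    for u in event_times:
        at_risk = np.sum(t >= u)                 # subjects still in follow-up
        deaths = np.sum((t == u) & (d == 1))     # events at this time
        s *= 1.0 - deaths / at_risk
        surv.append(s)
    return event_times, np.array(surv)
```

Plotting the returned step function against the event times with matplotlib reproduces the survival curves the workflow outputs.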

"Write LaTeX section on penalized Cox hazards citing Tibshirani 1996"

Synthesis Agent → gap detection → Writing Agent → latexEditText (insert partial likelihood) → latexSyncCitations(Tibshirani 1996) → latexCompile → PDF with equations.

"Find GitHub repos implementing random survival forests from high-dim papers"

Research Agent → citationGraph(Tibshirani 1996) → Code Discovery → paperExtractUrls → paperFindGithubRepo → githubRepoInspect → scikit-survival fork outputs.

Automated Workflows

Deep Research workflow scans 50+ Lasso-citing survival papers via searchPapers → citationGraph, producing structured reports on biomarker validation. DeepScan's 7-step chain applies runPythonAnalysis checkpoints to verify heteroskedasticity in Cox residuals (White, 1980). Theorizer generates hypotheses linking coordinate descent to competing risks models (Friedman et al., 2010).

Frequently Asked Questions

What defines High-Dimensional Survival Analysis?

It combines Cox proportional hazards with Lasso penalization for variable selection in settings where the number of predictors exceeds the sample size under censoring (Tibshirani, 1996).

What are core methods?

Lasso-constrained partial likelihood optimization (Tibshirani, 1996) and coordinate descent for GLM paths (Friedman et al., 2010), with bootstrap for inference (Efron, 1979).

What are key papers?

Tibshirani (1996; Lasso, 50,077 citations), Friedman et al. (2010; coordinate descent, 16,182 citations), Efron (1979; bootstrap, 17,076 citations).

What open problems exist?

Scalable inference under competing risks and double robustness in ultra-high dimensions, extending heteroskedasticity corrections (White, 1980).

Research Statistical Methods and Inference with AI

PapersFlow provides specialized AI tools for Mathematics researchers. Here are the most relevant for this topic:

See how researchers in Physics & Mathematics use PapersFlow

Field-specific workflows, example queries, and use cases.

Physics & Mathematics Guide

Start Researching High-Dimensional Survival Analysis with AI

Search 474M+ papers, run AI-powered literature reviews, and write with integrated citations — all in one workspace.

See how PapersFlow works for Mathematics researchers