
popTK — trial outcomes from preclinical data

A hierarchical PK/PD model with only three free parameters per drug that recovered four phase III trials and predicted biomarker-dependent benefit.

Simulated (dashed) versus observed (solid) Kaplan-Meier progression-free survival curves for treatment and control arms, parameterised from preclinical data alone.

Problem

More than half of phase III oncology trials fail for lack of efficacy. Preclinical metrics like xenograft tumour-growth inhibition don't translate cleanly to clinical endpoints such as progression-free survival, and most translation models either over-fit a handful of in-vitro endpoints or carry too many free parameters to be identifiable. The goal: predict whether a drug combination will deliver a clinically meaningful PFS benefit before committing to a trial — using only preclinical data.

Approach

I built a hierarchical model that represents drug sensitivity as a continuous latent variable — log-normally distributed across patients and across cells within a tumour — capturing both between-subject and intratumoral heterogeneity with only three free parameters per drug. Tumour growth is computed by integrating a Hill-type growth-rate function over that sensitivity distribution using quasi-Monte-Carlo (Sobol) quadrature; parameters are fit by differential evolution. Crucially, combination efficacy is predicted from monotherapy parameters alone under a Loewe dose-additivity assumption — without fitting to any combination data.
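As a rough illustration of the quadrature step, the sketch below averages clonal exponential growth over a two-level log-normal IC50 distribution using scrambled Sobol points. It is a minimal sketch under stated assumptions, not the project code: the write-up does not say which three parameters are free, so `mu`, `sigma_patient` and `sigma_cell` are hypothetical names, and the additive log-scale structure and the fixed values `g0`, `k_max` and `hill` are placeholders chosen for the example.

```python
import numpy as np
from scipy.stats import norm, qmc


def hill_growth_rate(dose, ic50, g0=0.05, k_max=0.15, hill=1.0):
    """Net per-day growth rate: untreated rate g0 minus a Hill-type drug effect.

    g0, k_max and hill are illustrative constants, not fitted values.
    """
    return g0 - k_max * dose**hill / (ic50**hill + dose**hill)


def sobol_normal(n_log2, seed):
    """2**n_log2 scrambled Sobol points mapped to standard-normal quantiles."""
    u = qmc.Sobol(d=1, scramble=True, seed=seed).random_base2(n_log2).ravel()
    # Clip so the inverse normal CDF never sees exactly 0 or 1.
    return norm.ppf(np.clip(u, 1e-12, 1 - 1e-12))


def tumour_fold_change(dose, t, mu, sigma_patient, sigma_cell, n_log2=8):
    """Fold change in tumour size at time t for each simulated patient.

    Assumed parameterisation (hypothetical): log IC50 = mu
    + sigma_patient * z_patient + sigma_cell * z_cell.
    """
    z_patient = sobol_normal(n_log2, seed=1)[:, None]  # shape (patients, 1)
    z_cell = sobol_normal(n_log2, seed=2)[None, :]     # shape (1, cells)
    ic50 = np.exp(mu + sigma_patient * z_patient + sigma_cell * z_cell)
    growth = hill_growth_rate(dose, ic50)              # shape (patients, cells)
    # Each tumour is treated as a mixture of clones growing exponentially at
    # their own rate; the quasi-Monte-Carlo mean over the cell axis
    # approximates the integral over within-tumour sensitivity.
    return np.exp(growth * t).mean(axis=1)


untreated = tumour_fold_change(0.0, 30.0, mu=0.0, sigma_patient=0.5, sigma_cell=0.5)
treated = tumour_fold_change(1.0, 30.0, mu=0.0, sigma_patient=0.5, sigma_cell=0.5)
```

A fitting loop could wrap `tumour_fold_change` in a squared-error objective against xenograft measurements and pass it to `scipy.optimize.differential_evolution`; the objective and parameter bounds would depend on data not shown here.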

Simulated and observed clinical outcomes were compared using standard survival analysis (Kaplan-Meier estimation, Cox proportional-hazards regression, and the log-rank test), alongside bootstrap confidence intervals and Metropolis-Hastings sampling to confirm parameter identifiability, so the small parameter count is a tested property rather than an assumption.
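To make the simulated-versus-observed comparison concrete, here is a hand-rolled Kaplan-Meier estimator and two-sample log-rank test in plain NumPy/SciPy. This is a self-contained sketch rather than the project's code: the stack lists lifelines, which provides tested equivalents, and the PFS times below are made up for illustration.

```python
import numpy as np
from scipy.stats import chi2


def kaplan_meier(time, event):
    """Return distinct event times and the Kaplan-Meier survival estimate at each."""
    time, event = np.asarray(time, float), np.asarray(event, bool)
    event_times = np.unique(time[event])
    at_risk = np.array([(time >= t).sum() for t in event_times])
    events = np.array([((time == t) & event).sum() for t in event_times])
    return event_times, np.cumprod(1.0 - events / at_risk)


def logrank_p(time_a, event_a, time_b, event_b):
    """Two-sample log-rank p-value (chi-square statistic, 1 degree of freedom)."""
    time_a, time_b = np.asarray(time_a, float), np.asarray(time_b, float)
    event_a, event_b = np.asarray(event_a, bool), np.asarray(event_b, bool)
    times = np.unique(np.concatenate([time_a[event_a], time_b[event_b]]))
    obs_minus_exp, var = 0.0, 0.0
    for t in times:
        n_a, n_b = (time_a >= t).sum(), (time_b >= t).sum()  # at risk just before t
        d_a = ((time_a == t) & event_a).sum()                # events in arm A at t
        d = d_a + ((time_b == t) & event_b).sum()            # events in both arms
        n = n_a + n_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for a single subject
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    return chi2.sf(obs_minus_exp**2 / var, df=1)


# Made-up PFS times in months (1 = progression observed, 0 = censored).
observed = ([3, 5, 7, 9, 12], [1, 1, 0, 1, 0])
simulated = ([4, 5, 6, 10, 11], [1, 1, 1, 0, 1])
p_value = logrank_p(*simulated, *observed)
```

A large p-value here means the test cannot distinguish the simulated curve from the observed one, which is the sense in which a simulated arm "matches" a trial arm; it is not proof that the two distributions are identical.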

Result

Across 22 drug combinations in 6 tumour types, simulated and observed PFS were largely statistically indistinguishable. Parameterised from preclinical data alone, the model then recovered four historical phase III trials: MONALEESA-7 (matching both arms, log-rank p = 0.978 and 0.893), COLUMBUS, BR.21, and SOLAR-1. It also correctly predicted biomarker-dependent benefit: fitting separate PIK3CA-mutant and wild-type subsets showed that only mutant patients derive meaningful benefit. In the settings tested, roughly 12 xenograft models were sufficient for confident predictions.

Methods & stack

Hierarchical latent-variable modelling
PK/PD
Quasi-Monte-Carlo (Sobol) quadrature
Differential evolution
Loewe additivity (combinations from monotherapy)
Survival analysis (KM · Cox · log-rank)
Bootstrap uncertainty quantification
MCMC / Metropolis-Hastings identifiability
Python · NumPy / SciPy · lifelines
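For the Loewe additivity entry above, the sketch below shows how a combination effect can be computed from monotherapy Hill parameters alone by solving the isobole equation d1/D1(E) + d2/D2(E) = 1 for the effect E. It assumes a fractional effect between 0 and 1 with a maximum of 1 for both drugs; the function names and parameter values are hypothetical and not taken from the project.

```python
from scipy.optimize import brentq


def inverse_hill(effect, ic50, hill):
    """Dose of a single drug giving the fractional effect (0 < effect < 1)."""
    return ic50 * (effect / (1.0 - effect)) ** (1.0 / hill)


def loewe_effect(d1, d2, ic50_1, hill_1, ic50_2, hill_2):
    """Fractional effect of doses (d1, d2) under Loewe dose additivity.

    Uses only the monotherapy parameters (ic50, hill) of each drug.
    """
    if d1 == 0 and d2 == 0:
        return 0.0

    def isobole(effect):
        # Zero when the two doses, each expressed as a fraction of the
        # equally effective single-agent dose, sum to one.
        return (d1 / inverse_hill(effect, ic50_1, hill_1)
                + d2 / inverse_hill(effect, ic50_2, hill_2) - 1.0)

    # The bracket assumes doses are not vanishingly small relative to the IC50s.
    return brentq(isobole, 1e-12, 1.0 - 1e-12)


# Sham combination: a drug combined with itself behaves like the summed dose.
sham = loewe_effect(0.5, 0.5, ic50_1=1.0, hill_1=1.0, ic50_2=1.0, hill_2=1.0)
```

The sham-combination check is a basic sanity property of Loewe additivity: splitting one drug's dose into two "drugs" with identical parameters should reproduce the single-agent effect of the total dose.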