pmsims: Simulation-based Sample Size Tools for Prediction Models

pmsims is an R package for estimating how much data are needed to
develop reliable and generalisable prediction models. It uses a
simulation-based learning curve approach to quantify how model
performance improves with increasing sample size, supporting principled
study planning and feasibility assessment.

The package is fully model-agnostic: users can define how data are
generated, how models are fitted, and how predictive performance is
measured. It currently supports regression-based prediction models with
continuous, binary, and time-to-event outcomes.
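
To make the learning-curve idea concrete, the following base R sketch simulates binary-outcome data, fits a logistic regression at several training sample sizes, and records the mean calibration slope in a large independent validation set. It is an illustration only and does not use or reproduce the pmsims internals: the helpers `simulate_data()` and `calibration_slope()`, the coefficient values, the sample size grid, and the number of replications are all assumptions chosen for the example.

```r
set.seed(2024)

# Hypothetical data generator (not part of pmsims): independent
# standard-normal predictors and a binary outcome from a logistic model.
simulate_data <- function(n, beta, intercept) {
  x <- matrix(rnorm(n * length(beta)), nrow = n)
  colnames(x) <- paste0("x", seq_along(beta))
  lp <- intercept + drop(x %*% beta)
  data.frame(y = rbinom(n, size = 1, prob = plogis(lp)), x)
}

# Calibration slope: coefficient from a logistic regression of the observed
# outcome on the fitted model's linear predictor in validation data.
calibration_slope <- function(fit, validation) {
  lp <- predict(fit, newdata = validation, type = "link")
  unname(coef(glm(validation$y ~ lp, family = binomial()))[2])
}

# Assumed data-generating values, chosen only for illustration.
beta <- rep(0.3, 5)
intercept <- -1.5
validation <- simulate_data(20000, beta, intercept)

sample_sizes <- c(100, 200, 400, 800)
n_reps <- 20

# For each training sample size, repeat: simulate, fit, evaluate.
learning_curve <- sapply(sample_sizes, function(n) {
  slopes <- replicate(n_reps, {
    train <- simulate_data(n, beta, intercept)
    fit <- glm(y ~ ., data = train, family = binomial())
    calibration_slope(fit, validation)
  })
  mean(slopes)
})

data.frame(n = sample_sizes, mean_calibration_slope = learning_curve)
```

Under these assumptions, the mean calibration slope should move towards 1 as the training sample size grows. A sample size could then be chosen as the smallest value at which the curve reaches a pre-specified target.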

Developed at King’s College London (Department of Biostatistics &
Health Informatics) with input from researchers, clinicians, and
patient partners. See the pmsims project site for further details.

Installation

Install the development version from GitHub:

# install.packages("remotes")
remotes::install_github("pmsims-package/pmsims")

Minimal example

library(pmsims)
set.seed(123)

binary_example <- simulate_binary(
  signal_parameters = 15,
  noise_parameters  = 0,
  predictor_type = "continuous",
  binary_predictor_prevalence = NULL,
  outcome_prevalence = 0.20,
  large_sample_cstatistic = 0.80,
  model = "glm",
  metric = "calibration_slope",
  minimum_acceptable_performance = 0.90,
  n_reps_total = 1000,
  mean_or_assurance = "assurance"
)

binary_example

Get in touch

We welcome questions, suggestions, and collaboration enquiries.


Funding

This work is supported by the National Institute for Health and Care
Research (NIHR) under the Research for Patient Benefit (RfPB) Programme
(NIHR206858).

The views expressed are those of the authors and not necessarily those
of the NIHR or the Department of Health and Social Care.

GNU General Public License v3.0
Created February 6, 2023
Updated March 3, 2026