Simulation studies involve drawing random numbers to understand the properties and behaviour of statistical methods. Statisticians have been using simulation studies since before computers existed (e.g. ‘Student’ in 1908). However, when it comes to simulation studies, we are largely self-taught. It is often hard to understand a simulation study, or even its objective. Indeed, the rationale for...
Outlier detection in functional time series is challenging due to temporal dependence and the
coexistence of magnitude, shape, and partially contaminated anomalies. Existing methods often assume independence or rely on model-based approaches, such as the Standard Smoothed Bootstrap
on Residuals (SmBoR), which may perform poorly under model misspecification. Model-free alternatives, such as...
Spatial analyses in epidemiology often rely on accurate geolocation of individuals to estimate spatially structured health outcomes. However, routinely collected surveillance data frequently lack precise residential coordinates, introducing positional uncertainty that can bias spatial inference. This study examines the impact of uncertainty in patient location on the estimated spatial...
In automatic object detection, reliably counting objects remains challenging, particularly in scenarios with densely packed objects, overlapping instances, large scene variability, or multi-class cases.
Common evaluation metrics for object detection are based on Intersection over Union (IoU) and do not directly measure the correctness of the number of detected objects. Consequently, a model...
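A counting-oriented metric can be sketched independently of IoU-based matching. The toy example below (all data made up) computes the mean absolute and mean relative error of per-image object counts, which directly measure count correctness in the sense discussed above:

```python
import numpy as np

def count_metrics(true_counts, pred_counts):
    """Toy counting metrics: mean absolute error and mean relative error
    of detected-object counts, independent of any IoU-based matching."""
    t = np.asarray(true_counts, dtype=float)
    p = np.asarray(pred_counts, dtype=float)
    mae = np.mean(np.abs(p - t))
    # Relative error, guarded against empty scenes (true count of zero).
    rel = np.mean(np.abs(p - t) / np.maximum(t, 1.0))
    return mae, rel

# Three hypothetical images: 10, 4 and 0 true objects.
mae, rel = count_metrics([10, 4, 0], [8, 4, 1])
print(mae, rel)  # mae = (2 + 0 + 1) / 3 = 1.0
```

A model can score well on IoU-based metrics while these count errors remain large, e.g. when densely packed objects are merged into single detections.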
Missing data is one of the most persistent challenges in environmental monitoring, undermining the reliability of analyses and limiting effective resource management. This issue is particularly critical under European regulations such as the Nitrates Directive (91/676/EEC), which requires accurate monitoring of nitrate concentrations in groundwater to protect ecosystems and public health. Yet,...
When addressing a particular research question using observational data, many decisions must be made during the conceptualization of the statistical analysis plan. This multiplicity of analysis strategies is a well-known problem that leads to high variation in research findings and associated low replicability, since each decision can lead to different results, even if each decision on its own...
Patient reported outcomes (PROs) are routinely used in randomized clinical trials (RCTs) to capture patients’ health status. Symptom-related PROs represent patients’ subjective perception of their health and are often collected multiple times during a clinical trial. For instance, in COPD, breathlessness or cough scores are captured using a small-range ordinal scale (0-4), representing...
Background. Managing patients with multiple chronic conditions is a major challenge in modern health systems, particularly when exercise and lifestyle interventions are delivered in real-world settings. Robust statistical and machine-learning models require carefully designed data structures that capture the complexity of patients’ trajectories, comorbidities, and treatment exposures. In this...
Spatial transcriptomics (ST) is a methodological suite that facilitates the in situ, high-resolution measurement of the transcriptome across a designated tissue section. By integrating transcriptional data with spatial coordinates, ST techniques enable the elucidation of key biological phenomena, including cell-type-specific gene regulatory networks, the spatial patterning of cellular...
Non-communicable diseases (NCDs) impose the largest global burden of morbidity, premature mortality, and healthcare expenditure. To shift from reactive to preventive care, early detection of pre-symptomatic molecular changes is essential. We propose a statistical framework for identifying the most sensitive and robust early molecular predictors of prevalent NCDs — including cardiovascular...
Preclinical experiments form the empirical foundation of translational medicine by assessing the feasibility, safety, and efficacy of new therapeutic approaches. Yet, unlike the highly regulated standards of clinical trials, preclinical research often exhibits substantial methodological heterogeneity, leading to concerns about reproducibility, bias, and the robustness of conclusions. These...
Conventional two-stage procedures for binary-outcome meta-analysis use fixed plug-in estimates of within-study variances and depend heavily on large-sample normal approximations. These assumptions are often untenable and can lead to inaccurate inference, especially in sparse settings. Likelihood-based random-effects models, including the binomial–normal and the hypergeometric–normal (HGN)...
Physiological monitoring often generates data characterized by strong cyclostationarity (circadian rhythm) and sensor artifacts – irregular noise. Conventional models (e.g. ARIMA) often fail to capture the time-varying dependencies or conflate behavioral rhythms with noise. We propose a signal processing framework adapted for digital health data: the Fraction-of-Time (FOT) probability...
Meaningful prediction of when the target number of events will be reached is essential for both sample-size determination and operational planning in event-driven clinical trials. In oncology studies, progression-free survival (PFS) based on RECIST assessments is one of the most commonly used endpoints. Tumor evaluations for determining progression are typically performed at pre-scheduled...
The introduction of standardized reporting guidelines has long been a response to inadequate study descriptions, starting with the CONSORT statement for clinical trials in the 1990s (Begg et al., 1996). One major approach to improve transparency and methodological rigor has been the introduction of standardized reporting guidelines such as the ARRIVE guideline (Percie du Sert et al., 2020). ...
Title: Joint modelling of general and mental health using copula models: a simulation-based evaluation for COVID-19 health research.
COVID-19, the disease caused by the SARS-CoV-2 coronavirus, led to a global pandemic that began in December 2019. In the UK, government-mandated lockdowns were imposed to reduce the spread of the disease and understanding the impact of these actions on the...
This study evaluated the performance of the XGBoost method for imputing missing values in air quality data. The analysis used complete measurements of PM2.5, PM10, SO₂, NO, NO₂, and C₆H₆ recorded in Lublin in January 2020. To simulate missing data, 15%, 20%, and 25% of observations were randomly removed from each variable and imputed using XGBoost trained on the remaining data. Additionally,...
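The remove-and-impute design described above can be sketched as follows. Since the Lublin measurements are not available here, the example uses synthetic correlated pollutant series, and scikit-learn's GradientBoostingRegressor as a stand-in for XGBoost:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for air-quality data: correlated pollutant series.
n = 500
pm10 = rng.gamma(4.0, 10.0, n)
pm25 = 0.7 * pm10 + rng.normal(0, 5, n)   # PM2.5 tracks PM10 plus noise
no2 = 0.3 * pm10 + rng.normal(20, 5, n)

# Randomly remove 15% of PM2.5 values, as in the simulation design.
mask = rng.random(n) < 0.15
features = np.column_stack([pm10, no2])
X_train, y_train = features[~mask], pm25[~mask]

# A boosting model trained on the observed rows imputes the missing ones.
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
imputed = model.predict(features[mask])
rmse = float(np.sqrt(np.mean((imputed - pm25[mask]) ** 2)))
print(round(rmse, 2))
```

Because the removed values are known, the imputation error (here RMSE) can be evaluated directly, which is the point of the remove-and-impute design.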
One of the most serious effects of globalisation and human activity on the environment is air pollution. Nitrogen dioxide is particularly harmful to human health. Monitoring its content in the air over a long period of time allows trends to be assessed, and appropriate measures to be taken to improve air quality. In Poland, the Chief Inspectorate for Environmental Protection and its regional...
The present study investigates the spatial variability of alder (Alnus) pollen concentrations across different regions of Poland during the period 2001–2020. The primary objective was to identify and classify areas within the country that exhibit similar levels of alder pollen occurrence. The analytical results enabled the delineation of zones characterized by comparable concentrations of the...
Selecting clinically meaningful cutoff values for continuous prognostic variables is challenging when the association with risk is U-shaped and competing risks are present. We propose a C index-based method to estimate an optimal pair of cutoff values (c_1,c_2) by directly targeting discriminative accuracy. Our approach first fits a smoothing spline to the log relative hazard from the Fine and...
This paper presents an analysis of the impact of data clustering on the accuracy of Weibull distribution parameter estimation in strength tests of mineral fertilizer granules. Two approaches are compared: traditional clustering into fixed-width intervals and optimal clustering, derived from a correctly constructed Fisher information matrix for clustered data. Maximum likelihood estimators for...
Borrowing external data for use in clinical trials has gained popularity in recent years as such data can be utilized to improve efficiency of the current trial. This is needed especially in trials where recruitment of patients is difficult, for example in rare diseases. To avoid borrowing in the presence of significant bias between the current and external data, the external information is...
Genetic susceptibility plays a particularly important role in early-onset (EO) and severe periodontitis (PD). The genetic risk remains largely unexplained, due to limited sample sizes and heterogeneous phenotypes in genome-wide association studies (GWAS). This study investigates whether current GWAS data can be used to construct a polygenic score (PGS) capturing genetic susceptibility to...
The riverine ecosystems of the Biała and Czarna Lada River valleys are undergoing progressive degradation, accompanied by the spread of invasive plant species. To identify the factors driving these processes, we combined predictive modelling with multivariate analyses. To estimate the odds of habitat invasion and degradation, we used logistic regression and classification trees, which revealed...
Introduction
Rapid results from antimicrobial susceptibility testing (AST) are essential to guide the antimicrobial therapy of critically ill patients. Recent developments have revealed that readily available matrix-assisted laser desorption-ionization-time of flight (MALDI-TOF) mass spectrometry data, which is routinely used for bacterial species identification, can also be used to predict...
Realise D is a public-private partnership of almost 40 partners from academia, regulatory bodies, clinical research institutes and hospitals, patient organizations, pharmaceutical companies, methodologists, and European Research Infrastructures. Realise D is part of the European Union’s Innovative Health Initiative and funded jointly by the EU and industry. The project started officially in...
In oncology trials, tumour-based endpoints, like Progression-Free Survival (PFS), Disease-Free Survival (DFS) or Relapse-Free Survival (RFS), are widely accepted. However, their use is more controversial compared with Overall Survival (OS), due to the subjectivity of tumour assessment, and their sensitivity to censoring rules, as they are more prone to obtaining different results depending on...
The aim of the study was to verify changes in the association between place of residence and depression prevalence before and after the onset of the COVID-19 pandemic. The second objective was to identify indicators of social capital as determinants of the prevalence of depression among people aged 50 years or older living in rural and urban areas.
The study included data from two...
Survival of breast cancer patients treated at Tikur Anbessa Specialized and Teaching Hospital, Addis Ababa, Ethiopia.
Fatuma Hassen, Fikre Enquselassie, Ahmed Ali, Adamu Addissie, Girma Taye, Mathewos Assefa, Aster Tsegaye
Abstract
Purpose: Globally, breast cancer is the most commonly diagnosed cancer and the leading cause of cancer-related deaths in women. The purpose of this...
The growing role of social network sites in building and maintaining social relationships, generating social support, and providing information implies the need to develop a tool for measuring online social networks, intended for use in population studies related to health and quality of life. Models based on Item Response Theory are widely used in test development and have proven advantages...
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition characterized by difficulties in social interaction, communication, and restricted and repetitive behavior patterns. In Brazil, according to the Instituto Brasileiro de Geografia e Estatística (IBGE, 2022), approximately 2.4 million individuals have been diagnosed, and this number may reach six million when unidentified cases are...
Background: Globally, about 9 million neonates are diagnosed with birth asphyxia yearly. In Tanzania, 40.6% of all neonatal deaths are attributed to birth asphyxia. There is a scarcity of evidence on predictors of in-hospital survival among asphyxiated neonates in Tanzania; therefore, this study aimed to determine trends and predictors of survival among neonates who sustained birth asphyxia at...
Background:
Oncology trials remain the largest sector of global drug development. However, their complexity, resource demands, and modest success rates underscore the need for more efficient, patient-centred, and methodologically innovative designs.
Objective:
To provide a longitudinal assessment of interventional oncology trial characteristics and design trends from 2000 through 2025...
The win ratio statistic has gained prominence as an interpretable method for analyzing composite endpoints in clinical trials, typically with a superiority objective. The use of the win ratio requires simulation to estimate the necessary sample size (1). Adapting win statistics to non-inferiority trials and incorporating covariate adjustment remain unresolved methodological challenges...
The pharmaceutical industry has a long history with Bayesian statistics. Already in 1986, Racine, Grieve, Flühler and Smith wrote an article entitled Bayesian Methods in Practice: Experiences in the Pharmaceutical Industry [1], highlighting four typical applications they encountered at that time. Since then, Bayesian methods have been applied to many more problems in the pharmaceutical...
Bootstrap calibration rests on a simple idea: based on a bootstrap sample, one can compute the bootstrap coverage probability of the desired interval. One can then adjust the interval limits until the bootstrap coverage probability approaches the nominal level, e.g. by adjusting the α-level used for interval calculation. Finally, the desired interval is calculated replacing the...
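A minimal sketch of this calibration idea: a simple t-interval for the mean is computed on each bootstrap sample, the sample mean plays the role of the "truth" in the bootstrap world, and the α-level is tuned on a grid until the bootstrap coverage is closest to the nominal 95%. All data are synthetic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.exponential(1.0, 40)   # a skewed sample
theta_hat = x.mean()           # plays the role of the "truth" under the bootstrap
B = 2000

def coverage(alpha):
    """Bootstrap coverage probability of the t-interval at level alpha."""
    hits = 0
    for _ in range(B):
        xb = rng.choice(x, x.size, replace=True)
        half = stats.t.ppf(1 - alpha / 2, x.size - 1) * xb.std(ddof=1) / np.sqrt(x.size)
        hits += (xb.mean() - half <= theta_hat <= xb.mean() + half)
    return hits / B

# Adjust alpha until bootstrap coverage is closest to the nominal 95%.
grid = np.arange(0.01, 0.11, 0.01)
alpha_cal = grid[np.argmin([abs(coverage(a) - 0.95) for a in grid])]

# The final interval uses the calibrated alpha on the original sample.
half = stats.t.ppf(1 - alpha_cal / 2, x.size - 1) * x.std(ddof=1) / np.sqrt(x.size)
print(float(alpha_cal), (x.mean() - half, x.mean() + half))
```

The grid search is a crude stand-in for the iterative adjustment described above; in practice a root-finding step on the coverage curve would be used.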
Analysis of covariance (ANCOVA) assesses the effect of a group factor on a response while accounting for covariate information. We propose a nonparametric ANCOVA based on Mann-Whitney effects, specifically designed for randomized trials. Unlike classical ANCOVA, our approach does not rely on distributional assumptions or metric-scale data; ordinal measurements (such as Likert-scale items) are...
Introduction:
In clinical trials, time-to-event endpoints like time to death, time to hospitalization or time to myocardial infarction are often of primary interest. Although multiple events might be observed per individual, only the time to the first occurring event is considered in the primary analysis. One reason for this could be that guidelines recommend analyzing the data using the same...
This talk will present the closed testing procedure as it was first introduced by Marcus et al. (1976). We discuss further developments and early contributions published in the following years. A special focus is given to conferences such as those in Oberwolfach, Bad Ischl and especially Gerolstein. We also report from the first International Conferences on Multiple Comparison Procedures (MCP)...
Aging is the dominant risk factor for neurodegenerative and systemic diseases, yet its molecular signatures remain obscured within high-dimensional, noisy, and strongly correlated proteomes. To address this challenge, we introduce the Protein Risk Score (ProtRS) framework, a systematic evaluation approach for ProtRS modeling that assesses how different multivariate approaches extract...
The strict control of the studywise Type I error rate has long been a cornerstone of confirmatory clinical trials. Closed testing and adaptive designs are two influential ideas in modern trial methodology, yet they emerged from different motivations: one from the need to rigorously control multiplicity when testing multiple hypotheses, the other from the desire to build flexibility into study...
Polygenic risk scores can be used to model the individual genetic liability for human traits. Current methods primarily focus on modeling the mean of a phenotype, neglecting the variance. However, genetic variants associated with phenotypic variance can provide important insights for gene-environment interaction studies. To overcome this, we propose snpboostlss, a cyclical gradient boosting...
A common goal in medical research is to estimate a difference between treatment groups and quantify its uncertainty, or to infer a population-level difference. The most commonly used nonparametric group difference measure is the Mann-Whitney (MW) effect. It applies to a broad range of outcomes, including skewed, heteroskedastic and ordinal distributions, since it does not assume a parametric...
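The MW effect θ = P(X < Y) + ½·P(X = Y) can be estimated by comparing all pairs of observations between the two groups, with ties counting one half. A minimal sketch with made-up data:

```python
import numpy as np

def mw_effect(x, y):
    """Estimate the Mann-Whitney effect theta = P(X < Y) + 0.5 * P(X = Y)
    by comparing all (x_i, y_j) pairs; ties contribute 1/2."""
    x = np.asarray(x)[:, None]   # column vector
    y = np.asarray(y)[None, :]   # row vector -> broadcast over all pairs
    return float(np.mean((x < y) + 0.5 * (x == y)))

# Under identical distributions theta = 0.5; here Y tends to exceed X,
# so the estimate is above 0.5 (6 wins + 2 ties out of 9 pairs = 7/9).
print(mw_effect([1, 2, 3], [2, 3, 4]))
```

Because the estimator only uses pairwise order comparisons, it applies unchanged to ordinal, skewed and heteroskedastic outcomes, as noted above.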
Multiple-use prediction and calibration for all future values play a valuable role in many areas including health and medical research. Simultaneous tolerance bands (STBs) can be used for these purposes. Motivated by real-world problems in health research, this study focuses on the construction of exact STBs for multiple regression over any given rectangular covariate region and for polynomial...
Researchers in biomedical research often analyse data that are subject to clustering. Independence among observations is generally assumed when developing and validating risk prediction models. For survival outcomes, the Cox proportional hazards regression model is commonly used to estimate an individual’s risk at fixed time horizons. The stratified Cox proportional hazards and the shared gamma...
The Bayesian approach in general has a lot to offer in times of Machine Learning (ML) and Artificial Intelligence (AI). The Bayesian framework itself offers a learning environment, where the prior and, subsequently, the posterior distributions can be updated sequentially, and where human expertise can be incorporated. The approach allows for uncertainty quantification of all quantities of...
In the context of a two-group comparison, when the assumption of equal variances between groups is doubtful or the data may be skewed or ordinal, the classical t-test and an effect measure parameterized in terms of means may no longer be suitable. In such cases, it appears more appropriate to formulate the problem as the nonparametric Behrens-Fisher problem of testing H0: θ = 1/2, where θ =...
Accurate analysis of multiple time-to-event endpoints is a persistent challenge in clinical research, where patients may experience several recurrent non-fatal events alongside a competing fatal event. Conventional survival analysis approaches, such as time-to-first-event analyses or the Cox proportional hazards model, often neglect recurrent events or assume independence between event types,...
Background: The stability of a drug product over time is a critical property in pharmaceutical development. A key objective in drug stability studies is to estimate the shelf-life of a drug, involving a suitable definition of the true shelf-life and the construction of an appropriate estimate of the true shelf-life. Simultaneous confidence bands (SCBs) for percentiles in linear regression are...
In this talk, we will explore the relationship between the closed testing principle for multiple tests with family-wise error rate (FWER) control and the partitioning plus projection principle for constructing simultaneous confidence intervals. Starting with the simple observation that a multiple test with FWER control is formally equivalent to a one-sided simultaneous confidence interval for...
Regularized Multi-Omics Regression Modeling for Transcriptomic–Proteomic Integration in Mice with Induced Liver Damage.
Toxicological compounds exert complex effects on tissues and organisms, which can be investigated using genomic, transcriptomic, and proteomic data. A central challenge lies in understanding the relationship between RNA and protein levels. While these are expected to be...
Lessons learned in the last 25 years
Gerhard Nehmiz, consultant for Boehringer Ingelheim Pharma GmbH & Co. KG, Biberach, Germany
gerhard.nehmiz.ext@boehringer-ingelheim.com
The Working Group "Bayes methods" of the IBS / German Region was founded in 2001, and met a need. It had two roots: The WG "Prognosis and decision making" of the GMDS (U. Mansmann) and the German BUGS User Group...
In toxicology, concentration-response experiments are conducted to investigate the toxic behaviour of a given substance. Typically, a parametric model is fitted and effective concentrations to a viability level p (EC_p) are estimated which are used e.g. in further experiments. However, in previous research, it was observed that the estimated EC_p of the same experiment conducted in the same...
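For illustration, EC_p estimation can be sketched under an assumed two-parameter log-logistic concentration-response model (the exact parametric model in any given study may differ); the data below are synthetic. Once the curve is fitted, EC_p follows in closed form from inverting the viability function:

```python
import numpy as np
from scipy.optimize import curve_fit

def loglogistic(c, ec50, h):
    """Two-parameter log-logistic viability curve: 1 at low, 0 at high doses."""
    return 1.0 / (1.0 + (c / ec50) ** h)

rng = np.random.default_rng(2)
conc = np.array([0.1, 0.3, 1, 3, 10, 30, 100], dtype=float)
# Synthetic viabilities from a true EC50 of 5.0 with small measurement noise.
viab = loglogistic(conc, ec50=5.0, h=1.2) + rng.normal(0, 0.02, conc.size)

(ec50_hat, h_hat), _ = curve_fit(
    loglogistic, conc, viab, p0=[1.0, 1.0],
    bounds=([1e-3, 0.1], [1e3, 10.0]),   # keep parameters positive
)

def ec_p(p):
    """Concentration at which fitted viability equals p; p = 0.5 gives EC50."""
    return ec50_hat * ((1 - p) / p) ** (1 / h_hat)

print(round(float(ec_p(0.5)), 2), round(float(ec_p(0.1)), 2))
```

Repeating the fit over replicated experiments would expose the between-run variability of the estimated EC_p values mentioned in the abstract.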
Feedback is pervasive in biological and biomedical systems, yet many causal discovery methods, including widely used score-based approaches such as NOTEARS, impose acyclicity and may therefore misrepresent gene regulatory, pharmacological, or cellular processes. Building on recent advances in cyclic causal inference, such as the intervention-capable Bicycle method, we investigate how...
The closed testing principle is a fundamental framework to construct multiple testing procedures controlling the familywise error rate in the strong sense. However, a major challenge in the application of the principle is the number of intersection hypothesis tests that need to be specified, which increases exponentially in the number of elementary hypotheses tested and makes it difficult to...
Prognostic Models for Recurrent Event Data
Dr Victoria Watson1,2, Prof Catrin Tudur Smith2, Dr Laura Bonnett2
1 Phastar, London, UK
2 University of Liverpool, Department of Health Data Sciences
Background / Introduction
Prognostic models predict outcome for people with an underlying medical condition. Many conditions are typified by recurrent events such as seizures in epilepsy....
Single-cell RNA sequencing has given researchers unparalleled insight into biological systems. It enables the identification of distinct cellular subpopulations, the characterization of differences between them, and the assessment of overall tissue heterogeneity. Conventional analysis pipelines first cluster individual cells into similar groups and then test for differentially expressed genes...
Penalized regression models such as Lasso, Elastic-Net and their adaptive extensions are widely used for simultaneous variable selection and prediction in high-dimensional data analysis. However, conventional implementations of adaptive Elastic-Net (AdaENet) regression often estimate the adaptive hyper-parameter for the Elastic-Net penalty term using the entire dataset before dividing it into...
Linear hypotheses Hp = y regarding a parameter vector p arise in a wide range of scientific fields, including life sciences, psychology, economics, environmental sciences, and other areas of applied statistics, due to their ability to encode a wide variety of scientific questions using a simple algebraic framework. The unknown parameter vector p can represent, for example, an expectation...
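A standard Wald test of a linear hypothesis Hp = y in an ordinary linear model can be sketched as follows; the data are synthetic, and the hypothesis "both slopes are equal" is encoded in the single row of H:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Simulated linear model: yobs = X @ beta + noise, with equal slopes (2, 2).
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, 2.0])
yobs = X @ beta + rng.normal(0, 1, n)

# Ordinary least squares fit.
XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ yobs
resid = yobs - X @ b
s2 = resid @ resid / (n - 3)   # residual variance estimate

# Linear hypothesis H p = y: "the two slope parameters are equal".
H = np.array([[0.0, 1.0, -1.0]])
y0 = np.array([0.0])
diff = H @ b - y0
W = diff @ np.linalg.inv(H @ XtX_inv @ H.T * s2) @ diff   # Wald statistic
p = stats.chi2.sf(W, df=H.shape[0])
print(round(float(W), 2), round(float(p), 3))
```

Adding further rows to H (and entries to y0) expresses joint hypotheses about several linear combinations of the parameter vector, which is the algebraic flexibility the abstract refers to.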
We consider the problem of testing multiple null hypotheses, where a decision to reject or retain must be made for each one and where, in a real-life context, incorrect decisions may inflict different losses. We argue that traditional methods controlling the Type I error rate may be too restrictive in this situation and that the standard familywise error rate may not be appropriate. For...
Background
Longitudinal observational data frequently involve time-varying confounding, autoregressive dependence, and potential reciprocal feedback between processes. These features complicate the estimation of cross-lagged causal effects and challenge the assumptions underlying standard modelling approaches. Methodological evaluation requires transparent, reproducible simulation frameworks...
Various estimators for modelling the transition probabilities in multi-state models have been proposed, e.g., the Aalen-Johansen estimator, the landmark Aalen-Johansen estimator, and a hybrid Aalen-Johansen estimator. While the Aalen-Johansen estimator is generally only consistent under the rather restrictive Markov assumption, the landmark Aalen-Johansen estimator can handle non-Markov...
Background: Post-COVID Condition (PCC) affects a substantial proportion of individuals following SARS-CoV-2 infection, and the mechanisms driving symptom persistence remain an area of active research. Identifying risk factors associated with PCC development is important for targeted prevention strategies and clinical management. Machine learning (ML) models offer powerful tools for prediction...
The recently conducted observational Embryotox cohort study on mRNA COVID-19 vaccination aimed to assess the safety of mRNA COVID-19 vaccines in pregnancy. Here, we focus on the methodological approach used to assess the effect of the vaccination on adverse pregnancy outcomes such as spontaneous abortion and stillbirth. The data featured delayed study entry and cohort crossover as well as...
Functional Data Analysis (FDA), focusing on data composed of functions or curves, has become increasingly popular. We study reliable methods for comparing multiple groups of functional data, especially in studies involving several factors or complex designs. We introduce a new statistical approach designed for multivariate functional data. Our methods are reliable because they allow us to...
Understanding the relative importance of genetic, molecular and environmental factors is crucial for interpretable prediction models in biomedicine and for targeted prevention. While classical regression-based approaches provide direct interpretability through model coefficients, flexible machine learning (ML) approaches such as random forests and neural networks typically rely on post-hoc...
In this talk we present a statistical approach to evaluate the relationship between variables observed in a two-factors experiment. We consider a three-level model with covariance structure ${\bf \Sigma} \otimes {\bf \Psi}_1 \otimes {\bf \Psi}_2$, where ${\bf \Sigma}$ is an arbitrary positive definite covariance matrix, and ${\bf \Psi}_1$ and ${\bf \Psi}_2$ are both correlation matrices with...
Maintaining balance is a crucial daily skill, and impairments in postural control increase the risk of falls, particularly among older adults. This study aimed to assess the effect of attentional control on postural stability in young and older adults. The sample consisted of 43 participants (16 older adults, 12 women; 27 young adults, 13 women). Participants performed a series of 60-second...
Non-linear regression models are flexible approaches used to model complex associations. In many recent proposals, additional flexibility comes at the cost of loss of interpretability of the model's parameters and, consequently, of the data analysis results. We introduce a flexible model whose parameters are easily interpretable. In particular, the model incorporates non-linear effects through...
We consider a two-arm randomized clinical trial in precision oncology with time-to-event endpoint. Patients in the control arm receive standard of care (SOC) treatment whereas patients in the experimental arm are offered personalized treatment, e.g. on the basis of molecular characterization of the disease. However, some patients in the experimental arm will not receive personalized treatment...
Our systematic review indicates that optimal design methods are not yet applied in immunization studies in which modeling the antibody kinetics, i.e. the change of antibody concentration over time, is the main objective. We argue that this substantial underutilization is driven by several factors, including limited awareness of the advantages of optimal design and accessibility of convenient...
Introduction
Metabolomics measures small molecules (called metabolites) in cells, tissues, and biofluids that represent intermediates and/or end-products of biochemical and cellular processes. As a result, metabolomics has proven useful for predicting disease risks or associated biomarkers. Given the large data complexity and size, machine learning (ML) represents an appropriate...
Testing independence between functional observations remains a fundamental challenge in modern statistics, particularly in settings involving high-dimensional or infinite-dimensional random objects. The presented work introduces a new framework for independence testing in functional data based on the distance of mean embedding (DIME), a metric recently proposed as a flexible alternative to...
Adaptive and, in particular, group-sequential designs are well-established in clinical trials. Time-to-event endpoints pose particular challenges because individual participants can contribute data to multiple stages of the trial. Nevertheless, the log-rank test - the standard analysis method for time-to-event data - can be embedded in flexible adaptive designs (e.g. with sample-size...
Bayesian graphical models are powerful tools to infer complex relationships in high dimension, yet are often fraught with computational and statistical challenges. If exploited in a principled way, the increasing information collected alongside the data of primary interest constitutes an opportunity to mitigate these difficulties by guiding the detection of dependence structures. For instance,...
Evaluating a response variable in relation to exposure time or dose is a pivotal objective in the assessment of a compound's effect, particularly when determining toxicity in pre-clinical research or pharmacokinetics in clinical trials. The determination of an alert, such as the EC50 value, at which a pre-specified threshold of the response variable is crossed, is an important tool for the...
The assessment of allogeneic stem cell transplantation (SCT) over standard continued chemotherapy in a clinical trial of childhood leukaemia is not straightforward. Standard chemotherapy will be stopped and SCT performed if a donor search identifies a suitable stem cell donor in registries of potential donors. Randomization to SCT or continued chemotherapy is usually not feasible due to...
This paper focuses on drawing information on underlying processes, which are not directly observed in the data. In particular, we work with data in which only the total count of units in a system at a given time point is observed, but the underlying process of inflows, length of stay, and outflows is not. The particular data example looked at in this paper is the occupancy of intensive care...
Objectives: To evaluate whether machine learning (ML) applied to comprehensive claims data without diagnostic codes can distinguish a high proportion of antibiotic treatment episodes as urinary tract infection (UTI) or non-UTI cases. Such approaches may be valuable for antimicrobial stewardship when diagnosis-linked datasets are unavailable.
Methods: Outpatient antibiotic prescription claims...
Quadratic forms, such as the rank-based Wald-type statistic or the rank-based ANOVA-type statistic, are widely used to compare multivariate distributions without the necessity of parametric assumptions (like multivariate normality). These tests have two major limitations, however:
i) They are, by construction, omnibus tests and thus not able to locate which specific dimensions (variables) are...
Introduction
Monitoring the clinical performance of healthcare units (e.g. hospitals, surgeons) is the main component for national audits, enabling identification of ‘outlier’ units whose clinical performance, e.g. in-hospital mortality, deviates significantly from expected performance. Accurate detection and subsequent management of outliers are critical for improving healthcare quality. ...
Clinical trials often show treatment curves that diverge early and converge later, or vice versa—patterns that are poorly captured by the proportional-hazards assumption. We develop a joint inferential framework for two nonparametric functionals of censored survival data: the Kaplan–Meier–based Mann–Whitney effect and a novel temporal contrast separating early and late differences. The...
Single-cell technologies provide an unprecedented opportunity for dissecting the interplay between cancer cells and the associated tumor microenvironment, and the resulting high-dimensional omics data should also augment existing survival modeling approaches for identifying tumor cell type-specific genes predictive of cancer patient survival. However, there is no statistical model to...
Genome-wide association studies (GWAS) often identify genomic regions containing hundreds or thousands of genetic variants with comparable statistical evidence. Extensive linkage disequilibrium (LD) and the sparsity of causal variants obscure association signals, hindering the identification of true causal variants underlying complex traits. Fine-mapping approaches are introduced to...
Background: Risk prediction models are increasingly being used in clinical practice to predict health outcomes. These models are often developed using data from multiple centres (clustered data) where patient outcomes within a centre are likely to be correlated. It is important that the dataset used to develop a risk model is of an appropriate size, to avoid model overfitting problems and poor...
In many trials and experiments, subjects are not only observed once, but multiple times, resulting in a cluster of possibly correlated observations. Mice sharing the same cage or students in the same class are typical examples of clustered data. Typically, under the assumption of normally distributed data, mixed models are used for analysis.
However, this model assumption is...
Time-to-event variables are among the most relevant primary efficacy endpoints in clinical trials, particularly in later phase oncology trials. When the proportional hazards assumption is expected to be severely violated, an alternative to the log-rank test is needed. Testing for differences in survival probabilities at a pre-defined time point offers one such option and has already been...
Metabolite discovery can provide insights into disease mechanisms and help to identify potential biomarkers that contribute to the development of new treatments. We present a self-supervised deep learning approach for metabolite discovery. Molecular intensity distributions obtained via MALDI-MSI (matrix-assisted laser desorption/ionization mass spectrometry imaging) are compared with...
In many one-sample testing settings, one encounters the situation that the null hypothesis of interest is not precisely known. A prominent example arises in single-arm phase II trials, where the goal is to compare the response rate of a treatment to that of an established gold standard. In practice, the true response rate of the gold standard is virtually always unknown and hence must be...
Early clinical trials play a critical role in drug development. The main purpose of early trials is to determine whether a novel treatment demonstrates sufficient safety and efficacy signals to warrant further investment (Lee & Liu, 2008). The new open source R package phase1b is a flexible toolkit that calculates many properties to this end, especially in the oncology therapeutic area. The...
The use of combination treatments in early phase oncology trials is growing. The objective of these trials is to search for the maximum tolerated dose combination from a pre-defined set. However, cases in which the initial set of combinations does not contain one close to the target toxicity level pose a significant challenge. There is uncertainty around how to handle these situations...
When estimating the treatment effect after a group sequential test or a more complex adaptive design, the maximum likelihood estimate is liable to be biased. The ICH E20: Guideline on Adaptive Designs for Clinical Trials has “reliability of estimation” as a key topic. Methods have been developed to reduce the bias in estimators after group sequential and adaptive designs – or even eliminate...
The COVID-19 pandemic has led to excess mortality worldwide. Notably, the reported numbers of excess deaths are different from the numbers of deaths from COVID-19. Evaluating pandemic-related mortality should therefore not only be based on cause-of-death data but also on external life tables to enable calculation of population-based measures of the difference between observed and expected...
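The population-based comparison of observed and expected deaths described above can be sketched as follows (all rates and counts below are hypothetical illustrations, not figures from the study):

```python
# Expected deaths from an external life table: sum over age groups of
# (background mortality rate) x (person-years at risk); excess deaths
# are then observed minus expected.
life_table_rate = {"60-69": 0.010, "70-79": 0.025, "80+": 0.080}  # per person-year
person_years    = {"60-69": 50_000, "70-79": 30_000, "80+": 10_000}
observed_deaths = 2_600

expected = sum(life_table_rate[age] * person_years[age] for age in life_table_rate)
excess = observed_deaths - expected
print(round(expected), round(excess))  # 2050 550
```

This is what makes external life tables essential: the expected count does not depend on cause-of-death coding in the study population.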
Optimal designs maximize the experimental efficiency and precision, but are sometimes difficult to obtain, especially in cases with non-trivial underlying model functions. A possible application area providing the motivating example is toxicology. Liver carcinoma cells are modelled as a function of valproic acid (VPA) concentration using the common four-parameter log-logistic (4PLL) model. To...
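As an illustration of the 4PLL model mentioned above, a minimal sketch (one common parameterization; the abstract does not specify which variant is used, and all parameter values here are hypothetical):

```python
def four_pll(x, b, c, d, e):
    """Four-parameter log-logistic response at concentration x:
    c, d are the lower/upper asymptotes, e is the EC50 (half-maximal
    concentration), and b controls the steepness of the curve."""
    return c + (d - c) / (1.0 + (x / e) ** b)

# At x = EC50 the response is exactly halfway between the asymptotes:
print(four_pll(10.0, b=1.5, c=0.0, d=100.0, e=10.0))  # 50.0
```

With b > 0 the curve decreases from d towards c as concentration grows, matching a viability read-out; optimal designs then choose the concentrations x at which this curve is most informative about the parameters.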
Longitudinal or clustered data often arise in clinical research, potentially violating the independent and identically distributed (i.i.d.) assumption. In regression, (generalized) linear mixed-effect models are frequently used to account for the correlation structure of the data, but these come with restrictions such as the linearity assumption and pre-specification of predictors and their...
Fast-track procedures play an important role in the registration of health products, such as registration processes for digital health applications. These procedures offer the potential for patients to access innovative products earlier. The procedures involve two registration steps. Applicants can first apply for conditional registration. A successful conditional registration provides a...
The concepts of interim analysis and adaptation are increasingly used in clinical trials. Furthermore, one often has several endpoints, such as progression-free survival (PFS) and overall survival (OS), as discussed in the recent paper by Danzer et al. (2022). There, adaptive group sequential one-sample tests of hypotheses on the distributions of PFS and OS are developed. The...
Bayesian Methods in Registered Clinical Trials: A Systematic Review of Studies on ClinicalTrials.gov Through 2024
Giles Partington & Christina Geyer: Phastar
Bayesian methods are increasingly being incorporated into clinical trial designs to improve flexibility, efficiency, and interpretability. Earlier reviews of published studies (Lee & Chu, 2012) illustrated how these approaches were...
Machine learning (ML) models have emerged as a powerful alternative to traditional statistical methods due to their flexibility and ability to leverage large-scale, high-dimensional datasets. However, in sensitive application areas such as clinical and prognostic modeling, deploying ML models requires interpretability in order to reveal underlying model behavior, identify influential risk...
Reliable estimation of treatment effects is essential for the benefit–risk assessment supporting the approval of new drugs and for the communication of trial results in the European Public Assessment Report (EPAR) and Summary of Product Characteristics (SmPC). In adaptive designs, where trial adaptations such as sample size re-assessment, population enrichment, or treatment arm selection are...
Regression models for the hazard function have been proposed on both a multiplicative and an additive scale. In medical research the former is often suitable, but in some instances, it is more biologically plausible to assume an additive effect on the mortality rate. The best-known example is in population-based cancer patient survival, where the presence of cancer is assumed to have an...
Graph-based multiple testing procedures provide an intuitive way to define closed testing strategies that control the family-wise error rate (FWER) in fixed sample settings [1]. They have been extended to adaptive trial designs based on the (partial) conditional error rate (CER) method [2]. These procedures control the FWER in two-stage designs where the trial is adapted after an interim...
In adaptive clinical trials, the conventional point estimators of the treatment effect are prone to bias. Similarly, the conventional confidence intervals are prone to incorrect coverage, as well as other undesirable statistical properties. Recent regulatory guidance, such as ICH E20, has highlighted the need to use adjusted estimators and confidence intervals for adaptive designs in order to...
Introduction:
Machine learning (ML) validation studies can often be tackled with standard statistical inference methods, i.e. confidence intervals and statistical tests. While this is reasonable in many situations, there are also conditions under which the usual i.i.d. assumption is not met, and operating characteristics (coverage probability, type I error rate) may thus deteriorate. For...
The Bayesian Logistic Regression Model (BLRM) with Escalation With Overdose Control (EWOC) is widely used in Phase I oncology trials. Recently, several publications have highlighted a recurring issue: escalation can be blocked even when the observed data strongly suggest safety. That is, the posterior overdose probability at the next dose remains above the EWOC threshold despite no dose-limiting...
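The EWOC criterion described above can be sketched as a simple feasibility check on posterior draws of the dose-limiting-toxicity (DLT) probability; the 0.33/0.25 cut-offs and the Beta posterior below are illustrative assumptions, not the trial's actual model:

```python
import random

def ewoc_allows_escalation(posterior_dlt_samples, overdose_cut=0.33, feasibility=0.25):
    """Escalate only if the posterior probability that the DLT rate at the
    next dose exceeds `overdose_cut` stays below the `feasibility` bound."""
    p_overdose = sum(p > overdose_cut for p in posterior_dlt_samples) / len(posterior_dlt_samples)
    return p_overdose < feasibility, p_overdose

random.seed(1)
draws = [random.betavariate(2, 8) for _ in range(10_000)]  # hypothetical posterior, mean 0.2
ok, p_od = ewoc_allows_escalation(draws)
print(ok)  # True: overdose probability is comfortably below the 0.25 bound
```

The blocking issue arises when `p_overdose` stays above the feasibility bound even though few or no DLTs have been observed at the current dose.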
In many medical applications of event-history analysis, individuals may experience several intermediate events before death, and a non-negligible proportion of deaths is unrelated to the disease under study. While standard multi-state models evaluate the occurrence of different events over time, they do not explicitly model mortality from disease-related causes and from other (population)...
We propose a frequentist, adaptive trial design to investigate the safety and efficacy of three dose levels compared to placebo for the treatment of worm infections. As the safety of the highest dose is not yet established, the study starts with the two lower doses and the control arm. Based on safety and efficacy endpoints observed in an interim analysis, it is decided to either continue...
Effective air quality regulation and climate change mitigation depend on reducing greenhouse gas emissions and air pollutants. To achieve sustainable green development, this study constructs a spatio-temporal hierarchical model to assess the Air Quality Index (AQI) across Nigerian states and to derive actionable insights for environmental sustainability. Specifically, this study develops a...
In many real-world datasets, observations are hierarchically structured, such as students nested within classrooms, hospitals within cities, or repeated measurements from the same patient. Performing machine learning without accounting for this clustered structure can lead to biased predictions and misleading interpretations of feature effects.
Recently, Mixed Effect Machine Learning, an...
Clinical trials have become more complex in recent decades. They have gradually become longer, involving more centers and more patients. With these trends, interim analyses of ongoing trials have become much more common, and much research has been done on the design and analysis of data from adaptive trials. By now, group-sequential trials are probably more frequent than single-stage...
In the era of precision medicine with increasing molecular information, multi-state models are required to capture individual disease pathways, along with their underlying etiologies, with greater precision. In particular, the availability of big data with numerous covariates induces several statistical challenges for model building. For multi-state models based on high-dimensional data,...
Relative survival techniques are often used to assess excess mortality in a specific study population by splitting observed mortality into background and excess components. These methods have been widely used to estimate cancer-specific mortality without the need for precise cause-of-death data for cancer patients. However, applying these techniques to other settings, such as pandemics or...
Background: We consider clinical trials in which the experimental treatment may have heterogeneous effects across pre-specified patient subpopulations. In such settings, two-stage adaptive enrichment designs allow the enrolled population to be modified at an interim analysis. In stage 1, patients are enrolled from the full population, and based on interim data and preplanned selection rules,...
Background: Traditional binary classification assessment in machine learning relies heavily on decision thresholds, limiting interpretability and performance in imbalanced scenarios. While metrics like AUC under ROC (Receiver Operating Characteristic curve) provide overall performance measures, they fail to deliver class-specific insights, which is crucial for real-world applications with...
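As the abstract notes, the AUC itself is threshold-free; it equals the Mann–Whitney probability that a randomly chosen positive is scored above a randomly chosen negative. A minimal sketch with made-up scores:

```python
def auc_mann_whitney(pos_scores, neg_scores):
    """ROC AUC as a rank statistic: the probability that a random positive
    outranks a random negative, with ties counting one half."""
    pairs = len(pos_scores) * len(neg_scores)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / pairs

print(auc_mann_whitney([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))  # 0.8888888888888888
```

Precisely because no threshold enters this computation, the AUC cannot deliver the class-specific insights the abstract is concerned with.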
Concentration-dependent cytotoxicity experiments are frequently used in toxicology. Although it has been reported that an adequate choice of concentrations, i.e., the design, substantially improves the quality of statistical inference, a recent literature review of three major toxicological journals showed that these methods are rarely used in toxicological practice.
In this talk, we address...
Concentration-response curves model the relationship between a concentration of a compound and the response it elicits in a biological system. Here, the viability of cells is considered as response. Typically, parametric models are fitted to the data. Modeling this relationship accurately is crucial for understanding the safety and potency of compounds, since one of the applications of these...
Preclinical research is rich with nontrivial design problems that demand statistical leadership. These opportunities exist across in vivo and high-throughput in vitro research systems where statisticians can materially improve translational fidelity by aligning biological questions, design, and analysis to support decision making. In this talk, I will discuss examples that have arisen from a...
In this study, we investigate the individuality and information content of infrared molecular profiles derived from blood samples in a large, longitudinal health-profiling cohort and compare them to a standard clinical laboratory panel. Using Fourier-transform infrared spectroscopy, we obtained comprehensive molecular fingerprints from 4,704 self-reported healthy individuals over five visits...
The pseudo-observation regression approach provides a flexible alternative to the omnipresent proportional hazards model when modeling time-to-event outcomes. In this approach, estimands representable as expectations are fitted to regression models using covariates of interest. Exemplary estimands that fit this framework are the restricted mean time lost (in competing risks models) or the...
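The pseudo-observation idea can be sketched with the jackknife formula theta_i = n*theta - (n-1)*theta_(-i); the mean is used below only as a toy estimand (in the survival setting, the estimator would be, e.g., a Kaplan–Meier or restricted-mean quantity):

```python
def pseudo_observations(data, estimator):
    """Jackknife pseudo-observations: theta_i = n*theta - (n-1)*theta_minus_i,
    where theta_minus_i is the estimate with observation i left out."""
    n = len(data)
    theta = estimator(data)
    return [n * theta - (n - 1) * estimator(data[:i] + data[i + 1:]) for i in range(n)]

mean = lambda xs: sum(xs) / len(xs)
# For the mean, the pseudo-observations reproduce the raw observations:
print(pseudo_observations([1.0, 2.0, 6.0], mean))  # [1.0, 2.0, 6.0]
```

The pseudo-values can then be used as outcomes in an ordinary regression model, which is what makes the approach so flexible for censored data.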
In genetic association studies, Mendelian Randomization (MR) is a popular tool for inferring causal relationships between traits using genetic variants as instrumental variables. Recent methods have been proposed as tools that can infer the causal direction between two phenotypes including MR Steiger, bidirectional MR, causal direction-ratio, causal direction-Egger, and causal direction-GLS....
Multiverse analysis offers a powerful framework to assess the robustness of statistical inferences across a spectrum of plausible analytical choices. However, when applied to predictor selection, especially in high-dimensional settings, the issue of multiplicity becomes critical. In this study, we present a comprehensive simulation framework to evaluate the impact of different multiple testing...
Multivariable regression models are a powerful statistical tool with countless applications in explanatory and predictive settings. One key challenge is variable selection: deciding which variables to include or exclude, particularly when dealing with large numbers of candidate predictors. In biomedical data, non-linear relationships between the candidate predictors and the...
Occam’s Razor suggests that, among several plausible explanations for a phenomenon, the simplest is preferable. Applied to regression analysis, this implies that the smallest model that fits the data is best. Therefore, in terms of analyzing high-dimensional time-to-event data, variable selection techniques are required, if we want to follow the principle of Occam's Razor. A widely used...
Heart transplantation is widely regarded as the gold standard for the treatment of end-stage heart failure. However, shortages of donor hearts necessitate the implementation of waiting lists and allocation algorithms. The German Transplantation Law stipulates the allocation of donor hearts based on the urgency of, and the benefit from, a transplantation. This can be summarized into a single score,...
Despite the availability of vaccines, infectious diseases such as COVID-19, tetanus, diphtheria, and pertussis remain persistent public health threats, particularly among vulnerable populations including pregnant and lactating women. As most research on protection against infectious diseases to date has focused on antibody-mediated responses, understanding how antibodies behave over time...
Early detection of lethal diseases such as lung cancer requires resolving faint signals amid biological heterogeneity. Precision screening aims to sensitively detect meaningful departures from an individual’s baseline by considering individual-level rather than population-level variability. This work investigates whether infrared molecular fingerprinting (IMF) - mid-infrared vibrational...
Preclinical studies often operate under strict ethical, logistical, and financial constraints, resulting in experiments with very small sample sizes. These limitations pose substantial challenges for statistical inference, reproducibility, and the reliability of decision-making in early-phase biomedical research. This talk provides an overview of key design and analysis issues in preclinical...
In regression modeling, relationships between continuous predictors and outcomes are often assumed to be linear, yet allowing for non-linear associations can substantially improve model performance. A variety of methods for flexible regression—such as fractional polynomials and spline-based approaches—have been proposed to model non-linear associations. However, comprehensive and systematic...
Meta-analysis can be formulated as the combination of p-values from multiple studies into a joint p-value function, from which inference for the average effect, including point estimates and confidence intervals, can be derived. We extend Edgington's p-value combination method for random-effects meta-analysis by treating the combined p-value function as a confidence distribution of the average...
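Edgington's method combines p-values via their sum; under the null hypothesis the sum of k independent uniforms follows the Irwin–Hall distribution, whose CDF gives the combined p-value. A minimal sketch (the fixed-effect version, before the random-effects extension described above):

```python
from math import comb, factorial, floor

def edgington(p_values):
    """Edgington's combined p-value: P(sum of k iid U(0,1) <= s), i.e. the
    Irwin-Hall CDF evaluated at the observed sum s of the p-values."""
    k, s = len(p_values), sum(p_values)
    return sum((-1) ** j * comb(k, j) * (s - j) ** k for j in range(floor(s) + 1)) / factorial(k)

print(edgington([0.01, 0.04]))  # ~0.00125, i.e. (0.01 + 0.04)**2 / 2
```

Inverting such a combination over a grid of effect values yields the p-value function, and hence the confidence distribution used in the abstract.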
We consider the following prediction problem using observational data obtained from routine health-care visits. Biomarkers such as blood pressure and cholesterol are repeatedly measured over time, resulting in sparse and irregular longitudinal data for thousands of individuals. In addition, we observe corresponding survival outcomes, such as the time to cardiovascular disease or death, which...
Personalized medicine aims to improve the treatment of complex diseases by tailoring therapies to the individual molecular characteristics of patients. This is possible by using multi-omics data, which combine different molecular modalities from the same individuals. Integrating these modalities allows more comprehensive and powerful modeling. However, their unique characteristics make...
Background: Childhood immunization directly and indirectly influences fourteen of the seventeen Sustainable Development Goals (SDGs). Timely receipt of vaccines protects children from deadly diseases and increases the overall future productivity of the population. With the largest and most heterogeneous population of under-five children, the delay in receiving polio vaccination has not...
Translating preclinical results into effective clinical treatments remains a major challenge in biomedical research. Too often, findings from single-laboratory studies fail to replicate in the preclinical context and, further, to show effectiveness in clinical trials. One promising approach to validating exploratory findings is through confirmatory multi-laboratory preclinical trials—studies that...
Mixed-effects models (MEMs) are widely used in epidemiology to analyze data that are not independent and identically distributed (i.i.d.), such as longitudinal data. However, MEMs rely on parametric assumptions and require predefined interactions among predictors. In contrast, machine learning (ML) methods such as random forests (RF) assume i.i.d. data but are more flexible in capturing nonlinear...
In descriptive studies, where the primary goal is to identify key predictors of a time-to-event outcome, and in predictive research involving numerous candidate predictors, data-driven variable selection methods are often employed to narrow down the pool of variables. This is particularly necessary when domain expertise is limited or when the practical utility of a prediction model is...
Virtual Control Groups (VCGs) represent an approach in which historical control data (HCD) from previous animal studies are used to replace animals in current control groups. The VICT3R project (Developing and implementing VIrtual Control groups To reducE animal use in toxicology Research), funded by the Innovative Health Initiative (IHI), aims to reduce the use of animals in toxicological...
Breast cancer remains one of the most common cancers among women worldwide. Breast cancer screening programmes aim to catch the disease at its early phase, by regularly examining asymptomatic women for signs of cancer. The rationale is straightforward: early detection, before symptoms onset, offers patients broader treatment options and improves the chances of recovery. To evaluate the cancer...
Functional data analysis (FDA) has become increasingly popular in medical biometry and statistics. It is often appropriate to model observations by smooth curves or functions, for example when observations are sampled quite densely over time or space, or in the case of high-dimensional repeated measurements, as FDA methods allow flexible modelling. Furthermore, they do not assume...
Reducing the number of experimental units is one of the three pillars of the 3R principles (Replace, Reduce, Refine) in animal research. At the same time, statistical error rates need to be controlled to enable reliable inferences and decisions. This paper proposes a novel measure to quantify the evidentiary value of one experimental unit for a given study design. The experimental unit...
In this study, the PERMY data set, taken from Pharmaceutical Statistics Using SAS: A Practical Guide, is analyzed. It describes the permeability of cell membranes, which is the ability of a molecule to cross a membrane. Biological structures are complex layers of molecules and proteins. Substances require a particular structure to pass through the target membrane, and drugs that fail to demonstrate...
Assessing micronutrient status is essential in nutritional research (Allen, 2025) and typically involves estimating the population-wide prevalence of micronutrient deficiencies using biomarker data collected across multiple regions. In such studies, several biomarkers are commonly analyzed to estimate the prevalence of any deficiency, defined as the probability that at least one of the...
Large and high-dimensional biomedical datasets (large n and p) such as genotype data containing hundreds of thousands of genetic variants (SNPs) measured across many individuals require scalable algorithms to enable efficient model training. In this work, we address this challenge by leveraging principles of optimal design and informative subsampling.
We investigate the applicability of...
Repeated measures data are commonly encountered in a wide variety of disciplines including business, agriculture and medicine. They entail collection of multiple measurements from the same unit or subject over time, space or both. The fact that observations from the same unit will not be independent poses particular challenges to the statistical procedures used for the analysis of such data....
Quantifying the similarity between two or more datasets is an important task in statistics and machine learning. In meta-learning, it enables the transfer of knowledge across tasks and datasets. In simulation studies, the similarity between the distributions assumed in the simulation and the distributions of the datasets for which the performance of methods is assessed is crucial. Similarly,...
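A simple instance of such a similarity measure is the two-sample Kolmogorov–Smirnov distance between empirical distribution functions (shown here for univariate data only; the abstract's setting is more general):

```python
def ks_distance(xs, ys):
    """Two-sample Kolmogorov-Smirnov distance: the largest absolute gap
    between the two empirical CDFs, evaluated over the pooled sample."""
    def ecdf(data, t):
        return sum(x <= t for x in data) / len(data)
    grid = sorted(set(xs) | set(ys))
    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in grid)

print(ks_distance([1, 2, 3, 4], [3, 4, 5, 6]))  # 0.5
```

A value of 0 indicates identical empirical distributions and 1 complete separation, giving a first, crude yardstick for comparing simulated and real data.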
The servEB project (WISS 2025, federal state of Salzburg, 20102/F2300645-FPR) combines clinical expertise, advanced statistical analyses, and AI-driven image classification technology to improve the assessment of trial outcomes in rare diseases, using Epidermolysis Bullosa research as an example. When defining meaningful endpoints, multiple aspects of the disease have to be considered,...
In field studies, measurements are often collected over extended periods, during which subtle shifts in data quality or instrument performance can occur. Recognizing and quantifying such measurement heterogeneities over time is essential to ensure the validity of study results and to intervene at an early stage if possible. However, the performance of available statistical approaches for...
Background: Heart Rate Asymmetry (HRA) represents a specialized domain of Heart Rate Variability (HRV) analysis, quantifying the unequal contribution of accelerations and decelerations to the overall heart rate variations. While HRA provides unique insight into the nonlinear dynamics of autonomic control, its assessment has traditionally relied on high-resolution Electrocardiography (ECG)....
In the functional response model (FRM), where a functional response is explained by scalar predictors, inference becomes challenging when the design matrix is not full-rank, leading to an ill-conditioned model (ICFRM). Widely used methods for this problem, such as $L^2$-norm-based tests (Zhang, 2013), suffer from critical flaws such as poor control of the type I error rate, which can...
Scientific integrity is the cornerstone of progress in biomedical research. Nowhere is this more critical than in nonclinical settings. Reproducibility – the ability to consistently replicate findings across studies, laboratories and organisations – is not just a technical requirement. It is a fundamental attribute that underpins trust. As nonclinical research continues to expand in...
Over the past two decades, the problem of selecting relevant variables in high-dimensional data analysis has gained particular importance in both statistics and machine learning. Despite substantial advances in modeling techniques and numerous algorithmic proposals, most existing approaches overlook the issue of missing observations — a phenomenon ubiquitous in real-world datasets, especially...
The assessment of crop variety distinctness, uniformity, and stability (DUS) is a fundamental component of plant breeding and registration processes. Traditionally, one-dimensional analysis of variance is conducted separately for each attribute. However, before conducting separate analyses, it would be worthwhile to apply multivariate methods to determine whether a given variety differs from...
Binary endpoints are commonly used to measure clinical outcomes in randomized controlled trials. In this context, conditional odds ratios (ORs) based on logistic regression have been routinely used as population-level summary to quantify treatment effects. However, ORs have been criticized for a lack of interpretability, non-collapsibility, and sensitivity to model specification. In response,...
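The non-collapsibility mentioned above can be shown with a small numeric example: two equally sized strata share the same conditional OR, yet the marginal OR computed from the pooled risks is smaller (all risks below are hypothetical):

```python
def odds_ratio(p_treat, p_ctrl):
    """Odds ratio comparing treated vs. control event risks."""
    return (p_treat / (1 - p_treat)) / (p_ctrl / (1 - p_ctrl))

# (risk under treatment, risk under control) in two equally sized strata:
strata = [(0.9, 0.5), (0.5, 0.1)]
print([round(odds_ratio(pt, pc), 2) for pt, pc in strata])  # [9.0, 9.0]

# Marginal risks in the pooled population (equal stratum weights):
marginal_treat = (0.9 + 0.5) / 2
marginal_ctrl = (0.5 + 0.1) / 2
print(round(odds_ratio(marginal_treat, marginal_ctrl), 2))  # 5.44, not 9.0
```

The marginal OR differs from the common conditional OR even without confounding, which is exactly the interpretability criticism raised in the abstract.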
The principles of Replacement, Reduction, and Refinement (3Rs) have become fundamental to modern biomedical research. In this context, Virtual Control Groups (VCGs) offer a promising strategy to reduce the number of animals used in toxicological and pharmacological studies. Rather than including concurrent control groups (CCGs) in every experiment, VCGs rely on historical control data...
Longitudinal observational studies and clinical trials routinely collect extensive phenotypic data under changing organisational, technical, and environmental conditions. Variations in examiners, devices, protocols, or ambient factors can introduce consequential forms of measurement heterogeneity and measurement error over time. Although these sources of bias are well recognised, systematic...
In modern data analysis, technological advancements frequently result in the collection of Functional Data (FD), where observations are naturally represented as smooth functions, curves, or surfaces over a continuum (e.g., time or space). Examples include daily stock prices, continuous temperature recordings, or spectroscopic measurements. Functional Data Analysis (FDA) offers a powerful...
Low rates of replicability in early phase biomedical research hinder progress and putatively cause high attrition rates in clinical trials. To improve evidence generation processes, preclinical confirmatory studies and preregistration offer potentially effective strategies. By comparing conduct and outcome of preclinical studies utilizing such strategies, we examined how different degrees of...
Background:
Multiple endpoints are a major topic of discussion in rare disease research, particularly with regard to patient-centered outcome measures, as they allow for a more comprehensive assessment of treatment effects. However, a critical challenge in these trials is allocation bias, as they are often unblinded or single-blinded. Allocation bias arises when future treatment allocations can...
In pharmaceutical research and preclinical development, data below the lower limit of quantitation are quite common, although sometimes not properly dealt with. Beyond time-to-event settings, measured data above a general or even subject-specific upper limit of quantitation are less common.
Malignant tumor cells can metastasize. When tumor cells metastasize, they might cause new tumors called...
Background: Tattoos and permanent make-up (PMU) gain increasing popularity, yet their potential systemic health implications remain poorly understood.
Methods: To investigate associations between tattoos/PMU and chronic disease outcomes, we analyzed data from the LIFE-Adult Study, a population-based cohort of 10,000 adults recruited in Leipzig, Germany (2011–2014). A dedicated...
The aim of this work is to identify important characteristics of mixtures of experts that improve the combined classifier's performance over the average performance of the base learners. The problem was examined for various high-dimensional genomic data sets. Mixtures of experts are useful when responses differentiate among the base classifiers. From this point of view...
In clinical development it is essential to identify subgroups of patients who exhibit a beneficial treatment effect, ideally before moving to confirmatory trials. Such subgroups are often defined by predictive biomarkers with corresponding cut-off values. However, data-driven selection of biomarkers or cut-offs introduces selection bias, i.e. the treatment effect within the selected subgroup...
Animal experiments are often purely exploratory, with little to no data available to support the planning phase. Nonetheless, ethical guidelines demand scientifically sound biometric planning. The experimental designs are typically complex, involving numerous experimental groups and adaptive steps, which complicates statistical planning.
In recent years, statistical aspects of such...
High-quality data are essential for reliable epidemic surveillance. Traditional systems rely on passive case reporting, which may lead to unreliable prevalence estimates depending on the specific disease. Using the example of the COVID-19 pandemic, we show that once prevalence exceeds moderate levels, conventional reporting becomes biased and unstable. Beyond this point, drawing additional...
In many clinical trial analyses, missing data is addressed through multiple imputation (MI) to avoid loss of information and potential bias. However, this approach is not taken into consideration at the planning stage when calculating the sample size. Here, it is common practice to inflate the calculated sample size by an estimated dropout rate in order to maintain the desired power. This...
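The inflation step described above typically divides the required sample size by the expected retention proportion and rounds up. A minimal sketch of this common practice (function name and numbers are illustrative):

```python
import math

def inflate_for_dropout(n_required, dropout_rate):
    # Common practice: divide the calculated sample size by the expected
    # retention proportion (1 - dropout rate), then round up to a whole
    # participant so the evaluable sample size is maintained.
    return math.ceil(n_required / (1 - dropout_rate))

# Illustrative numbers: 128 evaluable participants needed, 15% expected dropout
print(inflate_for_dropout(128, 0.15))  # -> 151
```

Note that this heuristic only preserves power if dropouts are non-informative; as the abstract suggests, it does not account for the multiple-imputation analysis itself.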
Background:
A random forest (RF) is an efficient method for prediction, but it is difficult to interpret.
Artificial Representative Trees (ARTs) are a special type of surrogate model that approximates the original structure of the RF in a single tree, achieving similar predictive accuracy.
Conformal Predictive Systems (CPS) provide a framework for uncertainty
quantification by generating...
Even in rare diseases, where the sample size is limited and blinding is less frequently implemented, randomized controlled trials are considered the gold standard to prove efficacy. Randomization is used to mitigate bias, and regulatory guidance recommends the investigation of the impact of bias on the test decision. We quantified how allocation bias affects the test decision in small sample...
Non-experimental data, such as electronic medical records, are often used in causal inference to estimate the effect of an exposure on an outcome of interest. However, this type of data can be affected by potential sources of bias in causal analyses. For example, these data do not come from a study design that ensures a balance of patient characteristics between exposure groups, a problem...
In preclinical animal studies, researchers often have a certain degree of freedom when it comes to selecting the exact statistical analysis strategy for their experiment. Ideally, this analysis strategy should be specified prior to the experiment (and preregistered, if possible), with sample size planning conducted in accordance with the chosen analytical approach. Sample size calculations...
Device-based assessment of physical behaviour (PB) is essential in health and behavioural research, to evaluate the impact of interventions, and to examine diverse health outcomes. Accelerometers are widely used but converting tri-axial signals to PBs remains challenging. Machine learning (ML) is a promising approach for classifying PBs from accelerometer data. However, the performance of ML...
Sharing of original study data may be restricted by data protection policies. Instead, synthetic data that mimic the original data structure may be shared between research groups. This work introduces modgo 2.0, which can be used for generating synthetic data from existing study data. Simulations may be based either on the combination of the rank inverse normal transformation with simulation...
In rare diseases, the need for innovative clinical trial designs is increasing. Platform trials are becoming particularly popular, as they allow for flexible adding and dropping of arms and reduce sample size requirements by using a shared control. In a platform trial setting with two experimental arms and one control, we use clinical trial simulations to quantify the impact on operating...
Meta-analyses synthesise the results of multiple independent studies to obtain more comprehensive knowledge about a research topic. When study outcomes vary, meta-regression can be used to identify potential sources of heterogeneity across studies. One complication is the typically small number of studies available. Due to this, interaction terms are often omitted in meta-regression models,...
For many medical research questions, randomization is unethical or infeasible, and decisions have to be informed by results based on observational, often routinely collected, data. Such data have enormous potential to inform stakeholders, including health policy makers, health professionals and the general public, about the impact of their decisions on public as well as individual health....
The FDA initiated Project Optimus and issued guidance for dose optimization, recommending randomized parallel dose-response cohorts to generate additional data at promising dose levels and implying that different dosages may be needed for different indications. In addition to dose optimization, with recent advancements in precision medicine and cancer biology, the development of cancer...
Regeneration of forest ecosystems is crucial for preserving their structure, function and long-term stability. This research analyses the correlation and similarity of the occurrence of 11 species characteristic only of the regeneration phase with respect to some soil properties. The aim of the research is to group the species according to soil properties and to determine differences in...
Sport-related concussions (SRCs) represent a major public health concern, accounting for more than 200,000 annual Emergency Department visits in the United States. Biomechanically, SRCs arise from head impacts that generate high-magnitude linear and rotational accelerations. Increasing evidence from human studies indicates that repetitive head impact exposure (HIE) reduces concussion tolerance...
Artificial intelligence (AI) is intended to support clinicians, therapists, patients, hospital managers, and clinical data scientists at all levels. This includes, for example, making clinical diagnoses, understanding the causes of diseases, and planning clinical studies. The enormous increase in the importance of AI in medicine has led to the development of several guidelines (e.g.,...
In epidemiology, dose-response meta-analysis often refers to fitting a meta-regression model that describes a linear trend in the outcome ("response") as a function of the exposure ("dose"), based on aggregated data from a number of studies.
Fixed- and random-effects extensions for handling nonlinear dose-response for odds ratios, relative risks and differences in means through the use of...
Reference intervals and standard deviation scores (‘z scores’) are widely used as diagnostic tools in various biomedical fields. They are applied to laboratory parameters in clinical chemistry, psychometric tests in neurology, or parameters of children’s growth in pediatrics. Usually, samples from a ‘normal’ or ‘healthy’ population form the data basis for the estimation of reference...
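For a parameter whose reference distribution is (approximately) normal, the standard deviation score mentioned above is simply the observation standardized against the reference-population mean and SD. A minimal illustration (all numbers hypothetical):

```python
def z_score(x, ref_mean, ref_sd):
    # Standard deviation score of an observation relative to a 'normal'
    # or 'healthy' reference population with given mean and SD
    return (x - ref_mean) / ref_sd

# Hypothetical example: a measured value of 110 against an assumed
# reference mean of 104 and reference SD of 4
print(z_score(110.0, 104.0, 4.0))  # -> 1.5
```

In practice, estimation is more involved (e.g., age-dependent reference curves and non-normal distributions), which is precisely what motivates the methods discussed in this abstract.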
Basket trials examine the efficacy of a single intervention simultaneously in several patient subgroups. They are currently mostly applied in oncology, where the subgroup assignment is based on medical characteristics such as a common biomarker. This can result in small sample sizes within subgroups that are also likely to differ. Several designs for the analysis of basket trials have been...
Classification plays a pivotal role in medicine for both diagnostic and prognostic purposes. Traditionally, diagnostic efficacy is evaluated using prevalence-independent metrics, such as sensitivity and specificity. For numerical tests, the Area Under the Receiver Operating Characteristic (ROC) curve is the standard for assessing classification success. However, the rising adoption of machine...
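The ROC AUC referenced here has a useful probabilistic reading: it equals the probability that a randomly chosen diseased case scores higher than a randomly chosen healthy one (the Mann-Whitney interpretation). A small NumPy sketch with made-up scores:

```python
import numpy as np

def auc(pos_scores, neg_scores):
    # Mann-Whitney form of the AUC: the fraction of (diseased, healthy)
    # score pairs ranked correctly, counting ties as one half.
    diff = np.asarray(pos_scores)[:, None] - np.asarray(neg_scores)[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

# Made-up test scores for three diseased and three healthy subjects:
# 8 of the 9 pairs are ranked correctly, giving AUC = 8/9
print(auc([3.1, 2.8, 4.0], [1.5, 2.9, 2.0]))
```

Note that, like sensitivity and specificity, this quantity is prevalence-independent, which is the property the abstract contrasts with machine-learning practice.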
Modern large language models (LLMs) have reshaped workflows of people across countless fields - and biostatistics is no exception. These models offer novel support in drafting study plans, generating software code, or writing reports. However, reliance on LLMs carries the risk of inaccuracies due to potential hallucinations that may produce fabricated "facts", leading to erroneous statistical...
Updating a meta-analysis (MA) by including additional studies is usually a straightforward exercise, as the relevant data are commonly reported in detail, i.e., effect estimates with standard errors for all studies. Matters are complicated, however, when only the summary of a previous analysis is available, i.e., the overall estimate with standard error. For instance, this is sometimes the...
In Germany, cultivars are tested for regional recommendations in federal state cultivar trials, taking the form of multi-environment trials (METs). Their primary objective is to identify cultivars that are best suited for regional production in agro-ecological zones. For perennial ryegrass, current selection decisions are predominantly based on yield. Incorporating additional quality...
Genome-wide association studies (GWAS) for biomarkers and molecular phenotypes can lead to clinically relevant discoveries. Numerous lines of evidence from both model organisms and human studies suggest that genetic associations can be highly heterogeneous, dynamic and context dependent. Despite twenty years of GWAS, most studies are based on statistical models that fail to account for such...
Environmental covariates (ECs) have become increasingly abundant and accessible over the past two decades, driven by advancements in remote sensing, data acquisition technologies, and the declining costs of environmental monitoring. Incorporating ECs into multi-environment trials (METs) has several applications, including improving the understanding of genotype-by-environment interactions,...
The accuracy of diagnostic tests is commonly evaluated by estimating the area under the receiver operating characteristic curve (AUC), as well as sensitivity and specificity at given diagnostic cut-offs. However, many diagnostic trials use factorial designs. For example, different combinations of readers and methods may be used to diagnose a patient. Furthermore, diagnostic studies may...
Modern therapeutic agents in cancer therapy often target specific genetic traits of the tumor. Whenever these traits are independent of the tissue in which the tumor is located, the therapeutic agent may be tissue-agnostic, meaning that it can be applied regardless of location. Clinical trials for such tissue-agnostic therapies often have small sample sizes. Hence, it is efficient to recruit...
Meta-analyses often involve transforming bounded effect size measures, such as correlation coefficients or odds ratios, onto a real-valued scale prior to estimation. The results are then back-transformed to the original scale for interpretation purposes. However, in the standard random effects model for meta-analysis, simply applying the inverse transformation function generally does not yield...
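The point about naive back-transformation can be illustrated numerically for Fisher-z transformed correlations: under between-study heterogeneity, applying tanh to the mean on the z-scale does not recover the mean of the back-transformed effects (Jensen's inequality). A sketch with arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_z, tau = 0.8, 0.4  # illustrative mean and heterogeneity SD on the Fisher-z scale

z = rng.normal(mu_z, tau, 100_000)  # draws of study-level true effects
mean_r = np.tanh(z).mean()          # mean effect on the correlation scale
naive = np.tanh(mu_z)               # naive inverse transform of the z-scale mean

# tanh is concave for positive arguments, so the naive value overstates the mean
print(round(naive, 3), round(mean_r, 3))
```

With these values the naive back-transform exceeds the true mean correlation by roughly 0.05, which is the kind of discrepancy the abstract's methods are designed to correct.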
Artificial intelligence (AI) is increasingly being used in various disciplines. Examples include medical image processing and complex prediction and decision support models, areas that increasingly intersect with the field of biostatistics. In this context, integrating AI and machine learning (ML) methods within biostatistics courses taught to students of medicine, health and life sciences offers...
Protein degradation is a regulated process that reshapes the proteome and generates bioactive peptides. Peptidomics and degradomics enable large-scale measurement of these peptides, yet most data analysis approaches treat peptides as isolated endpoints rather than intermediates produced by sequential cleavage. Here, we introduce degradation graphs, a probabilistic framework that represents...
Background:
A substantial proportion of clinical research waste originates from fundamental methodological flaws—improper study design, insufficient power, inappropriate statistical methods, and non-compliance with reporting guidelines. While many AI tools attempt to support data analysis, none address the critical upstream phase: validating methodology before data collection. To address this...
In many biomedical research settings, sufficiently large sample sizes can only be achieved by combining data from multiple collection sites (e.g., hospitals). However, pooling individual participant data in a central server is often restricted due to privacy and regulatory constraints. Federated inference addresses this challenge by distributing the statistical analysis across local sites,...
Background
Triple-negative breast cancer (TNBC) represents one of the most aggressive and treatment-resistant breast cancer subtypes. Patients with locally advanced unresectable or metastatic TNBC (mTNBC) typically face a median overall survival of only 8 to 13 months, highlighting the urgent need for efficient drug evaluation strategies. Conventional statistical methods often assume...
New crop varieties are extensively tested in multi-environment trials in order to obtain a solid basis for recommendations to farmers. When the target population of environments is large, a division into sub-regions is often advantageous. If the same set of genotypes is tested in each of the sub-regions, a linear mixed model (LMM) may be fitted with random genotype-within-sub-region effects....
Bottom-up mass spectrometry-based proteomics studies changes in protein abundance and structure across various biological conditions. Since the currency of these experiments is peptides, i.e., subsets of protein sequences that carry the quantitative information, conclusions at a different level, e.g., at the level of proteins or of post-translational modifications, must be computationally...
The promise of precision agriculture contrasts with the challenge posed by the high cost of commercial technologies, particularly for small-scale producers. For this reason, it is necessary to develop low-cost, accessible solutions that can be applied directly in the productive environment. Within this context, this work presents the development of a low-cost system for acquiring 3D images of beef...
Meta-analysis of diagnostic test accuracy (DTA) studies deals with aggregating information from multiple studies on sensitivity and specificity. Classical approaches to this task select a single pair of sensitivity and specificity per study (single threshold methods, STM), possibly ignoring additional information if studies report results on multiple diagnostic thresholds. In recent years,...
Title:
Evaluating Nonparametric Combination Methods for Aggregating N-of-1 Trials: A Simulation-Based Comparison with Meta-Analysis
Abstract:
Aggregating results from multiple N-of-1 trials has become increasingly relevant for evaluating personalized and digital health interventions, where inter-individual heterogeneity and complex temporal structures challenge traditional study designs....
Confidence distributions are a frequentist alternative to Bayesian posterior distributions. They summarize the knowledge and uncertainty about an unknown model parameter in the form of a probability distribution on the parameter space, just like a posterior distribution, without assuming that the parameter of interest is a random variable. Although confidence distributions are a relatively old...
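As a concrete illustration of the idea, under a normal approximation a confidence distribution for a scalar parameter is simply the normal CDF centered at the point estimate; evaluating it at the endpoints of the usual 95% interval returns 0.025 and 0.975. The estimate and standard error below are arbitrary:

```python
import math

def normal_cd(theta, estimate, se):
    # Normal-approximation confidence distribution:
    # H(theta) = Phi((theta - estimate) / se), a distribution function
    # on the parameter space without treating theta as random.
    return 0.5 * (1 + math.erf((theta - estimate) / (se * math.sqrt(2))))

est, se = 1.2, 0.4  # arbitrary illustrative estimate and standard error
lo, hi = est - 1.96 * se, est + 1.96 * se
print(round(normal_cd(lo, est, se), 3), round(normal_cd(hi, est, se), 3))  # -> 0.025 0.975
```

Any confidence interval at any level can be read off this single function, which is what makes confidence distributions a compact frequentist summary.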
In healthcare provider profiling, accurately assessing hospital performance is crucial for informed decision-making and quality improvement. Traditional approaches rely heavily on parametric regression models for risk adjustment, but these methods often fail to account for between-center heterogeneity and may produce biased estimates, especially in the presence of low event rates or small...
Randomized trials often utilize a select group of study participants. This group does not typically represent the general population. Furthermore, sample sizes are often small to reduce cost. To improve power and generalizability, external control groups may be added to the randomized study. It is possible to incorporate a suitably selected external control group into a randomized clinical...
Functional data analysis has established itself as a powerful framework for analyzing data recorded over continuous domains such as time. Within this context, functional motif discovery refers to the identification of recurrent patterns that appear multiple times across different portions of a single curve and/or within misaligned portions of multiple curves. In this study, we explore the...
Prediction in the presence of missing values is a complex and still poorly understood problem, particularly when future records also contain missing values.
Mertens et al. (2020) demonstrate that with non-linear models (such as logistic regression or Cox survival models) and when using imputations, averaging of multiple predictions obtained from distinct models fitted on imputed data should be...
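The distinction at issue can be sketched with a toy logistic model: because the link function is non-linear, averaging the m per-imputation predictions differs from predicting once with averaged coefficients. The coefficients below are invented purely for illustration:

```python
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

# Invented coefficients (intercept, slope) from logistic models fitted
# on m = 3 imputed versions of the same data set
betas = np.array([[0.5, 2.0], [0.4, 1.5], [0.6, 2.5]])
x_new = 1.0  # covariate value of a new observation

# (a) average the m predictions (prediction-averaging)
pred_avg = sigmoid(betas[:, 0] + betas[:, 1] * x_new).mean()

# (b) predict once from averaged coefficients (parameter-averaging)
b = betas.mean(axis=0)
pred_pooled = sigmoid(b[0] + b[1] * x_new)

# The two differ because the sigmoid is non-linear
print(round(pred_avg, 4), round(pred_pooled, 4))
```

With a linear model the two routes coincide; the gap only opens up for non-linear links, which is why the choice matters for logistic and Cox models.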
In oncology drug development, phase II dose-finding studies are essential to identify the most promising dose levels for confirmatory phase III trials. Traditionally, dose selection is based on the maximum tolerated dose, which does not necessarily correspond to the optimal dose in terms of efficacy and safety. To address the challenge of dose optimization, the Oncology Center of Excellence of...
Growing concerns about the reproducibility, generalisability and (more recently) credibility of biomedical research publications underscore the need for methods that both synthesise evidence and diagnose weaknesses in the research ecosystem. Systematic review and meta-analysis of animal studies is traditionally used to evaluate preclinical efficacy and inform future research in animals or...
Experimental designs with orthogonal block structures are commonly used in many areas of science in order to control the external sources of variability. The aim of this study is to compare several analysis of variance (ANOVA) methods applicable to such structures. Comparing these approaches is of practical importance, as the choice of the analytical method may influence inference about...
Randomised controlled trials (RCTs) are the gold standard of evidence to support causal conclusions on the benefits and risks of medicines in regulatory decision making along the lifecycle [1]. However, single-arm trials (SATs) are also frequently used for various reasons during drug development. While RCTs allow adjustment for confounding via design, the contextualization of SATs requires...
In a prospective study of patients with muco-obstructive lung disease, which aims to develop a cough alert system based on nocturnal cough monitoring and to identify patient-individual thresholds at which cough frequency exceeds normal variability, 92 of the intended 220 patients have been included so far.
From a study by den Brinker et al. with 30 COPD patients it is known that the day-to-day variation of...
Although not without controversy, readmission is entrenched as a hospital quality metric with statistical analyses generally based on fitting a logistic-Normal generalized linear mixed model. Such analyses, however, ignore death as a competing risk, although doing so for clinical conditions with
high mortality can have profound effects; a hospital’s seemingly good performance for readmission...
Quantitative analysis of microbial growth curves is essential for understanding how bacterial populations respond to environmental cues. Traditional analysis approaches make parametric assumptions about the functional form of these curves, limiting their usefulness for studying conditions that distort standard growth curves. In addition, modern robotics platforms enable the...
Preclinical studies tend to suffer from an unacceptably low rate of replicability, which is highly problematic since unreliable results from animal trials cannot provide a sound foundation for subsequent clinical research. Numerous factors contribute to this issue, including the misapplication of statistical methods, poor study design, and inadequate reporting of results. Although specific...
Bayesian dynamic borrowing (BDB) methods are popular for incorporating historical data in rare disease or paediatric clinical trials, in particular with regard to control groups. They can be used to leverage the historical information while mitigating the consequences of potential prior-data conflicts to some degree. However, these methods do not consider baseline covariate information that...
Population-scale genomic biobanks provide unique opportunities for data-driven drug target discovery. However, these resources often lack detailed data on clinical phenotypes, whereas clinical trials offer rich phenotypic information but are limited in omics coverage and mostly lack genotyping. This imbalance creates gaps in the mechanistic interpretation of clinical findings.
To address...
Row-column designs play an important role in applications where two orthogonal sources of error need to be controlled for by blocking. Field or greenhouse experiments, in which experimental units are arranged in a rectangular array, are a prominent example. In plant breeding, the amount of seed available for the treatments to be tested may be so limited that only one...
Multiple imputation (MI) continues to be a popular approach to deal with missing-at-random covariate data. For MI to perform well, it is advisable to ensure that the imputation model for a given covariate does not make assumptions that conflict with the substantive/analysis model. In the case of substantive models that assume proportional hazards (e.g., the standard Cox model for a single...
Quality assessment in healthcare frequently relies on quality indicators based on follow-up data tracking patient outcomes after treatment. However, conventional cohort-based indicators require complete follow-up, which can result in substantial lag between data collection and analysis. To enable more timely yearly assessment, we propose a period-based approach, in which all data collected...
The reproducibility of research results is a cornerstone of trustworthy science. However, failures to reproduce published findings remain widespread across many disciplines. The way in which data analysis and statistics is taught to students often translates into how they later perform research in labs and clinics. Therefore, improving the reproducibility of biomedical research requires not...
Fisher (1925) introduced the three principles of experimental design: (i) true replicates, (ii) randomization, and (iii) blocking. The former two are strictly required while blocking often increases precision. That is what we tell our agricultural students. However, in practice, randomization is often ignored, either in the first replicate (van Santen and West, 2012) or completely. Often, the...
The estimation of a precision matrix is a crucial problem in various research fields, particularly when working with high dimensional data. In such settings, the most common approach is to use the penalized maximum likelihood. The literature typically employs Lasso, Ridge and Elastic-Net norms, which effectively shrink the entries of the estimated precision matrix. Although these shrinkage...
Handling of missing data is a crucial aspect when preparing data sets for further analyses in several research areas. Previous studies have shown that the choice of imputation method can have a high influence on subsequent analyses, especially in medical research, where missing values often occur due to study design or data collection challenges.
In this study, we conduct a comparative...
Drug development in the era of precision medicine increasingly uses basket trials and other multi-subgroup designs, where targeted therapies are evaluated across biomarker-defined patient subtrials. For many targeted agents and immunotherapies, the objective in early development is no longer the maximum tolerated dose (MTD), but the optimal biological dose (OBD) that achieves the best...
Improved predictions of quality of care indicators in the tail of its distribution
Els Goetghebeur, Ghent University
Standard mixed models have been popular for evaluating performance across care centers in terms of indicators that summarize residents’ outcomes. Their results tend to lack power, however, for the detection of poor performance [1]. This stems from regression to the...
Wastewater-based epidemiology (WBE) offers a promising approach to assess population health by analysing health-related markers in sewage. Interpreting such data at fine spatial scales requires accurate numbers of the contributing population. However, allocating population information to sewersheds is complicated by the lack of spatial congruence between administrative boundaries and...
Background: There is increasing interest in making use of patient-reported outcome measures for provider comparisons. However, guidance on the choice of outcomes and selection of variables for case-mix adjustment for specific patient groups is lacking.
Material: In the ACRF-pred study, 973 patients from 19 different clinics were followed after arthroscopic rotator cuff repair for 24...
Missing covariate data is a significant source of bias in observational studies that use propensity score (PS) analysis to make causal inference. The accuracy of treatment effect estimation is determined not just by how missing data is handled, but also by the method used to calculate propensity scores. A variety of methods for handling missing covariate data in propensity score analyses have...
For a given research question and observational dataset, there are often numerous ways to specify the data analysis pipeline that leads from raw data to the result of interest. Data analysts must make a series of choices concerning data preprocessing, variable definitions, and statistical model specifications. For example, analysis pipelines may differ in their inclusion or exclusion criteria...
Assessment of treatment effect heterogeneity is a challenging problem in biostatistics, particularly in clinical trials: Estimation of treatment effects within subgroups in an exploratory setting is often unreliable due to limited sample sizes and multiplicity issues. Through the past decades, many efforts have been made to address this problem. Among them, Muysers et al. (2020) considered...
Mortality risk modeling and forecasting is one of the key tasks of social security institutions and insurance companies. Traditionally used stochastic mortality models, such as the Lee-Carter model, require fulfillment of formal assumptions that cannot always be met in real-life scenarios (e.g. time independence of age-specific improvement rates). Alternative approaches are based on deep...
A range of regularization approaches have been proposed in the literature to overcome overfitting, to exploit sparsity or to improve prediction. Using a broad definition of regularization, namely controlling model complexity by adding information in order to solve ill-posed problems or to prevent overfitting, we review a range of approaches within this framework including penalization, early...
A question that arises in the analysis of adverse events is how to account for patients who withdraw their consent or switch treatment. One approach is to consider consent withdrawal and treatment switch as competing events. Alternatively, patients who withdraw from the study or switch treatment could be censored, but this implies that one assumes censoring due to treatment switch or consent...
Stroke can lead to a wide range of symptoms including acute motor impairment, post-stroke depression or cognitive impairment. Anticipating these outcomes early on and understanding their underlying causes would enable clinicians to initiate appropriate targeted treatments, which could not only help reduce the severity of symptoms but also improve long-term recovery. Convolutional neural...
Douala General Hospital, a first-class healthcare facility in Cameroon, serves thousands of patients yearly through its multidisciplinary medical teams. The hospital hosts numerous patient records that hold significant potential for public health research. However, most records remain paper-based, limiting their accessibility and reuse. In departments such as pulmonology, patient data are...
In statistical analyses of binary outcomes for medical procedures performed by multiple health care providers, provider-specific effects are commonly handled using conditional models with random effects or using marginal models with generalized estimating equations (GEEs). While convenient, these models treat provider effects primarily as nuisance parameters, even though they may themselves be...
As medicine enters an era of precision, the challenge for statistics is no longer whether personalized care is possible, but how best to translate its potential into clinical practice. Zhao et al. (2012) formulated the personalized medicine problem as finding the optimal individual treatment rule (ITR) by maximizing the expected clinical responses. More recently, Lei and Candès (2021)...
This presentation explores the statistical challenges and comparative performance of various deep learning models for the automated detection and classification of neurological diseases from Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) scans. Building upon initial findings that demonstrated the potential of Convolutional Neural Networks (CNNs) in recognizing rare brain...
Simulation studies are widely used to evaluate statistical methods. However, new methods are often introduced and evaluated using data-generating mechanisms (DGMs) devised by the same authors. This coupling creates misaligned incentives, e.g., the need to demonstrate the superiority of new methods, potentially compromising the neutrality of simulation studies. Furthermore, results of...
The IQTIG measures, compares and evaluates hospital quality using quality indicators. These usually consist of a population and a binary outcome of interest, such as whether complications have occurred after elective knee replacement. For a fair assessment and comparison of hospital quality, we need to adjust for the hospital’s case mix, i.e., for the patient-specific risk factors such as age...
Continuous monitoring (CM) at AstraZeneca is the systematic review and evaluation of accumulating study data to inform timely decisions. Rather than waiting for formal interims or study completion, our CM approach in early-phase oncology studies enables earlier data-driven decisions to stop for futility or safety, minimising exposure to ineffective or unsafe treatments, and to accelerate...
Random forest is a widely used machine learning method across the life sciences due to its high predictive performance, minimal assumptions, and flexibility in handling diverse data types. However, a critical yet often overlooked property of random forest is its inherent non-determinism: repeated runs on the same data set can produce different prediction models. This variability can compromise...
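The non-determinism described here is easy to demonstrate, and to control via a seed, with scikit-learn; the data set and settings below are arbitrary:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=10, random_state=1)

# Same data, different seeds: the fitted forests generally differ
rf_a = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
rf_b = RandomForestClassifier(n_estimators=10, random_state=99).fit(X, y)
# Same data, same seed: the run is exactly reproducible
rf_c = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

p_a = rf_a.predict_proba(X)[:, 1]
p_b = rf_b.predict_proba(X)[:, 1]
p_c = rf_c.predict_proba(X)[:, 1]
print((p_a != p_b).any(), (p_a == p_c).all())  # differing seeds vs. fixed seed
```

Fixing the seed restores determinism for a single software configuration, but, as the abstract implies, it does not tell us how sensitive downstream conclusions are to this source of variability.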
Subgroup analyses are frequently reported results from randomized trials. They help to identify heterogeneity in the average treatment effect, which occurs when this average effect varies across different categories of a subgroup factor, like age, sex, or disease severity. If treatment effects are different across subgroups, this information can help to personalize treatment decisions....
Continuous monitoring is becoming more popular due to its significant benefits, including reducing sample sizes and reaching earlier conclusions. In general, it involves monitoring nuisance parameters (e.g., the variance of outcomes) until a specific condition is satisfied. The blinded method, which does not require revealing group assignments, was recommended because it maintains the...
Statistical prediction models for binary outcomes are becoming increasingly popular. One significant challenge is calibrating these models to suit the characteristics of a target population that is structurally different from the original population. Calibration is especially challenging when there is no training data available from the target population. To address this problem, we...
Background:
For many years, health research has faced substantial criticism regarding its quality. Appropriate reporting guidelines are available with the EQUATOR (Enhancing the QUAlity and Transparency Of health Research) network acting as an umbrella organization to address reporting issues in health sciences. Nevertheless, many reviews have shown that reporting quality remains poor, which...