Patent Pending

U.S. Provisional Application No. 63/974,099

Systems and Methods for Uncertainty-Calibrated Missing Modality Imputation, Identifiable Separated-State Tumor Dynamics, and Multi-Layer Uncertainty Quantification

Computer-implemented systems for probabilistic missing modality handling via teacher-student self-distillation, separated-state clonal dynamics with replicator equations, three-layer uncertainty quantification, and cancer-type conditional parameter generation.

33 Claims

CROSS-REFERENCES

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is related to U.S. Provisional Application No. 63/967,576, filed January 25, 2026, entitled “SYSTEM AND METHOD FOR PHYSICS-CONSTRAINED SIM-TO-REAL TRANSFER LEARNING IN COMPUTATIONAL ONCOLOGY,” for common subject matter relating to cross-species tumor trajectory simulation. The present application provides additional written description directed to probabilistic missing modality imputation with uncertainty calibration, separated-state clonal tumor dynamics with burden-fraction decomposition, multi-layer uncertainty quantification for longitudinal prediction, cancer-type conditional parameter generation, adaptive multi-modal fusion, and related systems and methods.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

Not applicable.

I. Field of the Invention

The present invention relates to computational oncology, machine learning, multi-modal representation learning, and quantitative systems pharmacology (QSP). More specifically, the invention relates to computer-implemented systems and methods for: (i) uncertainty-calibrated handling of missing modalities in multi-omics tumor modeling via probabilistic teacher-student self-distillation; (ii) structurally identifiable formulations of clonal tumor dynamics using separated-state burden and fraction representations with replicator dynamics; (iii) multi-layer uncertainty quantification for sparse longitudinal prediction including population-prior blending, parametric time-dependent trust reversion, and oscillation damping; and (iv) cancer-type conditional parameter generation and adaptive multi-modal fusion for tumor dynamical models.

II. Background of the Invention

1. Missing Modalities in Multi-Omics and Uncertainty Calibration

Clinical and translational oncology datasets frequently contain missing modalities, including DNA methylation and copy number variation (CNV), due to assay availability, cost, tissue quality, or retrospective data collection. Conventional imputation approaches often replace missing modalities with zeros, means, or learned reconstructions that minimize prediction error without calibration of uncertainty. Such approaches may yield overconfident downstream predictions when key modalities are absent, which is undesirable for clinical decision support and can produce implausible dynamical parameterizations when predictions drive numerical simulation of tumor trajectories.

Standard imputation methods often do not distinguish between observed and imputed values in their uncertainty estimates, potentially resulting in similar predictive variance regardless of data completeness. This may fail to encode the epistemic uncertainty that typically arises from information deficit, potentially resulting in overconfident predictions for clinical decisions when modalities are missing.

2. Identifiability and Constraint Challenges in Clonal Dynamics

Ordinary differential equation (ODE) formulations of clonal tumor dynamics that track absolute clone populations N_k(t) directly may exhibit degeneracy wherein initial total burden and growth parameters are confounded. Multiple parameter sets may produce identical trajectory shapes scaled by initial conditions. Furthermore, maintaining clonal fractions within valid ranges and enforcing unity constraints may require external projection or penalty methods during numerical integration, which can complicate gradient-based optimization and introduce numerical stiffness.

3. Sparse Longitudinal Data and Uncertainty Quantification

Longitudinal tumor measurements in clinical practice are often irregular, sparse, and temporally distant from prediction horizons. Many predictive models output point estimates without reflecting increasing uncertainty during temporal extrapolation away from observed data points. Some systems output uncertainty estimates that are not operationally tied to proximity to observations or population priors, potentially producing insufficiently conservative predictions when the model lacks evidentiary support near the prediction time.

Clinical decision-making may benefit from uncertainty estimates that: (a) reflect parameter uncertainty given limited data, (b) revert toward population baselines when extrapolating far from observations, and (c) mitigate spurious oscillations that can arise from numerical integration or model characteristics.

4. Cancer-Type Heterogeneity and Model Efficiency

Tumor growth kinetics, drug response patterns, and immune interactions vary by cancer type. Training separate dynamical models per cancer type may increase operational complexity and reduce statistical power for rare subtypes. Some generic models do not adapt to cancer-type specificity, while separate models per type may not leverage shared biological mechanisms across types.

5. Multi-Modal Fusion and Biological Extensions

Multi-modal oncology systems combining molecular features with histopathological features may benefit from adaptive fusion strategies. Biologically, tumor dynamics are modulated by immune responses with distinct timescales and therapeutic interventions including checkpoint blockade. Stochastic extensions accounting for epigenetic heterogeneity may better represent unpredictable tumor evolution.

III. Summary of the Invention

The invention provides computer-implemented systems and methods addressing the above through integrated technical contributions:

A. Probabilistic Encoder Self-Distillation (PESD) for Missing Modality Imputation

Missing modalities are replaced with learned per-modality embeddings. A Student encoder processes incomplete input while a Teacher encoder processes more-complete input. The Student is trained to match the Teacher posterior distribution via KL divergence, while a one-sided calibration loss penalizes the Student only when its predicted variance falls below the Teacher predicted variance. This penalizes insufficient uncertainty, including overconfidence, for missing-modality inputs and trains the Student to increase uncertainty toward the Teacher uncertainty or above when modalities are missing. The calibrated uncertainty propagates into downstream dynamical parameter generation and trajectory prediction.

B. Separated-State Clonal Dynamics with Replicator Dynamics

The invention decomposes clonal tumor state into total burden B(t) and clonal fractions f_k(t) evolving under coupled ODEs. Burden dynamics follow mean fitness scaling; fraction dynamics follow replicator dynamics that preserve the simplex constraint \sum_k f_k = 1 and f_k \in [0,1] by construction. This formulation improves identifiability and reduces reliance on external constraints during numerical integration.

C. Three-Layer Uncertainty Quantification

Layer 1 provides parameter uncertainty by generating ensembles that blend model-generated parameters with population priors stored in memory, with per-member perturbations. Layer 2 provides temporal trust via a parametric exponential decay function based on distance to nearest observation, blending predictions toward a population baseline. Layer 3 provides oscillation handling by detecting oscillatory patterns after integration and applying severity-graded damping to stabilize reported trajectories without modifying the underlying dynamical integration.

D. Cancer-Type Conditional Parameter Generation

Feature-wise Linear Modulation (FiLM) layers condition hypernetwork parameter generation on learned cancer-type embeddings. Zero-initialized FiLM parameters cause training to begin as an identity transform, with cancer-type-specific deviations learned during optimization. This enables a single network to serve multiple cancer types with type-specific parameter distributions.

E. Optional Dependent Modules

Optional dependent modules include per-sample cancer-conditioned gating that adaptively fuses molecular and histopathological features, immune dynamics separating NK and T-cell compartments with checkpoint modulation, selective domain calibration that bypasses proliferation-linked latent groups during normalization, methylation latent variance coupling to SDE volatility parameters, and training-time collapse prevention constraints including group-wise KL capacity, cross-reconstruction, and modality dropout.

IV. Brief Description of the Drawings

FIG. 1: PESD teacher-student architecture with learned missingness embeddings and one-sided variance calibration.

FIG. 2: PESD training objectives including KL divergence matching and calibration loss.

FIG. 3: Separated-state burden and fraction dynamics with bidirectional coupling.

FIG. 4: Replicator dynamics preserving simplex constraints for clonal fractions.

FIG. 5: Three-layer uncertainty quantification pipeline.

FIG. 6: Parametric time-dependent trust decay and population baseline reversion.

FIG. 7: Oscillation detection metrics and severity-graded damping selection.

FIG. 8: Cancer-type FiLM conditioning in hypernetwork parameter generation.

FIG. 9: Per-sample cancer-conditioned gating architecture.

FIG. 10: Immune dynamics with NK and T-cell separation and checkpoint modulation.

FIG. 11: Selective domain calibration with bypassed latent group.

FIG. 12: Methylation variance coupling to SDE volatility.

V. Detailed Description of the Invention

[0001] System Overview. A computer-implemented system comprises one or more processors and non-transitory memory storing instructions that implement: (i) missing modality handling via probabilistic teacher-student self-distillation with calibrated uncertainty; (ii) tumor trajectory simulation via separated-state ODEs or SDEs; (iii) multi-layer uncertainty quantification; and (iv) cancer-type conditional parameter generation.

In embodiments, the system includes an encoder mapping multi-modal molecular and optionally histopathological inputs to latent representations, a parameter generation network producing dynamical parameters, and a numerical solver integrating dynamical equations to output tumor trajectories and uncertainty measures.

[0001A] End-to-End Embodiment. In an end-to-end embodiment, the systems of Sections A-D are operatively coupled such that: (i) Student posterior uncertainty from PESD influences dispersion of parameter ensemble members generated by Layer 1 of the uncertainty quantification system; (ii) these parameters drive separated-state ODE integration producing raw trajectories; and (iii) Layer 2 trust blending and Layer 3 oscillation damping produce final uncertainty-quantified predictions for clinical evaluation and computational decision support.

[0002] Terminology.

Modality
A data type including gene expression, DNA methylation, CNV, DNA sequence variants, proteomic measurements, or histopathological image features.
Missingness pattern
A binary mask indicating presence or absence of each modality for a sample.
Posterior distribution
A probabilistic latent distribution output by an encoder, for example a diagonal Gaussian with mean and log-variance.
Population prior
A statistical distribution for dynamical parameters stored in memory as mean and standard deviation pairs.
Observation times
Time points at which longitudinal tumor measurements are available for a subject.
Trust
A scalar function controlling interpolation between model predictions and a population baseline prediction based on temporal proximity to observations.

A. Probabilistic Encoder Self-Distillation (PESD)

[0003] PESD Architecture. A Teacher encoder f^{\text{teacher}}_{\phi} receives Teacher input x_{\text{teacher}} that includes a strict superset of modalities present in the Student input and outputs Teacher posterior parameters (\mu_T, \log \sigma_T^2). A Student encoder f^{\text{student}}_{\phi} receives Student input x_{\text{student}} where missing modalities are replaced by learned embeddings and outputs Student posterior parameters (\mu_S, \log \sigma_S^2).

[0004] Learned Missingness Embeddings. For each modality m that may be missing, the system stores a learned embedding vector e_m as a trainable parameter in non-transitory memory. In one embodiment:

e_m \sim \mathcal{N}(0, \sigma_{\text{init}}^2 I), \quad \sigma_{\text{init}} = 0.02

In embodiments, one or more modalities can be designated as always-present and excluded from missingness embedding replacement. Each learned embedding vector e_m has dimensionality matching the feature representation of modality m prior to multi-modal fusion, such that replacement preserves tensor shape compatibility.

[0005] Masked Replacement Operation. Given a missingness mask m, where m_j \in \{0,1\} indicates presence of modality j, the Student input is computed as:

x_{\text{student},j} = m_j \cdot x_j + (1 - m_j) \cdot e_j

where e_j is the learned missingness embedding corresponding to modality j. The replacement is performed per modality according to the missingness pattern.
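The masked replacement operation can be illustrated with a minimal NumPy sketch; function and variable names here are hypothetical and not drawn from the specification, and feature vectors stand in for full modality tensors:

```python
import numpy as np

def masked_replace(x, mask, embeddings):
    """Per-modality replacement: keep x_j when present (mask 1),
    substitute the learned missingness embedding e_j when absent."""
    return {j: x[j] if mask[j] == 1 else embeddings[j] for j in x}

# Illustrative two-modality sample where methylation is absent.
x = {"rna": np.array([0.5, 1.2]), "meth": np.zeros(2)}
mask = {"rna": 1, "meth": 0}
emb = {"rna": np.zeros(2), "meth": np.array([0.01, -0.03])}
student_input = masked_replace(x, mask, emb)
```

Because replacement preserves each modality's tensor shape, the Student encoder can consume complete and incomplete samples through the same fusion path.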

[0006] Training Objectives.

L_{\text{KL}} = D_{\text{KL}}\bigl(q_S(z \mid x_{\text{student}}) \;\|\; q_T(z \mid x_{\text{teacher}})\bigr)

L_{\text{calib}} = \sum_d \max\!\bigl(0,\; \log \sigma_{T,d}^2 - \log \sigma_{S,d}^2\bigr)

This penalty activates when the Student log-variance is less than the Teacher log-variance for a latent dimension d. The loss penalizes under-uncertainty for missing-modality inputs and drives the Student toward increasing log-variance toward the Teacher log-variance or above.
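The one-sided calibration loss reduces to a hinge on per-dimension log-variances. A minimal NumPy sketch (names are illustrative, not from the specification):

```python
import numpy as np

def calibration_loss(log_var_student, log_var_teacher):
    """One-sided penalty: active only on dimensions where the Student
    log-variance falls below the Teacher log-variance."""
    return float(np.sum(np.maximum(0.0, log_var_teacher - log_var_student)))

# Dimension 0: Student is under-uncertain, contributing max(0, 0-(-1)) = 1.0;
# dimension 1: Student is more uncertain than the Teacher, contributing 0.
loss = calibration_loss(np.array([-1.0, 0.5]), np.array([0.0, 0.0]))
```

The asymmetry is the point: a Student that is more uncertain than the Teacher incurs no penalty, so only overconfidence under missingness is corrected.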

Optional Reconstruction Loss: L_{\text{recon}} is defined as a mean squared error (MSE) or negative log-likelihood between the Student encoder's reconstructed modality features and the original input features for designated modalities. In one embodiment, L_{\text{recon}} reconstructs RNA expression features from the shared latent representation. In another embodiment, L_{\text{recon}} reconstructs all available modalities from the Student latent.

L_{\text{PESD}} = \lambda_{\text{KL}} \cdot L_{\text{KL}} + \lambda_{\text{calib}} \cdot L_{\text{calib}} + \lambda_{\text{recon}} \cdot L_{\text{recon}}

where \lambda_{\text{KL}}, \lambda_{\text{calib}}, and \lambda_{\text{recon}} are stored weights.

In embodiments, the calibration loss is applied to all latent dimensions output by the Student encoder. In other embodiments, the calibration loss is applied to a designated subset of latent dimensions, such as shared biological latents excluding modality-private channels. In embodiments, gradients are not propagated into Teacher encoder parameters for one or both of L_{\text{KL}} and L_{\text{calib}} by detaching Teacher posterior parameters from the computation graph.

[0007] Training and Inference Control. A missingness control parameter specifies which modalities are masked during training so that the Student receives masked inputs and a corresponding missingness pattern. The Teacher input comprises either complete modality data for a subset of training samples or a modality subset that is strictly larger than the Student modality subset for a given sample.

In embodiments, the Teacher encoder parameters are fixed during Student training, updated by exponential moving average, or trained jointly with the Student encoder, while maintaining the more-complete input distinction.

At inference, the Student encoder is used when at least one modality is absent, and the calibrated uncertainty propagates through downstream parameter generation and trajectory simulation.

[0008] PESD Calibration Behavior. In evaluated implementations, the one-sided calibration loss drives convergence toward Student variance that is not less than Teacher variance when processing missing-modality inputs, reducing overconfident predictions under missingness while maintaining posterior alignment through KL matching.

B. Separated-State ODE Formulation

[0009] State Decomposition. The system represents clonal tumor state using:

Total burden: B(t) \in \mathbb{R}^+
Clonal fractions: \mathbf{f}(t) = [f_1(t), \ldots, f_K(t)], where f_k(t) \in [0,1] and \sum_k f_k(t) = 1

Absolute populations are defined as N_k(t) = f_k(t) \cdot B(t). In embodiments, the number of clones K is fixed at initialization.

[0010] Clone Fitness and Growth Modulation. For each clone k, fitness is computed as:

\varphi_k(t) = \rho_k \cdot M(B(t)) - D_k(t) - I_k(t)

where \rho_k is the clone growth rate, M(B) is burden-dependent growth modulation, D_k(t) is the drug effect, and I_k(t) is the immune effect.

In embodiments, M(B) is selected from configuration memory, including:

Logistic: 1 - B / K_{\text{cap}}
Gompertz: \log(K_{\text{cap}} / B)
von Bertalanffy: B^{-1/3}
Richards: generalized growth modulation with a shape parameter

Mean fitness:

\bar{\varphi}(t) = \sum_k f_k(t) \cdot \varphi_k(t)

[0011] Coupled Burden and Fraction ODEs. The system integrates:

\frac{dB}{dt} = \bar{\varphi}(t) \cdot B(t)

\frac{df_k}{dt} = f_k(t) \cdot \bigl(\varphi_k(t) - \bar{\varphi}(t)\bigr)

Bidirectional coupling arises because B(t) influences \varphi_k(t) through M(B), and \mathbf{f}(t) influences \bar{\varphi}(t).

[0012] Constraint Properties and Numerical Handling. Replicator dynamics preserve the probability simplex under exact integration. In numerical implementations, the system can optionally renormalize \mathbf{f} when drift exceeds a stored tolerance.
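The coupled burden-fraction system can be sketched with an explicit Euler integrator; this is a minimal illustration assuming logistic modulation and zero drug and immune effects, with illustrative parameter values not drawn from the specification:

```python
import numpy as np

def euler_step(B, f, rho, K_cap, dt):
    """One explicit Euler step of the separated-state system with
    logistic growth modulation M(B) = 1 - B/K_cap."""
    phi = rho * (1.0 - B / K_cap)           # clone fitness phi_k
    phi_bar = float(np.dot(f, phi))         # mean fitness
    B_next = B + dt * phi_bar * B           # dB/dt = phi_bar * B
    f_next = f + dt * f * (phi - phi_bar)   # replicator dynamics for f_k
    return B_next, f_next

B, f = 1.0, np.array([0.6, 0.4])
rho = np.array([0.3, 0.5])                  # illustrative clone growth rates
for _ in range(100):                        # integrate to t = 5 with dt = 0.05
    B, f = euler_step(B, f, rho, K_cap=10.0, dt=0.05)
```

A convenient property of the replicator form: when \sum_k f_k = 1, the Euler update leaves the sum unchanged up to floating-point error, since the update adds dt (f \cdot \varphi - \bar{\varphi} \sum_k f_k) = 0, so no external projection is needed.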

[0013] Identifiability Advantages. The separated-state representation improves identifiability by separating burden scale from composition dynamics, reducing confounding between initial burden and growth parameters, and eliminating the need for external projection constraints to enforce f_k \in [0,1] and \sum_k f_k = 1.

C. Three-Layer Uncertainty Quantification

[0014] Layer 1: Ensemble Parameter Uncertainty with Population Priors. The system generates N ensemble members. For a parameter \theta, an ensemble member can be computed as:

\theta^{(i)} = (1 - \alpha) \cdot \bigl(\theta_{\text{base}} + \Delta\theta^{(i)} + \epsilon^{(i)}\bigr) + \alpha \cdot \bigl(\mu_{\text{prior}} + \epsilon_{\text{prior}}^{(i)}\bigr)

where \alpha is a stored prior strength, and \mu_{\text{prior}}, \sigma_{\text{prior}} are population prior statistics stored in non-transitory memory.

Population priors can be stored as non-trainable values in memory, including configuration files, model metadata, or other non-transitory storage. In embodiments, such priors are implemented as fixed tensors or other non-trainable data structures.
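The Layer 1 blending formula can be sketched as follows; for brevity this illustration folds the per-member offset \Delta\theta^{(i)} and perturbation \epsilon^{(i)} into a single noise draw, and all parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_member(theta_base, mu_prior, sigma_prior, alpha, pert_scale=0.05):
    """Blend a model-generated parameter with a population prior,
    with per-member Gaussian perturbations on each side of the blend."""
    eps = rng.normal(0.0, pert_scale)          # stands in for delta + epsilon
    eps_prior = rng.normal(0.0, sigma_prior)
    return (1 - alpha) * (theta_base + eps) + alpha * (mu_prior + eps_prior)

members = [ensemble_member(theta_base=0.30, mu_prior=0.20,
                           sigma_prior=0.02, alpha=0.25)
           for _ in range(64)]
```

With prior strength \alpha = 0.25, the ensemble mean concentrates near 0.75 \cdot 0.30 + 0.25 \cdot 0.20 = 0.275, i.e., partway between the model estimate and the population prior.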

[0015] Layer 2: Time-Dependent Trust and Population Baseline Reversion. Trust is computed parametrically:

\tau(t) = \exp\!\bigl(-\lambda \cdot \min |t - t_{\text{obs}}|\bigr)

where \lambda is a stored decay rate. The trusted prediction is:

\hat{y}_{\text{trusted}}(t) = \tau(t) \cdot \hat{y}_{\text{model}}(t) + (1 - \tau(t)) \cdot \hat{y}_{\text{pop}}(t)

In embodiments, \hat{y}_{\text{pop}}(t) comprises one or more of: (i) a constant population mean burden for a cancer type, (ii) a population mean trajectory template indexed by time, (iii) a baseline dynamical model using population mean parameters, or (iv) a stored look-up table derived from population cohorts.
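The trust decay and baseline reversion can be sketched in a few lines; the decay rate, observation times, and predictions below are illustrative assumptions:

```python
import numpy as np

def trust(t, t_obs, decay=0.1):
    """tau(t) = exp(-lambda * min |t - t_obs|) over observation times."""
    return float(np.exp(-decay * np.min(np.abs(np.asarray(t_obs) - t))))

def blend(t, y_model, y_pop, t_obs, decay=0.1):
    """Interpolate the model prediction toward the population baseline."""
    tau = trust(t, t_obs, decay)
    return tau * y_model + (1 - tau) * y_pop

t_obs = [0.0, 30.0, 60.0]                                   # observation days
near = blend(61.0, y_model=5.0, y_pop=3.0, t_obs=t_obs)     # high trust
far = blend(120.0, y_model=5.0, y_pop=3.0, t_obs=t_obs)     # reverts to baseline
```

One day past the last observation the model prediction dominates; sixty days out, \tau \approx e^{-6}, so the output lies essentially on the population baseline.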

[0016] Layer 3: Oscillation Detection and Severity-Graded Damping. After integration, the system computes oscillation metrics, including one or more of: max-min ratio over a window, sign change rate of derivatives, and high-frequency spectral power above a cutoff.

The system assigns a severity label by comparing oscillation metrics to thresholds stored in memory and then applies severity-graded damping as post-processing without modifying the integrated ODE state.

Mild: exponential moving average (EMA) smoothing with a stored smoothing factor
Moderate: lowpass filtering with a stored cutoff frequency
Severe: combined EMA and lowpass filtering with stored parameters
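The severity-graded post-processing can be sketched with an EMA smoother; as a simplification, this sketch uses a stronger EMA in place of the lowpass filter for the moderate and severe grades, and the smoothing factors are illustrative:

```python
import numpy as np

def ema_smooth(y, alpha):
    """Exponential moving average applied after integration, without
    modifying the underlying ODE state."""
    y = np.asarray(y, dtype=float)
    out = np.empty_like(y)
    out[0] = y[0]
    for i in range(1, len(y)):
        out[i] = alpha * y[i] + (1 - alpha) * out[i - 1]
    return out

def damp(y, severity):
    if severity == "mild":
        return ema_smooth(y, 0.4)      # stored smoothing factor
    if severity in ("moderate", "severe"):
        return ema_smooth(y, 0.2)      # EMA stands in for lowpass filtering
    return np.asarray(y, dtype=float)  # no detected oscillation

traj = np.array([1.0, 3.0, 1.0, 3.0, 1.0, 3.0])  # oscillatory trajectory
damped = damp(traj, "mild")
```

A production implementation would substitute a true lowpass filter (e.g., a Butterworth design) for the moderate and severe grades.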

[0017] Uncertainty Outputs. The system outputs uncertainty-quantified trajectories including point estimates and confidence intervals derived from the ensemble after trust blending and damping, and can output diagnostic metadata including trust values, severity labels, and damping configuration identifiers.

D. Cancer-Type Conditional Parameter Generation

[0018] Cancer-Type Embedding. The system stores a learnable embedding table mapping cancer-type indices to condition vectors \mathbf{c} of dimension d_c. The number of supported cancer types is configurable.

[0019] FiLM Conditioning with Identity Initialization. A FiLM layer computes:

\gamma = W_\gamma \, c + b_\gamma
\beta = W_\beta \, c + b_\beta
h' = h \odot (1 + \gamma) + \beta

where h is the feature vector and \odot denotes element-wise multiplication.

In embodiments, W_\gamma, b_\gamma, W_\beta, and b_\beta are initialized to zero so that \gamma = 0 and \beta = 0 at initialization and the layer starts as an identity transform h' = h. The model then learns cancer-type-specific modulation during training.
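The zero-initialized FiLM layer can be sketched as follows; the class name and dimensions are illustrative, and a trained model would of course hold nonzero learned weights:

```python
import numpy as np

class FiLM:
    """FiLM layer whose projections are zero-initialized, so the
    modulation h' = h * (1 + gamma) + beta starts as the identity."""
    def __init__(self, d_cond, d_feat):
        self.W_gamma = np.zeros((d_feat, d_cond))
        self.b_gamma = np.zeros(d_feat)
        self.W_beta = np.zeros((d_feat, d_cond))
        self.b_beta = np.zeros(d_feat)

    def __call__(self, h, c):
        gamma = self.W_gamma @ c + self.b_gamma
        beta = self.W_beta @ c + self.b_beta
        return h * (1.0 + gamma) + beta

film = FiLM(d_cond=4, d_feat=3)
h = np.array([0.5, -1.0, 2.0])   # feature vector
c = np.ones(4)                   # cancer-type embedding (illustrative)
out = film(h, c)                 # equals h at initialization
```

Because the layer is exactly the identity at initialization, adding cancer-type conditioning cannot degrade the unconditioned model at the start of training; deviations are learned only where the data support them.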

E. Optional Dependent Modules

[0020] Collapse Prevention Constraints. In embodiments, the system applies training-time constraints to prevent collapse of modality-specific latent groups, including:

  • Minimum KL capacity per group computed as \max(0,\; C_{\text{target}}^g - \overline{\text{KL}}^g) with stored targets and optional warmup.
  • Cross-reconstruction where a modality is reconstructed using shared latents while a private latent is explicitly zeroed.
  • Modality dropout where entire modalities are randomly dropped per sample according to stored probabilities.

[0021] Per-Sample Cancer-Conditioned Multi-Modal Gating. In embodiments, the system fuses molecular features and histopathological features by computing per-sample softmax gate weights from the features and a cancer-type embedding and computing a weighted sum.

[0022] Immune Dynamics with NK and T-Cell Separation and Checkpoint Modulation. In embodiments, immune effects are computed using separate innate and adaptive compartments, including NK dynamics with immediate response and T-cell dynamics with delayed ramp activation, and are modulated by checkpoint evasion and an externally set checkpoint blockade parameter.

[0023] Selective Domain Calibration with Bypassed Latent Group. In embodiments, adaptive batch normalization is applied to selected latent groups while proliferation-linked latent groups are bypassed to preserve proliferation statistics across domains.

[0024] Methylation Variance to SDE Volatility Coupling. In embodiments, the system computes methylation latent variance and provides it as input to an SDE volatility parameter head, thereby coupling epigenetic heterogeneity to stochastic volatility.

VI. Claims

What is claimed is:

Claim 1. (PESD Missing Modality Handling with One-Sided Uncertainty Calibration)

A computer-implemented system for handling missing modalities in multi-modal tumor modeling, comprising one or more processors configured to:

  1. receive multi-modal feature data comprising a plurality of modalities;
  2. determine a missingness pattern indicating which modalities are absent for a sample;
  3. for each absent modality among the plurality of modalities, replace the absent modality with a corresponding learned missingness embedding stored as a trainable parameter, selected according to the missingness pattern, to form a Student input;
  4. process the Student input with a Student encoder to output a Student posterior distribution comprising a Student mean and a Student log-variance;
  5. process, for the same sample, a Teacher input that includes a strict superset of modalities present in the Student input with a Teacher encoder to output a Teacher posterior distribution comprising a Teacher mean and a Teacher log-variance;
  6. train the Student encoder by minimizing a KL divergence between the Student posterior and the Teacher posterior; and
  7. apply a one-sided calibration loss that penalizes the Student log-variance only when the Student log-variance is less than the Teacher log-variance, thereby training the Student encoder to increase Student log-variance toward the Teacher log-variance or above when modalities are missing.

Claim 2. (Separated-State Burden-Fraction ODE)

A computer-implemented system for simulating clonal tumor dynamics, comprising one or more processors configured to:

  1. initialize a state representation comprising a total burden variable B(t) and clonal fraction variables f_k(t) satisfying \sum_k f_k(t) = 1;
  2. compute clone fitness values comprising growth rates modulated by a burden-dependent growth modulation term, minus drug effects, minus immune effects;
  3. compute mean fitness as a weighted sum of clone fitness values using clonal fractions as weights;
  4. integrate a coupled ODE system comprising burden dynamics driven by mean fitness and fraction dynamics comprising replicator dynamics driven by fitness differences from mean fitness; and
  5. output trajectories derived from B(t) and f_k(t), wherein the fraction dynamics constrain f_k(t) \in [0,1] by construction.

Claim 3. (Three-Layer Uncertainty Quantification)

A computer-implemented system for uncertainty-aware tumor trajectory prediction, comprising one or more processors configured to:

  1. generate an ensemble of parameter sets using model-generated parameters blended with population priors stored in non-transitory memory according to a stored prior strength;
  2. integrate a dynamical model for each ensemble member to produce trajectory predictions;
  3. compute a parametric time-dependent trust function decaying exponentially with distance to nearest observation time using a stored decay rate;
  4. blend model predictions with a population baseline prediction stored in memory according to the trust function;
  5. detect oscillatory patterns in the trajectory predictions using oscillation metrics;
  6. determine an oscillation severity level by comparing oscillation metrics to thresholds stored in memory;
  7. apply severity-graded post-processing damping selected from exponential moving average and lowpass filtering based on the severity level; and
  8. output uncertainty-quantified trajectories.

Claim 4. (PESD Calibration Loss Structure)

The system of claim 1, wherein the one-sided calibration loss is computed per latent dimension as \max(0,\; \log \sigma_{T,d}^2 - \log \sigma_{S,d}^2) and summed across dimensions.

Claim 5. (PESD Static Embeddings)

The system of claim 1, wherein the learned missingness embedding for each modality is a static vector initialized from a normal distribution with mean zero and standard deviation 0.02.

Claim 6. (Teacher Fixed Parameters)

The system of claim 1, wherein parameters of the Teacher encoder are fixed during training of the Student encoder.

Claim 7. (Teacher EMA Update)

The system of claim 1, wherein parameters of the Teacher encoder are updated by exponential moving average of Student encoder parameters during training.

Claim 8. (Teacher Joint Training with Strict-Superset Inputs)

The system of claim 1, wherein the Teacher encoder is trained jointly with the Student encoder while maintaining, for distillation steps, Teacher inputs that include a strict superset of modalities present in Student inputs.

Claim 9. (Canonical Replicator Form)

The system of claim 2, wherein the coupled ODE system comprises: dB/dt = \bar{\varphi}(t) \cdot B(t) and df_k/dt = f_k(t) \cdot (\varphi_k(t) - \bar{\varphi}(t)), where \varphi_k(t) is clone fitness and \bar{\varphi}(t) is mean fitness computed as a weighted sum using f_k(t).

Claim 10. (Separated-State Growth Modulation Options)

The system of claim 2, wherein the burden-dependent growth modulation term is selected from the group consisting of logistic (1 - B/K_{\text{cap}}), Gompertz \log(K_{\text{cap}}/B), von Bertalanffy B^{-1/3}, and Richards growth modulation, based on configuration stored in memory.

Claim 11. (Separated-State Fixed Clones)

The system of claim 2, wherein a number of clones K is fixed at initialization as a constructor parameter.

Claim 12. (Uncertainty Population Priors Storage)

The system of claim 3, wherein the population priors comprise per-parameter mean and standard deviation pairs stored in non-transitory memory as non-trainable values.

Claim 13. (Uncertainty Trust Function Form)

The system of claim 3, wherein the trust function is \tau(t) = \exp(-\lambda \cdot \min|t - t_{\text{obs}}|) with \lambda being a stored hyperparameter.

Claim 14. (Oscillation Metrics)

The system of claim 3, wherein the oscillation metrics comprise at least one of: a max-min ratio over a time window, a sign change rate of derivatives over time, or high-frequency spectral power computed as spectral energy above a cutoff frequency.

Claim 15. (Severity-Graded Damping with Stored Parameters)

The system of claim 3, wherein: (a) mild severity triggers exponential moving average with a first stored smoothing parameter; (b) moderate severity triggers lowpass filtering with a second stored cutoff parameter; and (c) severe severity triggers combined exponential moving average and lowpass filtering with third stored parameters.

Claim 16. (Example Damping Embodiment)

The system of claim 15, wherein the first stored smoothing parameter is 0.4, the second stored cutoff parameter is 0.1, and the third stored parameters comprise exponential moving average smoothing parameter 0.2 and lowpass cutoff parameter 0.05.

Claim 17. (FiLM Conditioning in Simulation System)

The system of claim 2, wherein generating dynamical parameters uses a hypernetwork comprising Feature-wise Linear Modulation (FiLM) layers conditioned on learned cancer-type embeddings and applying modulation as h' = h \odot (1 + \gamma) + \beta.

Claim 18. (FiLM Identity Initialization)

The system of claim 17, wherein \gamma and \beta are produced by linear projections initialized to zero so that initial modulation is h' = h.

Claim 19. (FiLM Conditioning in Uncertainty System)

The system of claim 3, wherein the model-generated parameters blended with population priors are produced by a hypernetwork comprising FiLM layers conditioned on learned cancer-type embeddings.

Claim 20. (Per-Sample Gating in Simulation System)

The system of claim 2, wherein the system fuses molecular features and histopathological features by computing per-sample softmax gate weights from the features and a cancer-type embedding, and computing a weighted sum using the gate weights.

Claim 21. (Per-Sample Gating in Uncertainty System)

The system of claim 3, wherein the dynamical model parameters for ensemble members are produced from fused features computed by per-sample softmax gating over molecular features and histopathological features conditioned on a cancer-type embedding.

Claim 22. (Immune Dynamics)

The system of claim 2, wherein immune effects are computed using separate innate and adaptive immune compartments comprising NK cell dynamics with immediate response and T-cell dynamics with delayed ramp activation, and wherein immune effects are modulated by checkpoint evasion and an externally set checkpoint blockade parameter.

Claim 23. (Selective Domain Calibration)

The system of claim 1, further comprising applying adaptive batch normalization to selected latent groups while bypassing proliferation-linked latent groups to preserve proliferation statistics across domains.

Claim 24. (Selective Domain Calibration Feeding Simulation)

The system of claim 2, wherein latent representations used to generate dynamical parameters are selectively domain-calibrated by applying adaptive normalization to a subset of latent groups while bypassing proliferation-linked latent groups.

Claim 25. (Methylation Variance Coupling)

The system of claim 3, wherein the dynamical model is a stochastic differential equation and the system computes methylation latent variance and provides the variance as input to a volatility parameter generation head.

Claim 26. (Group-Wise KL Constraint)

The system of claim 1, further comprising applying a minimum KL capacity constraint per latent group computed as max(0, C_target^g − KL̄^g) with stored per-group targets.
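For illustration only (not limiting the claims), the group-wise KL capacity constraint of claim 26 can be sketched as a hinge penalty over per-group mean KL values; the function name is hypothetical.

```python
import numpy as np

def kl_capacity_penalty(kl_per_group, targets):
    # Per-group penalty max(0, C_target^g - mean KL^g): each latent
    # group whose average KL falls below its stored target incurs a
    # penalty, pushing KL up to the target and preventing collapse.
    kl_bar = np.asarray(kl_per_group, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.maximum(0.0, targets - kl_bar).sum()
```

Groups already at or above their target contribute zero, so the constraint only activates against latent collapse rather than penalizing informative groups.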

Claim 27. (Cross-Reconstruction)

The system of claim 1, further comprising a cross-reconstruction loss that reconstructs a modality using shared latents with private latents explicitly zeroed.

Claim 28. (Method - PESD)

A computer-implemented method comprising performing the steps of claim 1.

Claim 29. (Method - Separated-State Dynamics)

A computer-implemented method comprising performing the steps of claim 2.

Claim 30. (Method - Uncertainty Quantification)

A computer-implemented method comprising performing the steps of claim 3.

Claim 31. (CRM - PESD)

A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the method of claim 28.

Claim 32. (CRM - Separated-State)

A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the method of claim 29.

Claim 33. (CRM - Uncertainty)

A non-transitory computer-readable medium storing instructions that, when executed by one or more processors, cause the one or more processors to perform the method of claim 30.

VII.Abstract of the Disclosure

Systems and methods for uncertainty-calibrated missing modality handling, identifiable separated-state tumor dynamics, and multi-layer uncertainty quantification in oncology trajectory simulation are disclosed. Missing modalities are replaced with learned embeddings; a Student encoder is trained to match a Teacher posterior via KL divergence, while a one-sided calibration loss penalizes the Student whenever its log-variance falls below the Teacher's, training the Student to raise its uncertainty to at least the Teacher's level when modalities are missing.
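For illustration only (not limiting the claims), the one-sided calibration loss described above can be sketched as a hinge on the log-variance gap; the function name is hypothetical.

```python
import numpy as np

def one_sided_calibration_loss(student_logvar, teacher_logvar):
    # Penalize the Student only when its log-variance is BELOW the
    # Teacher's, i.e. when it is overconfident under missing modalities;
    # no penalty when Student uncertainty meets or exceeds the Teacher's.
    return np.maximum(0.0, teacher_logvar - student_logvar).mean()
```

The asymmetry is the point: the Student is free to be more uncertain than the Teacher, but never less.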

Clonal dynamics use coupled burden and fraction ODEs with replicator dynamics preserving simplex constraints by construction. Uncertainty quantification combines ensemble parameter generation with population priors, parametric time-dependent trust decay blending predictions toward a population baseline when extrapolating, and severity-graded oscillation damping as post-processing.
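For illustration only (not limiting the claims), the simplex-preserving property of the replicator dynamics described above can be sketched with a single explicit Euler step; names and the integration scheme are assumptions of this sketch.

```python
import numpy as np

def replicator_step(f, fitness, dt):
    # Replicator dynamics df_i/dt = f_i * (w_i - w_bar), where w_bar is the
    # fraction-weighted mean fitness. The increments sum to zero, so the
    # clone-fraction vector remains on the probability simplex by
    # construction (up to integration error for large dt).
    w_bar = np.dot(f, fitness)
    return f + dt * f * (fitness - w_bar)
```

No renormalization is needed after the step: because sum_i f_i (w_i − w̄) = w̄ − w̄ = 0, the fractions continue to sum to one.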

Cancer-type conditional parameter generation uses FiLM layers conditioned on learned cancer-type embeddings, with identity initialization via zero-initialized FiLM projections. Optional modules include adaptive multi-modal gating, immune compartment separation with checkpoint modulation, selective domain calibration bypassing proliferation-linked latents, methylation variance coupling to SDE volatility, and training constraints preventing latent collapse.

[End of Application]
