Transfer v1.0

Domain Separation Network

Translates mouse model data into human-relevant predictions

Architecture: Adversarial Domain Separation with Conditional Prior

Domain Accuracy: ~0.55 (target: < 0.60)

Near-chance domain discrimination (ideal: ~0.50)

Cancer Subtype Acc: > 0.90 (target: > 0.85)

Shared encoder preserves biological utility

Meth Imputation r: 0.862 (target: > 0.80)

Correlation between imputed and real methylation latents

CNV Imputation r: 0.967 (target: > 0.90)

Correlation between imputed and real CNV latents

Overview

When we use preclinical patient-derived xenograft (PDX) mouse model data, this model strips out the mouse-specific biological signal and keeps only the tumor biology that transfers to humans. It learns to separate what is universal about the tumor from what is an artifact of the host species, so that drug responses observed in mice can inform predictions for patients. It also imputes missing data types, such as methylation, that are not typically measured in mouse experiments.

Inputs

1 input
z_rna (dim 201)

RNA-derived portion of VAE latent (z_prolif + z_pathway)

Source: vae

Outputs

3 outputs
z_shared (dim 201)

Domain-invariant tumor biology representation

Consumers: hypernet

z_meth_imputed (dim 48)

Imputed methylation latent from ConditionalPrior

Consumers: hypernet

z_cnv_imputed (dim 32)

Imputed CNV latent from ConditionalPrior

Consumers: hypernet

Mathematical Formulation

Shared Encoding

Residual shared encoder with learnable scale α
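
The page does not reproduce the encoder equation, so the following is one plausible reading of "residual with learnable scale", not the confirmed formula. It assumes the 201→128→128→201 network listed under Key Features, written here with the hypothetical symbol $f_\theta$, is added to its input through a skip connection and scaled by $\alpha$:

```latex
z_{\text{shared}} = z_{\text{rna}} + \alpha \, f_\theta(z_{\text{rna}}),
\qquad f_\theta : \mathbb{R}^{201} \to \mathbb{R}^{201},
\qquad \alpha_{\text{init}} = 0.1
```

Under this reading, the small initial $\alpha$ keeps $z_{\text{shared}}$ close to $z_{\text{rna}}$ early in training, so the encoder starts near the identity map.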

Adversarial Loss

Domain confusion via gradient reversal
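
For reference, a gradient reversal layer is conventionally defined as an identity in the forward pass whose backward pass negates and scales the gradient. The sketch below uses that standard definition. The cross-entropy form of the loss, the discriminator symbol $D$, and the binary domain label $d$ are assumptions, since the page does not state them:

```latex
\begin{aligned}
R_\lambda(x) &= x, \qquad \frac{\partial R_\lambda}{\partial x} = -\lambda I \\
\mathcal{L}_{\text{adv}} &= \mathrm{CE}\big( D(R_\lambda(z_{\text{shared}})),\; d \big),
\qquad d \in \{\text{PDX}, \text{TCGA}\}
\end{aligned}
```

The discriminator minimizes $\mathcal{L}_{\text{adv}}$, while the reversed gradient pushes the shared encoder to increase it, which drives domain accuracy toward chance.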

Imputation

ConditionalPrior always-samples (never collapses to mean)
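
A minimal sketch of what "always-samples" could look like, assuming the prior is a diagonal Gaussian whose mean and log-variance come from a network (not shown). The function name, the `mu`/`log_var` parameterization, and the numeric values are illustrative assumptions, not the model's actual code:

```python
import math
import random


def sample_conditional_prior(mu, log_var, rng):
    """Draw one sample z = mu + sigma * eps from a diagonal Gaussian.

    mu and log_var are hypothetical outputs of the prior network,
    one value per latent dimension.
    """
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


rng = random.Random(0)
mu = [0.5] * 48          # 48 = dimension of z_meth_imputed
log_var = [-2.0] * 48    # illustrative value, not from the source
draws = [sample_conditional_prior(mu, log_var, rng) for _ in range(200)]

# The first coordinate varies across draws, unlike a mean-only imputation.
first_dim = [d[0] for d in draws]
spread = max(first_dim) - min(first_dim)
```

Returning `mu` directly would give every sample with the same conditioning input an identical imputed latent. Sampling at both training and inference time keeps the imputed latents' variance nonzero.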

Key Features

  • SharedEncoder: 201→128→128→201 (residual, learnable scale init 0.1)
  • PrivateEncoder: 201→256→256→256→128 (high-capacity "stroma sponge")
  • Gradient Reversal Layer (GRL): lambda ramp 0→2.0 over 30 epochs
  • ConditionalPrior: always-samples to prevent variance collapse (Patent 2)
  • Anchor standardization: scaler fit on TCGA only, applied to both domains
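
The anchor standardization step can be sketched as follows, assuming the scaler is a per-feature z-score (the page says only "scaler"). Function names and the toy data are hypothetical:

```python
import statistics


def fit_anchor_scaler(anchor_rows):
    """Compute per-feature mean and std from the anchor domain only."""
    columns = list(zip(*anchor_rows))
    means = [statistics.fmean(col) for col in columns]
    # Guard against zero variance so the division below is always defined.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds


def apply_scaler(rows, means, stds):
    """Standardize rows with statistics that were fit elsewhere."""
    return [
        [(x - m) / s for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy data: two features, values are illustrative only.
tcga = [[1.0, 10.0], [3.0, 14.0], [5.0, 18.0]]
pdx = [[4.0, 20.0], [6.0, 22.0]]

means, stds = fit_anchor_scaler(tcga)        # fit on TCGA only
tcga_scaled = apply_scaler(tcga, means, stds)
pdx_scaled = apply_scaler(pdx, means, stds)  # PDX reuses the TCGA statistics
```

Because PDX samples are never re-centered on their own statistics, any systematic offset between the domains survives standardization and is left for the network to separate.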

Key Innovations

  1. Domain-invariant tumor biology extraction from PDX data
  2. ConditionalPrior always-samples (prevents metabolic scaling collapse)
  3. Three critical fixes: biological utility, anchor standardization, private branch isolation
  4. Optional CDAN for multi-cancer PDX alignment
  5. Pathway-aware GRL weights (Patent 5): intrinsic=1.0, TME=0.3
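
One way the pathway-aware GRL weights might act is as a per-group multiplier on the reversed gradient. This is an assumption: the page gives only the two weight values, not how they are applied. The function, the per-dimension group labels, and the gradient values below are hypothetical:

```python
def reverse_gradient(grad, groups, lam, weights):
    """Backward pass of a weighted gradient reversal layer.

    grad:    gradient of the domain loss w.r.t. each latent dimension
    groups:  group label per dimension ("intrinsic" or "TME")
    lam:     current GRL lambda
    weights: per-group multiplier applied on top of lambda
    """
    return [-lam * weights[g] * dx for dx, g in zip(grad, groups)]


GRL_WEIGHTS = {"intrinsic": 1.0, "TME": 0.3}  # values stated in the list

grad = [0.5, -1.0, 2.0]
groups = ["intrinsic", "TME", "intrinsic"]  # hypothetical dimension labels
reversed_grad = reverse_gradient(grad, groups, lam=2.0, weights=GRL_WEIGHTS)
```

In this sketch the TME dimensions receive 30% of the adversarial pressure applied to the intrinsic ones, so domain confusion is enforced more strongly on intrinsic signal.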

Hyperparameters

Shared Encoder LR: 1e-3
Discriminator LR: 2e-3
GRL Lambda Ramp: 0→2.0 over 30 epochs
Batch Size: 64 (balanced domain sampling)
Warmup: 10 epochs (discriminator only)
Main Training: 200 epochs
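
The GRL lambda ramp can be sketched as a simple schedule function. The linear shape is an assumption, since the page gives only the endpoints and the duration; a sigmoid-shaped ramp is another common choice. The function name is hypothetical:

```python
def grl_lambda(epoch, max_lambda=2.0, ramp_epochs=30):
    """Linear ramp from 0 to max_lambda, then constant."""
    progress = min(max(epoch / ramp_epochs, 0.0), 1.0)
    return max_lambda * progress


schedule = [grl_lambda(e) for e in (0, 15, 30, 200)]
```

Starting at zero lets the shared encoder settle before adversarial pressure is applied. After epoch 30 the value stays at 2.0 for the rest of training.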

Training Details

Trained on 128 PDX RNA-seq samples (prostate, GSE184427) plus 9,415 TCGA samples. Phase 1: discriminator warmup (10 epochs). Phase 2: main training (200 epochs) with the GRL lambda ramp, separate optimizers, and cosine learning-rate annealing. All three claims were validated: domain confusion, biological utility, and imputation quality.
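
The two-phase schedule can be outlined as below. This is a sketch under several assumptions: the annealing spans exactly the 200 main epochs, it decays to a minimum learning rate of 0, and the base rate is the shared encoder's 1e-3. Function names are hypothetical and no model code is shown:

```python
import math


def cosine_lr(epoch, total_epochs=200, base_lr=1e-3, min_lr=0.0):
    """Closed-form cosine annealing from base_lr down to min_lr."""
    cos_term = 1.0 + math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * cos_term


def training_plan(warmup_epochs=10, main_epochs=200):
    """Yield (phase, epoch_in_phase) pairs for the two-phase schedule."""
    for epoch in range(warmup_epochs):
        yield "warmup", epoch      # discriminator only
    for epoch in range(main_epochs):
        yield "main", epoch        # encoders and discriminator, GRL active


plan = list(training_plan())
lr_start, lr_mid, lr_end = cosine_lr(0), cosine_lr(100), cosine_lr(200)
```

The plan covers 210 epochs in total. Under these assumptions the learning rate starts at 1e-3, passes through about 5e-4 at the midpoint of main training, and reaches 0 at the end.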

Pipeline Position

vae → Domain Separation Network → hypernet