V1 (Static) · v1.0

CausalDriver-GAT

Identifies which mutations are actually driving the cancer

Architecture: GATv2-based Graph Neural Network

AUROC: 0.9334
Area under ROC curve for driver classification (v5.10 retrained)
Target: > 0.90

AUPRC: 0.9902
Area under precision-recall curve (v5.10 retrained)
Target: > 0.90

Top-10 Recall: 0.847
Known drivers in top 10 predictions
Target: > 0.80

Context Utilization: High
Attention weights show z_ctx influence
Target: Qualitative
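
As a rough illustration of how a Top-10 Recall figure could be computed, the sketch below scores one sample by the fraction of its known drivers that land among the k highest-ranked genes. This is only one reading of "known drivers in top 10 predictions"; the exact definition is not stated here, and the function name, gene names and scores are hypothetical.

```python
def top_k_recall(scores, known_drivers, k=10):
    """Fraction of known drivers found among the k highest-scoring genes.

    scores: dict mapping gene name -> predicted driver probability.
    known_drivers: set of gene names labelled as drivers for this sample.
    """
    if not known_drivers:
        raise ValueError("known_drivers must not be empty")
    # Rank genes by descending score; ties are broken by gene name for determinism.
    ranked = sorted(scores, key=lambda g: (-scores[g], g))
    top_k = set(ranked[:k])
    return len(top_k & known_drivers) / len(known_drivers)


# Toy example with made-up scores (not real model output).
toy_scores = {"TP53": 0.97, "KRAS": 0.91, "TTN": 0.12, "MUC16": 0.08, "PIK3CA": 0.40}
toy_recall = top_k_recall(toy_scores, {"TP53", "KRAS", "PIK3CA"}, k=2)
```

With k=2, two of the three toy drivers are recovered. A reported figure would presumably be averaged over many samples, which is also an assumption.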

Overview

A graph neural network that separates driver mutations, which actually cause the cancer, from passenger mutations that are merely present. It combines the biological interaction network between genes (which proteins interact with which) with the specific tumor context to rank every mutated gene by how likely it is to be a true driver. It also classifies each driver as targetable, resistance-causing, or currently undruggable.

Inputs

3 inputs
z_ctx_clean (31 dims)

Proliferation-free biological context from VAE

Source: vae
Gene Features (4 per gene)

Mutation status, expression, CNV, centrality

Source: TCGA + PPI graph
Gene Graph (~3,000 genes)

COSMIC+Reactome nodes, SIGNOR+STRING edges (conf>0.9)

Source: COSMIC/SIGNOR/STRING
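
The edge rule for the gene graph (keep interactions with confidence above 0.9) can be sketched as a simple filter. The tuple layout, the function name and the example edges are assumptions for illustration; how SIGNOR and STRING records are stored, and how their scores are put on a common scale, is not described here.

```python
def filter_edges(edges, node_set, min_conf=0.9):
    """Keep edges strictly above min_conf whose endpoints are both graph nodes.

    edges: iterable of (source_gene, target_gene, confidence) tuples.
    node_set: set of gene names allowed as nodes.
    """
    kept = []
    for src, dst, conf in edges:
        if conf > min_conf and src in node_set and dst in node_set:
            kept.append((src, dst))
    return kept


# Hypothetical inputs: three candidate edges over a three-gene node set.
nodes = {"TP53", "MDM2", "KRAS"}
candidate_edges = [
    ("MDM2", "TP53", 0.99),  # kept
    ("KRAS", "TP53", 0.62),  # dropped: confidence too low
    ("KRAS", "BRAF", 0.95),  # dropped: BRAF is not in the node set
]
edge_list = filter_edges(candidate_edges, nodes)
```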

Outputs

3 outputs
driver_prob[N]

Per-gene driver probability scores

Consumers: txresponse
top_k_embedding[50, 64]

Embeddings of top 50 predicted drivers

Consumers: txresponse
actionability[N, 3]

Targetable, resistance, undruggable classification
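
To make the three output shapes concrete, the sketch below selects the top-k genes by driver_prob, gathers their embedding rows, and maps one actionability row to a label. It assumes that top_k_embedding is taken from per-node embeddings and that the three actionability columns follow the order listed above; neither is confirmed here, and the helper names are hypothetical.

```python
ACTIONABILITY_LABELS = ("targetable", "resistance", "undruggable")  # column order assumed


def select_top_k(driver_prob, node_embeddings, k=50):
    """Return (indices, embeddings) of the k genes with the highest driver_prob."""
    order = sorted(range(len(driver_prob)), key=lambda i: -driver_prob[i])
    top = order[:k]
    return top, [node_embeddings[i] for i in top]


def actionability_label(scores_row):
    """Map one row of the [N, 3] actionability output to its highest-scoring label."""
    best = max(range(len(scores_row)), key=lambda j: scores_row[j])
    return ACTIONABILITY_LABELS[best]


# Toy data: N = 4 genes, 2-dimensional embeddings instead of 64, k = 2 instead of 50.
driver_prob = [0.10, 0.85, 0.40, 0.92]
node_embeddings = [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]]
top_idx, top_emb = select_top_k(driver_prob, node_embeddings, k=2)
label = actionability_label([0.2, 0.7, 0.1])
```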

Mathematical Formulation

GATv2 Attention

Dynamic attention coefficients
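
For reference, the standard GATv2 attention (Brody et al.) scores each edge after the nonlinearity, which is what makes the coefficients dynamic. The symbols below are assumptions: h_i is the feature vector of gene i, W_l and W_r are learned projections, a is a learned scoring vector, and N(i) is the neighborhood of i. Whether CausalDriver-GAT uses exactly this form is not confirmed here.

```latex
e_{ij} = \mathbf{a}^{\top}\,\mathrm{LeakyReLU}\!\left(\mathbf{W}_l \mathbf{h}_i + \mathbf{W}_r \mathbf{h}_j\right),
\qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})}
```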

FiLM Conditioning

Context modulation of node features
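
FiLM (feature-wise linear modulation) in its usual form scales and shifts each feature channel with values computed from a conditioning vector. A sketch under that assumption, with z_ctx_clean as the conditioning input and gamma, beta as learned maps (hypothetical names) from the 31-dimensional context to the node feature width:

```latex
\tilde{\mathbf{h}}_i = \gamma(\mathbf{z}_{\mathrm{ctx}}) \odot \mathbf{h}_i + \beta(\mathbf{z}_{\mathrm{ctx}})
```

Because gamma and beta depend only on the tumor context, the same modulation is applied to every gene in a given sample. At which layers the model applies it is not stated here.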

Message Passing

Neighborhood aggregation
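
In an attention-based layer, neighborhood aggregation is typically a weighted sum of projected neighbor features. A sketch with assumed symbols: alpha_ij is the attention coefficient on the edge from gene j to gene i, W_r is a learned projection, sigma is a nonlinearity, and l indexes the layer.

```latex
\mathbf{h}_i^{(l+1)} = \sigma\!\left(\sum_{j \in \mathcal{N}(i)} \alpha_{ij}\, \mathbf{W}_r\, \mathbf{h}_j^{(l)}\right)
```

With multiple heads, one such sum is computed per head and the results are concatenated or averaged; which of the two this model uses is not stated here.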

Key Features

  • GATv2 attention for dynamic edge weighting
  • Context injection via FiLM conditioning on z_ctx_clean
  • Multi-head attention (4 heads) with residual connections
  • Hierarchical message passing (3 layers)
  • Actionability classification head
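
The features listed above can be tied together in a minimal, dependency-free sketch: FiLM modulation of node features followed by one single-head GATv2 update with a residual connection. This is a toy under stated assumptions, not the model's implementation. The function names, applying FiLM before the layer, the residual form and all numbers are hypothetical, and a real model would more likely rely on a graph learning library and use 4 heads.

```python
import math


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def leaky_relu(x, slope=0.2):
    return [v if v > 0 else slope * v for v in x]


def film(h, gamma, beta):
    """FiLM: scale and shift each feature with context-derived gamma and beta."""
    return [g * v + b for g, v, b in zip(gamma, h, beta)]


def gatv2_layer(h, neighbors, W_l, W_r, a):
    """Single-head GATv2 update; neighbors[i] lists the source nodes j for target i."""
    out = []
    for i, nbrs in enumerate(neighbors):
        left = matvec(W_l, h[i])
        msgs = [matvec(W_r, h[j]) for j in nbrs]
        # GATv2 scores the edge after the nonlinearity: a^T LeakyReLU(W_l h_i + W_r h_j).
        scores = [
            sum(ak * v for ak, v in zip(a, leaky_relu([p + q for p, q in zip(left, msg)])))
            for msg in msgs
        ]
        # Softmax over the neighborhood, shifted by the max for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Attention-weighted sum of projected neighbor features.
        agg = [sum(al * msg[d] for al, msg in zip(alphas, msgs)) for d in range(len(a))]
        # Residual connection (requires equal input and output width).
        out.append([x + y for x, y in zip(h[i], agg)])
    return out


# Toy graph: 3 genes, 2 features each, self-loops included so no neighborhood is empty.
h0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = [[0, 1], [1, 0, 2], [2, 1]]
identity = [[1.0, 0.0], [0.0, 1.0]]
gamma, beta = [1.5, 0.5], [0.0, 0.1]  # stand-ins for a learned map of z_ctx_clean
h_ctx = [film(row, gamma, beta) for row in h0]
h1 = gatv2_layer(h_ctx, neighbors, identity, identity, a=[1.0, -1.0])
```

Setting a to all zeros makes every score equal, so the layer reduces to a plain neighborhood mean plus the residual, which is a convenient sanity check.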

Key Innovations

  1. Patient-specific driver ranking using tumor context
  2. Integration of biological context with graph structure
  3. Actionability-aware predictions for clinical utility
  4. Robust to graph noise via attention-based aggregation

Hyperparameters

Hidden Dim: 64
Attention Heads: 4
GNN Layers: 2
Dropout: 0.1
Learning Rate: 5e-4
Top-k Selection: 50
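
One way to carry these values through code is a small immutable config object. The class and field names are hypothetical; only the values come from the list above.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CausalDriverConfig:
    """Hyperparameters as listed above; class and field names are hypothetical."""

    hidden_dim: int = 64
    attention_heads: int = 4
    gnn_layers: int = 2
    dropout: float = 0.1
    learning_rate: float = 5e-4
    top_k: int = 50

    def __post_init__(self):
        # Basic range check so a typo in an override fails early.
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError("dropout must be in [0, 1)")


config = CausalDriverConfig()
# Width per head if head outputs are concatenated back to hidden_dim (an assumption).
per_head_dim = config.hidden_dim // config.attention_heads
```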

Training Details

Trained on 4,479 IntOGen entries (~568 drivers) and retrained on v5.10 latents. Accuracy is 0.9591, with the best result at epoch 82 of 100. Node features: [mutation, CNV, expression, essentiality] + z_ctx (31-d).

Pipeline Position