
V1 (Static), v1.0

CausalDriver-GAT

Identifies which mutations are actually driving the cancer

Architecture

GATv2-based Graph Neural Network

AUROC

0.9334

Area under ROC curve for driver classification (v5.10 retrained)

Target: > 0.90
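
AUROC can be read as the probability that a randomly chosen true driver is scored above a randomly chosen passenger. The sketch below computes it directly from that pairwise definition. The function name `auroc` and the toy labels and scores are illustrative assumptions, not the pipeline's evaluation code.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count every (positive, negative) pair that is ranked correctly.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy data (hypothetical): 1 = known driver, 0 = passenger
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
value = auroc(labels, scores)
```

The pairwise form is quadratic in the number of genes, so it suits small checks like this one; a rank-based implementation would be the usual choice at scale.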

AUPRC

0.9902

Area under precision-recall curve (v5.10 retrained)

Target: > 0.90

Top-10 Recall

0.847

Known drivers in top 10 predictions

Target: > 0.80
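
The page does not spell out how Top-10 Recall is computed. One plausible reading, sketched below, is the fraction of known drivers among the scored genes that land in the ten highest-ranked predictions. The function name `top_k_recall`, the choice of denominator, and the toy scores are assumptions made for illustration.

```python
def top_k_recall(scores, known_drivers, k=10):
    """Fraction of known drivers (restricted to scored genes) ranked in the top k."""
    relevant = {g for g in known_drivers if g in scores}
    if not relevant:
        return 0.0
    ranked = sorted(scores, key=scores.get, reverse=True)
    top_k = set(ranked[:k])
    return len(relevant & top_k) / len(relevant)

# Toy scores (hypothetical values, not model output)
scores = {"TP53": 0.97, "KRAS": 0.91, "TTN": 0.40, "PIK3CA": 0.85, "MUC16": 0.30}
known = {"TP53", "KRAS", "PIK3CA", "EGFR"}  # EGFR is not among the scored genes
recall = top_k_recall(scores, known, k=2)
```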

Context Utilization

High

Attention weights show z_ctx influence

Target: Qualitative

Overview

A graph neural network that identifies which mutations are actually driving the cancer versus which are passengers along for the ride. It uses the biological interaction network between genes — which proteins talk to which — combined with the specific tumor context to rank every mutated gene by how likely it is to be a true driver. It also classifies each driver as targetable, resistance-causing, or currently undruggable.

Inputs

3 inputs

z_ctx_clean (31d)

Proliferation-free biological context from VAE

Source: vae

Gene Features (4 per gene)

Mutation status, expression, CNV, centrality

Source: TCGA + PPI graph

Gene Graph (~3,000 genes)

COSMIC+Reactome nodes, SIGNOR+STRING edges (conf>0.9)

Source: COSMIC/SIGNOR/STRING
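
To make the graph construction concrete, the sketch below filters a toy edge list by the stated confidence threshold (conf > 0.9) and keeps only edges whose endpoints are both graph nodes. The function name `build_adjacency`, the tuple layout of the edges, and the choice to treat edges as undirected are assumptions; the page does not say how edge direction is handled.

```python
from collections import defaultdict

def build_adjacency(edges, nodes, min_conf=0.9):
    """Keep edges with confidence strictly above min_conf between known nodes."""
    adj = defaultdict(set)
    for src, dst, conf in edges:
        if conf > min_conf and src in nodes and dst in nodes:
            adj[src].add(dst)
            adj[dst].add(src)  # assumption: interactions treated as undirected
    return dict(adj)

# Toy inputs (hypothetical confidences; GENE_X stands in for a gene outside the node set)
nodes = {"TP53", "MDM2", "KRAS", "BRAF", "TTN"}
edges = [
    ("TP53", "MDM2", 0.99),
    ("KRAS", "BRAF", 0.95),
    ("TP53", "TTN", 0.40),     # dropped: below the confidence threshold
    ("KRAS", "GENE_X", 0.97),  # dropped: endpoint not in the node set
]
adj = build_adjacency(edges, nodes)
```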

Outputs

3 outputs

driver_prob[N]

Per-gene driver probability scores

Consumers: txresponse

top_k_embedding[50, 64]

Embeddings of top 50 predicted drivers

Consumers: txresponse

actionability[N, 3]

Targetable, resistance, undruggable classification
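
A downstream consumer might derive the top-k embedding and a per-gene actionability label from these outputs roughly as follows. This is a sketch only: the helper names, the column order of the three actionability classes, and the use of a plain argmax are assumptions, not documented behavior.

```python
ACTIONABILITY_CLASSES = ("targetable", "resistance", "undruggable")  # assumed column order

def select_top_k(driver_prob, embeddings, k=50):
    """Return indices and embeddings of the k genes with the highest driver probability."""
    order = sorted(range(len(driver_prob)), key=lambda i: driver_prob[i], reverse=True)[:k]
    return order, [embeddings[i] for i in order]

def actionability_label(row):
    """Map one row of actionability[N, 3] to the class with the largest score."""
    best = max(range(len(row)), key=lambda c: row[c])
    return ACTIONABILITY_CLASSES[best]

# Toy values (hypothetical); real shapes would be [N] and [N, 64]
driver_prob = [0.10, 0.90, 0.50]
embeddings = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]]
top_idx, top_emb = select_top_k(driver_prob, embeddings, k=2)
label = actionability_label([0.2, 0.7, 0.1])
```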

Mathematical Formulation

GATv2 Attention

Dynamic attention coefficients
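
The equation itself is not present in the text of this page. For reference, the standard GATv2 formulation scores each edge and normalizes over the neighborhood as shown below; whether this model modifies it (for example with edge features) is not stated, so read this as the textbook form rather than the exact implementation. Here $\mathbf{h}_i$ is the feature vector of gene $i$, $\mathcal{N}(i)$ its neighbors, and $\mathbf{W}$, $\mathbf{a}$ are learned parameters.

```latex
e_{ij} = \mathbf{a}^{\top}\,\mathrm{LeakyReLU}\!\left(\mathbf{W}\,[\mathbf{h}_i \,\|\, \mathbf{h}_j]\right),
\qquad
\alpha_{ij} = \frac{\exp(e_{ij})}{\sum_{k \in \mathcal{N}(i)} \exp(e_{ik})}
```

Because the nonlinearity is applied before the dot product with $\mathbf{a}$, the ranking of neighbors can depend on the query node $i$, which is what "dynamic" refers to.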

FiLM Conditioning

Context modulation of node features
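
FiLM (feature-wise linear modulation) scales and shifts each feature channel using parameters predicted from a conditioning vector, here the tumor context. The minimal sketch below uses single linear maps to produce the scale `gamma` and shift `beta`; the helper names, the use of one linear layer per parameter, and the toy weights are assumptions rather than the model's actual layers.

```python
def linear(W, b, x):
    """Affine map W x + b, with W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def film(h, z_ctx, W_gamma, b_gamma, W_beta, b_beta):
    """Modulate node features h as gamma(z_ctx) * h + beta(z_ctx), element-wise."""
    gamma = linear(W_gamma, b_gamma, z_ctx)
    beta = linear(W_beta, b_beta, z_ctx)
    return [g * hi + b for g, hi, b in zip(gamma, h, beta)]

# Toy example: 2-d context and 2-d node features (the real model uses 31-d context)
z_ctx = [1.0, 2.0]
h = [3.0, 4.0]
W_gamma, b_gamma = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]   # gamma = z_ctx
W_beta, b_beta = [[0.0, 0.0], [0.0, 0.0]], [0.5, -0.5]    # beta = constant shift
h_mod = film(h, z_ctx, W_gamma, b_gamma, W_beta, b_beta)
```

The same `gamma` and `beta` apply to every gene in a given tumor, so identical graph structure can yield different node representations for different patients.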

Message Passing

Neighborhood aggregation
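
Neighborhood aggregation combines the attention coefficients with transformed neighbor features, along the lines of h_i' = sum over j of alpha_ij * W_msg h_j. The single-head sketch below follows the standard GATv2 scoring. It omits the output nonlinearity, multi-head concatenation, and residual connections, and the separate message matrix `W_msg` and all toy weights are assumptions.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def gatv2_aggregate(i, neighbors, h, W, a, W_msg):
    """Score the neighbors of node i, softmax the scores, return (alpha, weighted sum)."""
    scores = []
    for j in neighbors:
        hidden = [leaky_relu(v) for v in matvec(W, h[i] + h[j])]  # W [h_i || h_j]
        scores.append(sum(ak * v for ak, v in zip(a, hidden)))
    m = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    alpha = [e / total for e in exps]
    msgs = [matvec(W_msg, h[j]) for j in neighbors]
    out = [sum(al * msg[d] for al, msg in zip(alpha, msgs)) for d in range(len(msgs[0]))]
    return alpha, out

# Toy graph (hypothetical): node 0 with neighbors 1 and 2, 2-d features
h = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [4.0, 0.0]}
W = [[0.5, -0.1, 0.3, 0.2], [0.1, 0.4, -0.2, 0.6]]
a = [0.7, -0.3]
W_msg = [[1.0, 0.0], [0.0, 1.0]]
alpha, h_new = gatv2_aggregate(0, [1, 2], h, W, a, W_msg)
```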

Key Features

GATv2 attention for dynamic edge weighting
Context injection via FiLM conditioning on z_ctx_clean
Matches against 633 IntOGen + 95 COSMIC known cancer drivers
Multi-head attention (4 heads) with residual connections
Hierarchical message passing (3 layers)
Actionability classification head

Key Innovations

1. Patient-specific driver ranking using tumor context
2. Pathway-context-aware driver activation assessment per patient
3. Integration of biological context with graph structure
4. Actionability-aware predictions for clinical utility
5. Robust to graph noise via attention-based aggregation

Hyperparameters

Hidden Dim

64

Attention Heads

4

GNN Layers

2

Dropout

0.1

Learning Rate

5e-4

Top-k Selection

50
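
For readers wiring these values into code, they could be collected in a small config object such as the one below. The class and field names are hypothetical; only the values come from the table above. The per-head width assumes that head outputs are concatenated to form the hidden dimension, which the page does not confirm.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CausalDriverGATConfig:
    """Hyperparameter values from the table; field names are illustrative."""
    hidden_dim: int = 64
    attention_heads: int = 4
    gnn_layers: int = 2
    dropout: float = 0.1
    learning_rate: float = 5e-4
    top_k: int = 50

config = CausalDriverGATConfig()
# Assumption: heads are concatenated, so each head produces hidden_dim / heads channels.
per_head_dim = config.hidden_dim // config.attention_heads
```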

Training Details

Trained on 4,479 IntOGen entries (~568 drivers). Retrained on v5.10 latents. Accuracy: 0.9591, with the best checkpoint at epoch 82 of 100. Node features: [mutation, CNV, expression, essentiality] + z_ctx (31d).
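
If the "+" in the node feature description is read as concatenation, each gene's 4 features are joined with the shared 31-d context vector to give a 35-d input per node. The sketch below illustrates that reading. The function name and toy values are assumptions, and the actual model may inject context differently (for example through FiLM alone).

```python
def build_node_features(gene_features, z_ctx):
    """Append the same tumor-level context vector to every gene's feature vector."""
    return [list(f) + list(z_ctx) for f in gene_features]

# Toy values (hypothetical): two genes, each with [mutation, CNV, expression, essentiality]
gene_features = [[1.0, 0.3, -0.2, 0.8], [0.0, 1.2, 0.5, 0.1]]
z_ctx = [0.0] * 31  # placeholder for the 31-d context vector
x = build_node_features(gene_features, z_ctx)
```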

Pipeline Position

Multi-modal VAE → CausalDriver-GAT → TxResponse