From Reports to Ontologies: Ontology-Guided Representation Learning for 12-Lead ECG

Abstract

The 12-lead electrocardiogram (ECG) is a quasi-periodic, multi-channel signal with diagnostic content spanning timescales from millisecond waveform morphology to multi-second rhythm dynamics. Existing ECG representation learning relies on signal-only self-supervision or ECG-text multimodal alignment, neither of which exploits the structured diagnostic codes attached to every clinical recording. We present MAR-ECG, an ontology-guided masked autoregressive framework that supervises the encoder with a curated 40-node SNOMED-CT cardiac graph through graph alignment, eliminating the need for paired clinical reports. MAR-ECG combines two complementary objectives. First, graph-smoothed contrastive learning (GSCL) anchors the encoder's rhythm-pooled features to the SNOMED graph, softening supervision targets by ontology distance so that clinically related concepts reinforce one another rather than function as hard negatives. Second, multi-scale physiological supervision complements GSCL with signal-derived patch auxiliaries that target rhythm-physiology statistics extracted automatically from the input, extending supervision beyond the patch tier at no annotation cost. Pretrained on 40K publicly available 12-lead ECGs with SNOMED-CT codes and evaluated by frozen linear probing on five downstream classification benchmarks, MAR-ECG consistently outperforms a strong masked-autoregressive baseline, with mean gains in the low-label regime. Despite the absence of paired clinical text, MAR-ECG achieves performance competitive with state-of-the-art multimodal ECG-text methods.
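The graph-smoothing idea behind GSCL can be illustrated with a minimal sketch. The abstract does not give the loss, so everything below is an assumption made for illustration: the 4-node toy graph (standing in for the 40-node SNOMED-CT cardiac graph), the use of hop count as ontology distance, the softmax form of the soft targets, the one-code-per-recording simplification, and the names `hop_distances`, `smoothed_targets`, and `gscl_loss`. It is not the authors' implementation.

```python
import numpy as np
from collections import deque

def hop_distances(adj):
    """All-pairs shortest-path hop counts via BFS on an unweighted graph."""
    n = len(adj)
    dist = np.full((n, n), np.inf)
    for s in range(n):
        dist[s, s] = 0.0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if np.isinf(dist[s, v]):
                    dist[s, v] = dist[s, u] + 1.0
                    queue.append(v)
    return dist

def smoothed_targets(dist, tau=1.0):
    """Row-wise softmax of -distance/tau (assumed form): concepts closer
    in the graph receive more target mass; unreachable ones receive zero."""
    w = np.exp(-dist / tau)
    return w / w.sum(axis=1, keepdims=True)

def gscl_loss(features, concept_emb, labels, targets, temperature=0.1):
    """Cross-entropy between soft ontology targets and the softmax over
    cosine similarities of ECG features to concept embeddings."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = concept_emb / np.linalg.norm(concept_emb, axis=1, keepdims=True)
    logits = f @ c.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-(targets[labels] * log_prob).sum(axis=1).mean())

# Hypothetical toy ontology: a simple chain 0 - 1 - 2 - 3.
adj = [[1], [0, 2], [1, 3], [2]]
dist = hop_distances(adj)
targets = smoothed_targets(dist, tau=1.0)

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16))     # stand-in for rhythm-pooled features
concept_emb = rng.normal(size=(4, 16))  # stand-in for concept embeddings
labels = rng.integers(0, 4, size=8)     # one code per recording (simplification)
loss = gscl_loss(features, concept_emb, labels, targets)
```

With a one-hot target, every concept other than the label acts as an equally hard negative. Under the assumed softmax smoothing, a concept one hop from the label keeps part of the target mass, so pulling the feature toward it is penalized less than pulling it toward a distant concept.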
