A Baseline Analysis of Cross-Modal Liver Tumor Segmentation and the Role of Frozen Encoder
Abstract
Accurate liver tumor segmentation is critical for surgical planning, volumetry, and treatment
monitoring across diverse liver pathologies and clinical applications. Both computed
tomography (CT) and magnetic resonance imaging (MRI) serve as essential modalities
in clinical practice, yet segmentation models trained on one modality typically fail when
applied to the other. This limitation points to a gap in understanding: it remains unclear
why complex architectural innovations are necessary for cross-modal robustness. Most of
the literature proposes sophisticated solutions without first establishing what simpler
methods achieve or where they encounter irreducible obstacles.
This thesis provides a systematic baseline analysis of cross-modal liver and tumor
segmentation, deliberately adopting simple approaches to characterize their capabilities and
limitations. The study employs a frozen ResNet18 encoder combined with a U-Net decoder
within a two-stage pipeline that first segments the liver, then detects lesions within the
hepatic region. This straightforward architecture is evaluated across five public datasets
spanning both modalities: LiverHCCSeg and CHAOS for MRI, and LiTS, 3D-IRCADb-01,
and SLiver07 for CT.
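
The two-stage idea described above (segment the liver first, then look for lesions only inside the hepatic region) can be illustrated with a minimal NumPy sketch. It is not the thesis implementation: the function name `two_stage_lesion_mask`, the 0.5 thresholds, and the choice to restrict stage 2 by masking (rather than, say, cropping) are all assumptions made for illustration, and the probability maps are toy values standing in for network outputs.

```python
import numpy as np

def two_stage_lesion_mask(liver_prob, lesion_prob, liver_thr=0.5, lesion_thr=0.5):
    """Combine stage-1 liver and stage-2 lesion probability maps.

    Hypothetical helper: thresholds and the masking strategy are assumptions,
    not values taken from the thesis.
    """
    liver_prob = np.asarray(liver_prob, dtype=float)
    lesion_prob = np.asarray(lesion_prob, dtype=float)
    if liver_prob.shape != lesion_prob.shape:
        raise ValueError("probability maps must have the same shape")

    # Stage 1: binarize the liver prediction.
    liver_mask = liver_prob >= liver_thr
    # Stage 2: keep lesion predictions only where stage 1 found liver.
    lesion_mask = (lesion_prob >= lesion_thr) & liver_mask
    return liver_mask, lesion_mask

# Toy 2x3 "slice": the right-hand column lies outside the liver.
liver_prob = np.array([[0.9, 0.8, 0.1],
                       [0.7, 0.6, 0.2]])
lesion_prob = np.array([[0.1, 0.9, 0.95],
                        [0.2, 0.7, 0.3]])

liver_mask, lesion_mask = two_stage_lesion_mask(liver_prob, lesion_prob)
print(lesion_mask)
```

In this toy slice the high lesion score in the top-right pixel is discarded because that pixel is outside the predicted liver, which is the kind of false positive a cascade of this sort is meant to suppress.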
