Show simple item record

dc.contributor.author	Rahman, Redwan
dc.contributor.author	Nahar, Kazi Toushia
dc.date.accessioned	2026-04-09T07:16:45Z
dc.date.available	2026-04-09T07:16:45Z
dc.date.issued	2025-12
dc.identifier.uri	https://ar.iub.edu.bd/handle/11348/1077
dc.description.abstract	Many clinical processes, including liver-transplant scheduling, volumetry, and the management of chronic diseases such as cirrhosis and fibrosis, rely on accurate segmentation of the liver. Inaccurate segmentation in these settings, for example mistracked disease progression, poses significant risks to patient safety. Although Deep Learning (DL) has shown great promise in automating this task, domain shift, in which models trained on Magnetic Resonance Imaging (MRI) fail to generalize to Computed Tomography (CT), remains a serious limitation to the clinical utility of such systems in healthcare facilities. While state-of-the-art Unsupervised Domain Adaptation (UDA) methods are introduced regularly, little is known about the inherent cross-modality robustness of simple encoder architectures. To compare adaptability, a baseline was established using four popular encoder backbones in a standard U-Net architecture: ResNet18, EfficientNetB3, DenseNet121, and MobileNetV2. Models were trained on the LiverHccSeg dataset (arterial-phase T1-weighted MRI) to evaluate model-specific performance differences, then tested on three external sets representing varying levels of domain shift: CHAOS (MRI), and 3D-IRCADb-01 and SLIVER07 (CT). This systematic analysis indicates that in-domain accuracy does not predict cross-modality stability. DenseNet121 achieved the best in-domain performance (Dice Similarity Coefficient [DSC] = 0.928) but generalized poorly. Conversely, EfficientNetB3 was the most robust design, delivering the best performance on both the external MRI protocol (CHAOS, DSC 0.852) and cross-modality CT data (3D-IRCADb-01, DSC 0.825).
This suggests that compound-scaling principles improve the learning of cross-modality structural representations essential for handling diverse liver diseases. ResNet18, in contrast, showed the weakest cross-modality transfer (DSC 0.769-0.788 on CT), indicating a susceptibility to overfitting intensity patterns specific to the training modality. Furthermore, the lightweight MobileNetV2 showed comparable generalization (DSC ≈ 0.81 on CT), demonstrating that strong performance and computational efficiency are compatible. The main limitations of this study are the use of 2D slice processing, which precludes full 3D volumetric context, and the comparatively small training cohort (n = 17). To further reduce the patient risks associated with domain shift, future research should focus on validating these results on larger, multi-center datasets and on investigating 3D architectures. Ultimately, these findings suggest that choosing an appropriate encoder is an important first step in developing cross-modal segmentation systems reliable enough for widespread clinical application.	en_US
dc.language.iso	en	en_US
dc.publisher	IUB	en_US
dc.title	A Cross-Modal Benchmark of Deep Learning Encoders for Automated Liver Segmentation	en_US
dc.type	Thesis	en_US
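The Dice Similarity Coefficient (DSC) reported throughout the abstract measures overlap between a predicted and a ground-truth segmentation mask: DSC = 2|A ∩ B| / (|A| + |B|), where 1.0 is perfect agreement. The following is a minimal illustrative sketch of the metric on flat binary masks; it is not code from the thesis, and the function name and example values are hypothetical.

```python
def dice_coefficient(pred, truth):
    """Dice Similarity Coefficient between two binary masks.

    DSC = 2|A ∩ B| / (|A| + |B|); returns 1.0 for a perfect overlap.
    `pred` and `truth` are equal-length flat sequences of 0/1 values
    (e.g. a flattened 2D liver mask).
    """
    intersection = sum(p & t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Both masks empty: conventionally treated as perfect agreement.
    return 2.0 * intersection / total if total else 1.0

# Hypothetical example: two of three foreground pixels in each mask agree.
pred = [1, 1, 1, 0, 0]
truth = [1, 1, 0, 1, 0]
print(dice_coefficient(pred, truth))  # 2*2 / (3+3) = 0.666...
```

A reported DSC of 0.928 (DenseNet121, in-domain) versus 0.769 (ResNet18 on CT) thus corresponds directly to the fraction of overlapping liver voxels between prediction and ground truth.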



Copyright © 2002-2021  IUB Academic Repository.
Maintained by  Library Information Technology (LIT)