|
|
Beyond accuracy metrics: out-of-distribution detection for determining reliability
of segmentation models in medical image segmentation for CT
Aneesh Rangnekar, Harini Veeraraghavan
Under preparation.
This work introduces a machine learning approach for detecting out-of-distribution (OOD) cases using hierarchical transformer (Swin) features, with a focus on 3D lung cancer CT scans. We benchmark three leading self-supervised learning (SSL) methods (SimMIM, iBOT, and SMIT), each pretrained and then fine-tuned on identical datasets for consistency, to demonstrate the feasibility of our approach. Standard confidence-based OOD methods (e.g., softmax scores) and conventional radiomics often fail under real-world distribution shifts and concept drift in medical images. Our feature-level OOD classifier, a random forest trained with outlier exposure, outperforms these traditional approaches on diverse OOD cases such as pulmonary embolism, COVID-19, and scans of unrelated organs. This work is aimed at safer deployment of segmentation models in diverse and unpredictable clinical environments for diagnostic radiology.
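As an illustration, the feature-level recipe reduces to training a binary classifier on pooled encoder features with outlier exposure. This is a minimal sketch, not the paper's exact pipeline; the feature dimension, pooling, and synthetic splits below are assumptions:

```python
# Minimal sketch of feature-level OOD detection with outlier exposure.
# Synthetic vectors stand in for globally pooled Swin encoder features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

feats_in_dist = rng.normal(0.0, 1.0, size=(200, 768))  # in-distribution lung CTs
feats_outlier = rng.normal(0.5, 1.5, size=(200, 768))  # outlier-exposure scans

X = np.concatenate([feats_in_dist, feats_outlier])
y = np.concatenate([np.zeros(200), np.ones(200)])       # 0 = in-dist, 1 = OOD

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# At deployment, score a new scan's pooled features; a high probability flags OOD.
new_feat = rng.normal(0.5, 1.5, size=(1, 768))
print("P(OOD) =", clf.predict_proba(new_feat)[0, 1])
```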
|
|
|
Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis
Jue Jiang, Aneesh Rangnekar, Chloe Choi, Harini Veeraraghavan
arXiv
SMART is a self-supervised framework designed for medical image analysis using Swin Transformers, which traditionally lack the [CLS] token used in standard masked image modeling. We address this drawback by introducing a semantic attention module that performs global attention and identifies informative regions for masking during pretraining. To improve training stability and generalization, we incorporate a noise-regularized momentum teacher in a co-distillation setup. This approach yields a stronger foundation model, enabling more effective downstream tasks such as tumor and organ segmentation and zero-shot localization via attention maps with the [CLS] token.
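A minimal sketch of the noise-regularized momentum teacher mechanism, assuming a generic EMA update and input-noise placement; SMART's exact noise design and schedules are not reproduced here:

```python
# Hedged sketch: an EMA (momentum) teacher whose input is perturbed with
# noise so its distillation targets do not collapse onto the student's view.
import copy
import torch
import torch.nn as nn

student = nn.Sequential(nn.Linear(64, 64), nn.GELU(), nn.Linear(64, 64))
teacher = copy.deepcopy(student)
for p in teacher.parameters():
    p.requires_grad_(False)

@torch.no_grad()
def ema_update(teacher, student, momentum=0.996):
    # teacher <- momentum * teacher + (1 - momentum) * student
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s, alpha=1.0 - momentum)

x = torch.randn(8, 64)
teacher_targets = teacher(x + 0.1 * torch.randn_like(x))  # noise-regularized teacher
loss = (student(x) - teacher_targets.detach()).pow(2).mean()
loss.backward()
ema_update(teacher, student)
```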
|
|
|
Pretrained hybrid transformer for generalizable cardiac substructures segmentation from contrast and non-contrast CTs in lung and breast cancers
Aneesh Rangnekar, Nikhil Mankuzhy, Jonas Willmann, Chloe Choi, Abraham Wu, Maria Thor, Andreas Rimner, Harini Veeraraghavan
arXiv
We fine-tuned a pretrained transformer with a convolutional network decoder for segmenting cardiac substructures from both contrast and non-contrast CT scans, with an emphasis on reducing the need for extensive annotated data. The transformer encoder was pretrained on a large in-the-wild CT dataset and then adapted with fine-tuning for radiotherapy planning in lung cancer patients, with zero-shot application to breast cancer patients. Extensive experiments demonstrated strong generalization across imaging modalities, clinical sites, and patient positioning.
|
|
|
brat: Aligned Multi-View Embeddings for Brain MRI Analysis
Maxime Kayser, Maksim Gridnev, Wanting Wang, Max Bain, Aneesh Rangnekar, Avijit Chatterjee, Aleksandr Petrov, Harini Veeraraghavan, Nathaniel C. Swinburne
Paper
We developed brat (Brain Report Alignment Transformer), a new AI framework designed to align brain MRI scans with clinical radiology reports. We curated one of the largest datasets of its kind for this purpose: over 75,000 3D MRIs with paired radiologist reports. brat learned rich, multi-view representations of complex brain anatomy using self-supervised pretraining. We proposed a novel pairwise view alignment mechanism and a diversity-promoting loss based on Determinantal Point Processes. brat significantly outperforms previous methods on tasks like image-text retrieval, tumor segmentation, and Alzheimer's classification, and also enables high-quality automatic report generation from MRIs using language models.
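A hedged sketch of a determinant-based diversity penalty in the spirit of the DPP loss, assuming a generic Gram-kernel form rather than brat's exact objective:

```python
# Illustrative DPP-style diversity loss: mutually similar view embeddings
# shrink the log-determinant of their Gram kernel, so minimizing its
# negation pushes the views apart.
import torch
import torch.nn.functional as F

def dpp_diversity_loss(embeddings, eps=1e-4):
    z = F.normalize(embeddings, dim=-1)        # (n_views, d), unit norm
    K = z @ z.T                                 # Gram (similarity) kernel
    K = K + eps * torch.eye(K.size(0))          # jitter for numerical stability
    return -torch.logdet(K)                     # maximal when views are orthogonal

views = torch.randn(4, 256, requires_grad=True)  # e.g., 4 view embeddings
loss = dpp_diversity_loss(views)
loss.backward()
```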
|
|
|
Improving ovarian cancer segmentation accuracy with transformers through AI-guided labeling
Aneesh Rangnekar, Kevin M. Boehm, Emily A. Aherne, Ines Nikolovski, Natalie Gangai, Ying Liu, Dimitry Zamarin, Kara Roche, Sohrab Shah, Yulia Lakhman, Harini Veeraraghavan
arXiv
We formulated an AI-guided labeling pipeline using a multi-resolution residual 2D network trained on partially segmented CTs (adnexal tumors and omental implants) to assist radiologists in refining annotations. The enhanced dataset was used to fine-tune two transformer architectures, SMIT and Swin UNETR, which were evaluated across 71 multi-institutional 3D CT scans. Training with AI-refined labels yielded statistically significant improvements across all metrics for both models. Our approach emphasized efficient dataset curation, reducing the annotation workload for radiologists while improving model performance and radiomics reproducibility in ovarian cancer.
|
|
|
Quantifying uncertainty in lung cancer segmentation with foundation models applied to mixed-domain datasets
Aneesh Rangnekar, Nishant Nadkarni, Jue Jiang, Harini Veeraraghavan
SPIE Medical Imaging, 2025
arXiv
We investigated applications of foundation models to lung tumor segmentation across mixed-domain CT datasets, encompassing diverse acquisition protocols and institutions, as a stepping stone towards task generalization. Our study evaluated segmentation performance and uncertainty estimation using Monte Carlo dropout, deep ensembles, and test-time augmentation. We demonstrated that fast, entropy-based metrics and volumetric occupancy can effectively track model performance under domain shift, offering practical first-pass tools to evaluate segmentation trustworthiness in mixed-domain clinical settings.
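A minimal sketch of the entropy-based tracking described above, assuming generic tensor shapes and a 0.5 foreground threshold: average the softmax outputs over stochastic passes (MC dropout or test-time augmentation), then compute voxelwise predictive entropy and the predicted volumetric occupancy:

```python
# Sketch: predictive entropy and volumetric occupancy from stochastic passes.
import torch

def predictive_entropy(prob_samples):
    # prob_samples: (n_samples, C, D, H, W) softmax outputs per stochastic pass
    p = prob_samples.mean(dim=0)                        # mean prediction
    return -(p * torch.log(p.clamp_min(1e-8))).sum(0)   # (D, H, W) entropy map

samples = torch.softmax(torch.randn(10, 2, 32, 64, 64), dim=1)
entropy = predictive_entropy(samples)
occupancy = (samples.mean(0)[1] > 0.5).float().sum()    # predicted tumor voxels
print(entropy.mean().item(), occupancy.item())
```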
|
|
|
Self-supervised learning improves robustness of deep learning lung tumor segmentation models to CT imaging differences
Jue Jiang, Aneesh Rangnekar, Harini Veeraraghavan
Medical Physics, 2025
arXiv
We investigated the benefits of large-scale in-the-wild self-supervised pretraining on uncurated CT scans to improve robustness in tumor segmentation, with a focus on lung tumors. When fine-tuned on smaller curated NSCLC datasets, Swin transformer models pretrained on this diverse unlabeled data consistently outperformed both self-pretrained Swin and Vision Transformer counterparts across varied CT acquisition protocols. Masked image prediction proved more effective than contrastive learning at capturing local anatomical structure, enhancing accuracy and feature reuse. Our study directly addresses whether self-supervision on noisy, heterogeneous CT data improves generalization to real-world distribution shifts, a critical gap left underaddressed by prior research.
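A bare-bones sketch of the masked image prediction objective compared here; the generic transformer, 60% masking ratio, and token-regression loss are illustrative assumptions, not the paper's exact setup:

```python
# Sketch: mask a fraction of patch tokens and regress the missing content,
# computing the loss only on masked positions.
import torch
import torch.nn as nn

patches = torch.randn(8, 196, 768)                   # (batch, tokens, dim)
mask = torch.rand(8, 196) < 0.6                      # mask ~60% of tokens

encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=768, nhead=8, batch_first=True),
    num_layers=2,
)
mask_token = nn.Parameter(torch.zeros(768))          # learnable [MASK] embedding

inp = torch.where(mask.unsqueeze(-1), mask_token.expand_as(patches), patches)
recon = encoder(inp)
loss = (recon[mask] - patches[mask]).pow(2).mean()   # loss only on masked tokens
loss.backward()
```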
|
|
|
Co-distilled attention guided masked image modeling with noisy teacher for self-supervised learning on medical images
Jue Jiang, Aneesh Rangnekar, Harini Veeraraghavan
Medical Imaging with Deep Learning, 2025
Paper
We developed DAGMaN, a novel self-supervised learning framework that combined attention-guided masked image modeling and noisy teacher co-distillation for medical imaging. We integrated global semantic attention into Swin transformers via vision transformer-style multi-head self-attention blocks, and improved feature learning and attention diversity through noise injection in the teacher's pipeline. Our approach outperformed prior methods on multiple medical tasks, including lung nodule classification, tumor segmentation, immunotherapy response prediction, and organ clustering. It also achieved high accuracy in few-shot settings, while enabling better interpretability via attention map visualization, a feature previously unavailable for Swin transformers.
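A small sketch of the attention-guided masking idea, assuming a CLS-style attention source and a 60% masking ratio (both illustrative, not DAGMaN's exact settings):

```python
# Sketch: rank patches by attention from a global query and mask the most
# attended (most informative) ones for masked image modeling.
import torch

attn = torch.softmax(torch.randn(8, 196), dim=-1)   # CLS-to-patch attention (batch, tokens)
n_mask = int(0.6 * attn.size(1))
masked_idx = attn.topk(n_mask, dim=-1).indices       # most attended patches per image

mask = torch.zeros_like(attn, dtype=torch.bool)
mask.scatter_(1, masked_idx, torch.ones_like(masked_idx, dtype=torch.bool))
print("masked tokens per image:", mask.sum(1).tolist())
```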
|
|
|
Semantic Segmentation with Active Semi-Supervised Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
WACV, 2023
arXiv / code
We developed S4AL, a hybrid approach that combined semi-supervised and active learning for data-efficient semantic segmentation. It used a teacher-student pseudo-labeling framework to generate region-level acquisition scores, enabling querying and annotation of only the most informative regions rather than full images. We introduced two regularization techniques, confidence weighting and balanced ClassMix, that helped mitigate class imbalance and enhance the quality of the acquisition metric. S4AL achieved over 95% of full-dataset performance using less than 17% of pixel annotations on the CamVid and Cityscapes datasets.
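A sketch of region-level acquisition scoring in this spirit, assuming a fixed region size and mean-entropy ranking (illustrative, not S4AL's exact scoring rule):

```python
# Sketch: tile the teacher's softmax map into fixed-size regions and rank
# regions by mean pixel entropy; the top regions are queried for annotation.
import torch
import torch.nn.functional as F

probs = torch.softmax(torch.randn(1, 11, 360, 480), dim=1)      # teacher output
entropy = -(probs * probs.clamp_min(1e-8).log()).sum(1)          # (1, H, W)

region = 40                                                      # region size in px
scores = F.avg_pool2d(entropy.unsqueeze(1), kernel_size=region)  # mean entropy per region
topk = scores.flatten().topk(5).indices                          # 5 most informative regions
print("query these region indices for annotation:", topk.tolist())
```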
|
|
|
Semantic Segmentation with Active Semi-Supervised Representation Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
BMVC, 2022
arXiv
We developed S4AL+, a hybrid framework that combines semi-supervised learning with active learning for semantic segmentation, aiming to reduce annotation costs. We replaced the conventional mean-teacher approach with self-training on noisy pseudo-labels and added a contrastive head for better class-level feature learning. On the CamVid and Cityscapes benchmarks, S4AL+ achieved over 95% of full-label performance using just 12–15% of labeled data, outperforming state-of-the-art approaches at the time.
|
|
|
SpecAL: Towards Active Learning for Semantic Segmentation of Hyperspectral Imagery
Aneesh Rangnekar, Emmett Ientilucci, Christopher Kanan, Matthew Hoffman
DDDAS, 2022
Paper
We proposed SpecAL, an active learning framework for semantic segmentation of hyperspectral imagery, reducing the need for extensive labeling effort. Using the AeroRIT dataset, we combined data-efficient neural network design with self-supervised learning and batch-ensemble-based uncertainty acquisition to iteratively improve performance. Our design achieved oracle-level segmentation performance using only 30% of the labeled data, demonstrating a scalable path for annotation-efficient hyperspectral analysis.
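A hedged stand-in for ensemble-based uncertainty acquisition, using plain prediction variance across members (the batch-ensemble rank-1 parameterization itself is not reproduced; member count and the variance criterion are assumptions):

```python
# Sketch: score an image by how much ensemble members disagree; high
# predictive variance marks samples worth querying for labels.
import torch

n_members, n_classes = 4, 5
logits = torch.randn(n_members, n_classes, 64, 64)   # per-member logits
probs = torch.softmax(logits, dim=1)
variance = probs.var(dim=0).mean(dim=0)              # (64, 64) disagreement map
score = variance.mean()                              # image-level acquisition score
print("acquisition score:", score.item())
```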
|
|
|
AeroRIT: A New Scene for Hyperspectral Image Analysis
Aneesh Rangnekar et al.
IEEE TGRS, 2020
arXiv / code
We introduced AeroRIT, a new aerial hyperspectral dataset designed specifically to support convolutional neural network (CNN) training for scene understanding. Unlike typical airborne hyperspectral datasets focused on vegetation or roads, AeroRIT includes buildings and cars, expanding domain diversity. We designed and benchmarked several CNN architectures on AeroRIT, thoroughly evaluating classification accuracy, spatial consistency, and generalization across scenes.
|
|