brat: Aligned Multi-View Embeddings for Brain MRI Analysis
Maxime Kayser, Maksim Gridnev, Wanting Wang, Max Bain, Aneesh Rangnekar, Avijit Chatterjee, Aleksandr Petrov, Harini Veeraraghavan, Nathaniel C. Swinburne
Winter Conference on Applications of Computer Vision (WACV), 2026
Paper
We developed brat (Brain Report Alignment Transformer), trained on 75k MRI–report pairs with a pairwise view-alignment objective and a diversity-promoting loss. Leveraging these mechanisms, brat outperforms prior methods on image-text retrieval, tumor segmentation, and Alzheimer’s classification, and also enables high-quality automatic report generation from MRIs using language models.
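As a rough illustration of what pairwise view alignment can look like, here is a minimal symmetric InfoNCE-style loss between two batches of view embeddings; the names (`view_a`, `view_b`, `temperature`) are ours, not the paper's, and brat's actual objective also includes the diversity-promoting term, which is not shown here.

```python
import torch
import torch.nn.functional as F

def alignment_loss(view_a: torch.Tensor, view_b: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE between two batches of view embeddings.

    view_a, view_b: (batch, dim) embeddings of the same studies under
    two different views; matching rows are treated as positive pairs.
    """
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature              # (batch, batch) similarities
    targets = torch.arange(a.size(0), device=a.device)
    # Cross-entropy in both directions pulls the diagonal (true pairs) together.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```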
Random forest-based out-of-distribution detection for robust lung cancer segmentation
Aneesh Rangnekar, Harini Veeraraghavan
arXiv
Accurate detection and segmentation of cancerous lesions from computed tomography (CT) scans is essential for automated treatment planning and treatment response assessment. We proposed RF-Deep, a random forest classifier that uses deep features from the pretrained transformer encoder of the segmentation model to detect out-of-distribution (OOD) scans and enhance segmentation reliability. RF-Deep achieved strong detection performance (FPR95 < 0.1% on far-OOD cases), outperforming established OOD approaches.
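The general recipe, a random forest over pooled encoder features that scores in-distribution versus OOD scans, can be sketched as follows; the feature extraction step and all names are illustrative assumptions, not the paper's code.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def fit_ood_detector(id_feats: np.ndarray, ood_feats: np.ndarray):
    """Fit a random forest to separate in-distribution (0) from OOD (1)
    feature vectors pooled from a frozen segmentation encoder."""
    X = np.concatenate([id_feats, ood_feats])
    y = np.concatenate([np.zeros(len(id_feats)), np.ones(len(ood_feats))])
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    rf.fit(X, y)
    return rf

def ood_score(rf: RandomForestClassifier, feats: np.ndarray) -> np.ndarray:
    # Probability of the OOD class; high scores flag unreliable scans.
    return rf.predict_proba(feats)[:, 1]
```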
Pretrained hybrid transformer for generalizable cardiac substructures segmentation from contrast and non-contrast CTs in lung and breast cancers
Aneesh Rangnekar, Nikhil Mankuzhy, Jonas Willmann, Chloe Choi, Abraham Wu, Maria Thor, Andreas Rimner, Harini Veeraraghavan
arXiv
We fine-tuned a pretrained transformer with a convolution decoder for cardiac substructure segmentation in contrast and non-contrast CT scans, with an emphasis on reducing the need for extensive annotated data. Pretrained on a large in-the-wild CT dataset, the model was adapted for radiotherapy planning in lung cancer patients, with zero-shot application to breast cancer patients. Extensive experiments demonstrated strong generalization across imaging modalities, clinical sites, and patient positioning.
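A minimal sketch of the architecture pattern described above, a pretrained transformer encoder feeding a light convolution decoder; the encoder module, channel count, and output shape are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SegModel(nn.Module):
    """Pretrained transformer encoder + light convolution decoder (sketch)."""

    def __init__(self, encoder: nn.Module, enc_channels: int, n_classes: int):
        super().__init__()
        self.encoder = encoder  # weights from large-scale CT pretraining
        self.decoder = nn.Sequential(
            nn.Conv3d(enc_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(64, n_classes, kernel_size=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = self.encoder(x)          # assumed (B, enc_channels, D', H', W')
        logits = self.decoder(feats)
        # Upsample back to the input resolution for voxel-wise prediction.
        return F.interpolate(logits, size=x.shape[2:],
                             mode="trilinear", align_corners=False)
```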
Self-supervised learning improves robustness of deep learning lung tumor segmentation models to CT imaging differences
Jue Jiang, Aneesh Rangnekar, Harini Veeraraghavan
Medical Physics, 2025
arXiv / Publication
We investigated the benefits of large-scale in-the-wild self-supervised pretraining on uncurated CT scans to improve robustness in tumor segmentation, with a focus on lung cancer tumors. When fine-tuned on smaller curated NSCLC datasets, Swin transformer models pretrained on this diverse unlabeled data consistently outperformed both self-pretrained Swin and Vision Transformer counterparts across varied CT acquisition protocols. Our study directly addresses whether self-supervision on noisy, heterogeneous CT data improves generalization to real-world distribution shifts, a critical gap left largely unaddressed by prior research.
Co-distilled attention guided masked image modeling with noisy teacher for self-supervised learning on medical images
Jue Jiang, Aneesh Rangnekar, Harini Veeraraghavan
Medical Imaging with Deep Learning, 2025
Paper / OpenReview
We developed DAGMaN, a novel self-supervised learning framework that combined attention-guided masked image modeling and noisy-teacher co-distillation for medical imaging. We integrated global semantic attention into Swin transformers via vision-transformer-style multi-head self-attention blocks, and improved feature learning and attention diversity through noise injection in the teacher’s pipeline. Our approach outperformed prior methods on multiple medical tasks and achieved high accuracy in few-shot settings, while enabling better interpretability via attention map visualization, a feature previously unavailable for Swin transformers.
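To make attention-guided masking concrete, here is a hypothetical helper that masks the patches a global attention map attends to most, so the student must reconstruct semantically important regions rather than random ones; the mask ratio and interface are assumptions, not DAGMaN's implementation.

```python
import torch

def attention_guided_mask(attn: torch.Tensor, mask_ratio: float = 0.6):
    """Choose patches to mask from a global attention map.

    attn: (batch, num_patches) attention of a global token over patches.
    The most-attended patches are hidden from the student.
    """
    n_mask = int(attn.size(1) * mask_ratio)
    idx = attn.topk(n_mask, dim=1).indices         # most-attended patches
    mask = torch.zeros_like(attn, dtype=torch.bool)
    mask.scatter_(1, idx, True)                    # True = patch is masked
    return mask
```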
Quantifying uncertainty in lung cancer segmentation with foundation models applied to mixed-domain datasets
Aneesh Rangnekar, Nishant Nadkarni, Jue Jiang, Harini Veeraraghavan
SPIE Medical Imaging, 2025
arXiv / Publication
We investigated applications of foundation models to lung tumor segmentation across mixed-domain CT datasets, encompassing diverse acquisition protocols and institutions, as a stepping stone towards task generalization. Our study evaluated segmentation performance and uncertainty estimation using Monte Carlo dropout, deep ensembles, and test-time augmentation. We demonstrated that fast, entropy-based metrics and volumetric occupancy can effectively track model performance under domain shift, offering practical first-pass tools for evaluating segmentation trustworthiness in mixed-domain clinical settings.
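As an example of a fast, entropy-based uncertainty metric, here is a Monte Carlo dropout sketch that averages softmax outputs over stochastic forward passes and reports predictive entropy; the sample count and interface are assumptions, not the paper's exact metric.

```python
import torch

@torch.no_grad()
def mc_dropout_entropy(model, x: torch.Tensor, n_samples: int = 20):
    """Predictive entropy from Monte Carlo dropout (sketch).

    Dropout is kept active at inference by leaving the model in train
    mode; entropy of the averaged softmax is a fast confidence proxy.
    """
    model.train()                          # keep dropout layers stochastic
    probs = torch.stack(
        [torch.softmax(model(x), dim=1) for _ in range(n_samples)]
    ).mean(0)                              # (B, C, ...) mean class probabilities
    # High entropy = diffuse prediction = low trustworthiness.
    return -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
```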
Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis
Jue Jiang, Aneesh Rangnekar, Chloe Choi, Harini Veeraraghavan
arXiv / AAPM 2024 SNAP Oral
We developed a new self-supervised framework for medical imaging with Swin Transformers, which traditionally lack the [CLS] token used with masked image modeling. We introduced a semantic attention module for global masking and a noise-regularized momentum teacher for stable co-distillation. This yielded a stronger foundation model, improving tumor/organ segmentation and enabling zero-shot localization via [CLS] global attention maps.
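A minimal sketch of a noise-regularized momentum teacher update of the kind described above; the momentum and noise scale are assumed values, not the paper's settings.

```python
import torch

@torch.no_grad()
def update_noisy_teacher(teacher, student, momentum: float = 0.996,
                         noise_std: float = 0.01):
    """EMA teacher update with Gaussian noise regularization (sketch).

    The teacher tracks an exponential moving average of the student and
    is lightly perturbed so co-distillation stays stable.
    """
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(momentum).add_(s, alpha=1.0 - momentum)  # EMA of student
        t.add_(torch.randn_like(t) * noise_std)         # noise injection
```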
Improving ovarian cancer segmentation accuracy with transformers through AI-guided labeling
Aneesh Rangnekar, Kevin M. Boehm, Emily A. Aherne, Ines Nikolovski, Natalie Gangai, Ying Liu, Dimitry Zamarin, Kara Roche, Sohrab Shah, Yulia Lakhman, Harini Veeraraghavan
arXiv
We formulated an AI-guided labeling pipeline using a multi-resolution residual 2D network trained on partially segmented CTs (adnexal tumors and omental implants) to assist radiologists in refining annotations. The enhanced dataset was used to fine-tune two transformer architectures, SMIT and Swin UNETR, which were evaluated on 71 multi-institutional 3D CT scans. Training with AI-refined labels yielded statistically significant improvements across all metrics for both models. Our approach emphasized efficient dataset curation, reducing annotation workload for radiologists while improving model performance and radiomics reproducibility in ovarian cancer.
Semantic Segmentation with Active Semi-Supervised Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
Winter Conference on Applications of Computer Vision (WACV), 2023
arXiv / Publication / Code / Poster / Presentation
We developed S4AL, a hybrid approach that combined semi-supervised and active learning for data-efficient semantic segmentation. It uses a teacher–student pseudo-labeling framework to generate region-level acquisition scores, enabling querying and annotation of only the most informative regions rather than full images. We introduced two regularization techniques, confidence weighting and balanced ClassMix, that helped mitigate class imbalance and enhance the quality of the acquisition metric. S4AL achieved over 95% of full-dataset performance using less than 17% of pixel annotations on the CamVid and CityScapes datasets.
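A toy version of region-level acquisition scoring: average the pseudo-label entropy over fixed-size cells and query the highest-scoring cells for annotation. The region size and the use of plain entropy are simplifying assumptions; S4AL's actual score benefits from the confidence weighting and balanced ClassMix regularizers.

```python
import torch
import torch.nn.functional as F

def region_scores(logits: torch.Tensor, region: int = 32) -> torch.Tensor:
    """Score image regions for annotation by mean pseudo-label entropy.

    logits: (B, C, H, W) teacher predictions. The entropy map is
    average-pooled into region-sized cells; the highest-scoring cells
    are the ones queried for human labels.
    """
    probs = torch.softmax(logits, dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(1, keepdim=True)
    return F.avg_pool2d(entropy, kernel_size=region).squeeze(1)
```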
Semantic Segmentation with Active Semi-Supervised Representation Learning
Aneesh Rangnekar, Christopher Kanan, Matthew Hoffman
British Machine Vision Conference (BMVC), 2022
arXiv / Publication / Poster / Presentation
We developed S4AL+, a hybrid framework that combines semi-supervised learning with active learning for semantic segmentation, aiming to reduce annotation costs. We replaced the conventional mean-teacher approach with self-training on noisy pseudo-labels and added a contrastive head for better class-level feature learning. On the CamVid and CityScapes benchmarks, S4AL+ achieved over 95% of full-label performance using just 12–15% of labeled data, outperforming state-of-the-art approaches at the time.
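A minimal sketch of confidence-filtered pseudo-labeling for self-training; the threshold and ignore index are assumed values, and the contrastive head is omitted.

```python
import torch

def pseudo_labels(logits: torch.Tensor, threshold: float = 0.7):
    """Confidence-filtered pseudo-labels for self-training (sketch).

    Pixels whose top class probability clears `threshold` keep their
    argmax label; all others get the ignore index 255 and are skipped
    by the segmentation loss.
    """
    probs = torch.softmax(logits, dim=1)
    conf, labels = probs.max(dim=1)
    labels[conf < threshold] = 255
    return labels
```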
Semi-Supervised Hyperspectral Object Detection Challenge Results - PBVS 2022
Aneesh Rangnekar, Zachary Mulhollan, Anthony Vodacek, Matthew Hoffman, Angel Sappa, Erik Blasch, et al.
Computer Vision and Pattern Recognition (CVPR) workshops, 2022
Publication
We curated the first true hyperspectral object detection dataset, collected from a university rooftop overlooking a four-way intersection over three days. The dataset consists of 2890 temporally contiguous frames at ~1600×192 resolution, spanning 51 spectral bands from 400–900 nm. To capture real-world variability, the training, validation, and test sets were acquired on different days under varying weather conditions. Labels are provided for both fully supervised and semi-supervised settings, supporting a challenge hosted at the CVPR Perception Beyond the Visible Spectrum workshop.
SpecAL: Towards Active Learning for Semantic Segmentation of Hyperspectral Imagery
Aneesh Rangnekar, Emmett Ientilucci, Christopher Kanan, Matthew Hoffman
Dynamic Data Driven Applications Systems (DDDAS), 2022
Paper / Publication
We proposed SpecAL, an active learning framework for semantic segmentation of hyperspectral imagery that reduces the need for extensive labeling effort. Using the AeroRIT dataset, we combined data-efficient neural network design with self-supervised learning and batch-ensemble-based uncertainty acquisition to iteratively improve performance. Our method achieved oracle-level segmentation performance using only 30% of the labeled data, demonstrating a scalable path for annotation-efficient hyperspectral analysis.
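For readers unfamiliar with batch ensembles, a BatchEnsemble-style linear layer looks roughly like this: one shared weight matrix modulated by cheap rank-1 "fast weights" per member, whose disagreement can drive uncertainty-based acquisition. Purely illustrative, not SpecAL's code.

```python
import torch
import torch.nn as nn

class BatchEnsembleLinear(nn.Module):
    """Minimal BatchEnsemble layer: a shared weight matrix modulated by
    per-member rank-1 fast weights (illustrative sketch)."""

    def __init__(self, d_in: int, d_out: int, n_members: int):
        super().__init__()
        self.shared = nn.Linear(d_in, d_out, bias=False)
        self.r = nn.Parameter(torch.ones(n_members, d_in))   # input factors
        self.s = nn.Parameter(torch.ones(n_members, d_out))  # output factors

    def forward(self, x: torch.Tensor, member: int) -> torch.Tensor:
        # Each member acts as a distinct network at near-single-model cost;
        # disagreement across members drives the acquisition function.
        return self.shared(x * self.r[member]) * self.s[member]
```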
Uncertainty estimation for semantic segmentation of hyperspectral imagery
Aneesh Rangnekar, Emmett Ientilucci, Christopher Kanan, Matthew Hoffman
Dynamic Data Driven Applications Systems (DDDAS), 2020
Paper / Publication
We extended deep learning for hyperspectral imaging on the AeroRIT dataset by evaluating network uncertainty within a Dynamic Data-Driven Applications Systems (DDDAS) framework. Using Deep Ensembles, Monte Carlo Dropout, and Batch Ensembles with a modified U-Net, we studied robust pixel-level segmentation under noisy, atmosphere-sensitive signals. Our results highlighted uncertainty estimation as key to guiding resource allocation and improving hyperspectral semantic segmentation.
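A small worked example of how ensemble predictions can be split into total, aleatoric, and epistemic uncertainty, with the epistemic part computed as mutual information; the tensor layout is an assumption for illustration.

```python
import torch

def uncertainty_decomposition(member_probs: torch.Tensor):
    """Split ensemble uncertainty into total, aleatoric, and epistemic.

    member_probs: (n_members, B, C, H, W) softmax maps from an ensemble
    (deep ensembles, MC dropout, or batch ensembles). Epistemic
    uncertainty is the mutual information: entropy of the mean
    prediction minus the mean per-member entropy.
    """
    eps = 1e-8
    mean_p = member_probs.mean(0)
    total = -(mean_p * mean_p.clamp_min(eps).log()).sum(1)
    aleatoric = -(member_probs * member_probs.clamp_min(eps).log()).sum(2).mean(0)
    return total, aleatoric, total - aleatoric   # last term: epistemic
```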
AeroRIT: A New Scene for Hyperspectral Image Analysis
Aneesh Rangnekar, Nilay Mokashi, Emmett Ientilucci, Christopher Kanan, Matthew Hoffman
Transactions on Geoscience and Remote Sensing (TGRS), 2020
arXiv / Publication / Code
We introduced AeroRIT, a new aerial hyperspectral dataset designed specifically to support convolutional neural network (CNN) training for scene understanding. Unlike typical airborne hyperspectral datasets focused on vegetation or roads, AeroRIT includes buildings and cars, expanding domain diversity. We trained and benchmarked several CNN architectures on AeroRIT, thoroughly evaluating classification accuracy, spatial consistency, and generalization across scenes.
Tracking in aerial hyperspectral videos using deep kernelized correlation filters
Burak Uzkent, Aneesh Rangnekar, Matthew Hoffman
Transactions on Geoscience and Remote Sensing (TGRS), 2018
arXiv / Publication
We developed DeepHKCF, a hyperspectral aerial vehicle tracker that combines kernelized correlation filters (KCFs) with deep CNN features, leveraging adaptive multimodal hyperspectral sensors. A single KCF-in-multiple-ROIs strategy with efficient ROI mapping addresses low temporal resolution while enabling fast feature extraction and flexibility to integrate advanced correlation filter trackers. Experiments on DIRSIG-simulated hyperspectral videos showed strong tracking performance; we also released a large-scale synthetic dataset for vehicle classification in wide-area motion imagery (WAMI).
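For intuition, a linear-kernel (MOSSE-style) correlation filter can be trained and applied in closed form in the Fourier domain, as sketched below; DeepHKCF uses a kernelized filter over deep CNN features, so this is a simplified stand-in, not the paper's formulation.

```python
import numpy as np

def train_filter(feat: np.ndarray, target: np.ndarray, lam: float = 1e-4):
    """Closed-form correlation filter in the Fourier domain.

    feat: 2D feature patch; target: desired Gaussian-shaped response
    centred on the object; lam: ridge regularizer.
    """
    F = np.fft.fft2(feat)
    Y = np.fft.fft2(target)
    return Y * np.conj(F) / (F * np.conj(F) + lam)

def detect(filt: np.ndarray, feat: np.ndarray) -> np.ndarray:
    # Correlation response over a search region; its argmax gives the
    # new object position within the ROI.
    return np.real(np.fft.ifft2(filt * np.fft.fft2(feat)))
```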
Aerial vehicle tracking by adaptive fusion of hyperspectral likelihood maps
Burak Uzkent, Aneesh Rangnekar, Matthew Hoffman
Computer Vision and Pattern Recognition (CVPR) workshops, 2017
arXiv / Publication
We developed hyperspectral likelihood maps-aided tracking (HLT), a real-time hyperspectral tracking method that learns a generative target model online without offline classifiers or heavy hyperparameter tuning. It adaptively fuses likelihood maps across visible-to-infrared bands into a distinctive representation that separates foreground from background. Experiments show HLT outperforms existing fusion methods and matches state-of-the-art hyperspectral tracking frameworks.
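A toy sketch of fusing per-band likelihood maps with data-dependent weights; HLT learns its fusion weights online from foreground/background separability, whereas the variance proxy here is only an illustrative assumption.

```python
import numpy as np

def fuse_likelihoods(maps: np.ndarray) -> np.ndarray:
    """Weighted fusion of per-band foreground likelihood maps.

    maps: (n_bands, H, W) likelihoods in [0, 1]. Bands are weighted by
    a simple variance proxy for how sharply they separate foreground
    from background, then combined into one distinctive map.
    """
    weights = maps.var(axis=(1, 2))               # discriminative bands vary more
    weights = weights / (weights.sum() + 1e-8)
    return np.tensordot(weights, maps, axes=1)    # (H, W) fused likelihood
```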