LoRA-Tuned Segment Anything Model for Few-Shot Polyp Segmentation in Colonoscopy Images
DOI:
https://doi.org/10.64149/J.Carcinog.24.3.372-386
Keywords:
Medical image segmentation; few-shot learning; Segment Anything Model (SAM); LoRA adapters; polyp segmentation; vision foundation models
Abstract
Colorectal cancer is a leading cause of cancer mortality, and automated polyp segmentation in colonoscopy images is vital for early detection. However, deep segmentation models typically require large annotated datasets, which are scarce in medical imaging. We explore whether vision foundation models such as the Segment Anything Model (SAM) can be adapted for accurate polyp segmentation with minimal labeled samples. We leverage SAM’s pre-trained ViT-H encoder, injecting lightweight LoRA adapters and fine-tuning on Kvasir-SEG (1,000 polyp images) with only 5–50 labeled examples. For comparison, we also evaluate zero-shot MedSAM (a SAM variant fine-tuned on 1.57M medical images) and a classic CNN baseline (UNet++). With only 2–5% of the labels, our SAM-LoRA approach achieves a Dice score of 0.88–0.90, approaching the 0.91 Dice of a fully supervised UNet++ trained on 100% of the data, and it significantly outperforms both UNet++ trained with 20 labels (0.73 Dice) and zero-shot SAM variants. Adapter-tuned SAM thus retains strong segmentation capability with minimal annotated images, offering a compelling remedy for label scarcity. We discuss failure modes on tiny polyps and the trade-off between SAM’s higher computational cost and its superior data efficiency. These findings highlight the promise of foundation models for few-shot medical image segmentation.
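As a rough illustration of the adapter-tuning setup described in the abstract, the sketch below attaches LoRA adapters to SAM’s ViT-H image encoder using the open-source `segment-anything` and `peft` libraries. The target module name (`qkv`), rank, dropout, and checkpoint filename are illustrative assumptions, not the exact configuration reported in the paper.

```python
# Minimal sketch (assumed hyperparameters): freeze SAM's ViT-H encoder and
# inject LoRA adapters into its fused qkv projections, so only the low-rank
# adapter weights are trained on the few labeled Kvasir-SEG examples.
import torch
from segment_anything import sam_model_registry
from peft import LoraConfig, get_peft_model

# Load the pre-trained SAM ViT-H backbone (checkpoint path is a placeholder).
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")

# Freeze all original SAM weights; only the LoRA parameters will be updated.
for p in sam.parameters():
    p.requires_grad = False

# Wrap the image encoder so its qkv linear layers receive low-rank adapters.
lora_cfg = LoraConfig(r=8, lora_alpha=16, target_modules=["qkv"], lora_dropout=0.1)
sam.image_encoder = get_peft_model(sam.image_encoder, lora_cfg)

# Report how small the trainable footprint is relative to the full model.
trainable = sum(p.numel() for p in sam.parameters() if p.requires_grad)
total = sum(p.numel() for p in sam.parameters())
print(f"Trainable params: {trainable} / {total} ({100 * trainable / total:.2f}%)")
```

In this kind of setup, the frozen encoder preserves SAM’s general-purpose features while the small adapter matrices absorb the polyp-specific adaptation, which is what makes training feasible with only 5–50 labeled images.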




