Validation of LLM-Based Data Curation for Oncology Clinical Trials: A Review encompassing comparison with Manual Abstraction

Authors

  • Karnaditya Rana, Shashidar Reddy Abbidi, Debashree Sinha, Shalmali Joshi

DOI:

https://doi.org/10.66838/J.Carcinog.25.1.422-430

Abstract

The rapid growth of clinical data in oncology research has made it difficult to extract and manage the data needed for clinical trials. Manual curation is considered the gold standard, but its use is limited by time and cost and by variability between abstractors judging the same material. Large Language Models (LLMs) have emerged as automated tools for extracting data from unstructured clinical documents, including electronic health records, pathology reports, and clinical narratives. This review assesses how well LLMs perform data curation in oncology clinical trials by comparing their results with those of manual curation. It synthesizes current evidence on accuracy, efficiency, scalability, reliability, and cost-effectiveness, and examines validation frameworks, ethical considerations, and implementation challenges. Existing studies demonstrate that LLM-based approaches achieve high concordance with manual curation while dramatically reducing processing time and resource utilization. Remaining limitations include data heterogeneity, difficulties in interpreting data, bias, and regulatory hurdles. Hybrid human-AI models appear to be the most viable approach for near-term clinical integration. The review highlights major advances, identifies gaps in the existing research, and offers recommendations for using LLM-based systems in oncology clinical trial operations.

Published

2026-05-19

How to Cite

Validation of LLM-Based Data Curation for Oncology Clinical Trials: A Review encompassing comparison with Manual Abstraction. (2026). Journal of Carcinogenesis, 25(1), 422-430. https://doi.org/10.66838/J.Carcinog.25.1.422-430