The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026

Presentation Information

[1Yin-B-41] Prediction of Colorectal Cancer Treatment Outcomes through Time Series Analysis of Electronic Health Records Using Large Language Models

〇Ami Sakane¹, Chisato Kamiya¹, Kawaguchi Kenshi¹, Yuuki Hashimoto², Hiromasa Horiguchi², Yoshinobu Kano¹ (1. Shizuoka University, 2. National Hospital Organization)

Keywords:

Electronic Medical Record, Cancer, Language Models

We address the prediction of chemotherapy treatment outcomes (progressive disease, stable disease, and partial response) at the next imaging examination, using electronic medical record data from 7,257 colorectal cancer patients (including laboratory values, chemotherapy records, and baseline characteristics). To handle missing values in the time-series data, we introduce TimeLLM, a time-series imputation method that leverages large language models (LLMs), and construct 12 imputed dataset variants by combining four LLMs with two prompt configurations. We compare XGBoost-based supervised classification with GPT-OSS-based zero-shot classification, using both single-day and multi-day inputs. XGBoost achieved a maximum accuracy of 77.44% in binary classification (PD/non-PD) and 57.83% in three-class classification. For LLM classification, multi-day inputs that incorporate past clinical history improved Macro-F1 over single-day inputs by 12.0 points (three-class) and 7.8 points (binary), reaching performance comparable to XGBoost. The added temporal information also reduced overprediction of progressive disease (PD). Furthermore, a weighted-average ensemble of three models (XGBoost, the LLM, and a label-persistence baseline) improved binary Macro-F1 by 2.23 to 3.23 points over XGBoost alone. Feature contribution analysis showed clinically appropriate results.
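As a rough illustration of the weighted-average ensemble and the Macro-F1 metric mentioned in the abstract, the following minimal Python sketch may help. It is not the authors' implementation: the function names, the example weights (0.5, 0.3, 0.2), and the representation of the label-persistence baseline as a one-hot distribution over the previous examination's label are all assumptions made for illustration.

```python
from typing import List, Sequence

# Binary task labels as described in the abstract (PD vs. non-PD).
LABELS = ["PD", "non-PD"]


def persistence_proba(previous_label: str,
                      labels: Sequence[str] = LABELS) -> List[float]:
    """Assumed form of the label-persistence baseline: a one-hot
    distribution that repeats the previous examination's label."""
    return [1.0 if lab == previous_label else 0.0 for lab in labels]


def weighted_average(probas: Sequence[Sequence[float]],
                     weights: Sequence[float]) -> List[float]:
    """Weighted mean of per-model class probability vectors."""
    total = sum(weights)
    n_classes = len(probas[0])
    return [
        sum(w * p[k] for w, p in zip(weights, probas)) / total
        for k in range(n_classes)
    ]


def ensemble_predict(xgb_proba: Sequence[float],
                     llm_proba: Sequence[float],
                     previous_label: str,
                     weights: Sequence[float] = (0.5, 0.3, 0.2),  # hypothetical weights
                     labels: Sequence[str] = LABELS) -> str:
    """Combine the three models and return the highest-scoring label."""
    avg = weighted_average(
        [xgb_proba, llm_proba, persistence_proba(previous_label, labels)],
        weights,
    )
    best = max(range(len(labels)), key=lambda k: avg[k])
    return labels[best]


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str],
             labels: Sequence[str] = LABELS) -> float:
    """Unweighted mean of per-class F1 scores, on a 0-1 scale."""
    scores = []
    for lab in labels:
        tp = sum(t == lab and p == lab for t, p in zip(y_true, y_pred))
        fp = sum(t != lab and p == lab for t, p in zip(y_true, y_pred))
        fn = sum(t == lab and p != lab for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In this sketch `macro_f1` returns a value between 0 and 1, whereas the abstract reports differences in points on a 0 to 100 scale. In practice the ensemble weights would presumably be tuned on held-out data rather than fixed by hand.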
