The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026

Presentation Information

10:15 AM - 10:30 AM JST (1:15 AM - 1:30 AM UTC)

[2L1-GS-10t-06] Towards End-to-End CWE Autofix: A SARIF-Driven LLM Pipeline for Repository-level Security Repair

〇Shigeki Nagaya¹ (1. Neural Group Inc.)

[[online]]

Keywords:

Static Application Security Testing, LLM-as-a-Judge, Common Weakness Enumeration, AI Security, OWASP Top 10 for LLM

Recent advances in large language models (LLMs) enable automated detection and repair of security vulnerabilities. However, existing academic evaluations focus on isolated code snippets, while the systems that operate at repository scale are proprietary industrial products. The resulting lack of an open, reproducible end-to-end Autofix architecture is a gap in the literature.
This paper reports on a SARIF-driven LLM Autofix pipeline that performs repository-scale vulnerability analysis, automated repair, and pull-request generation. The system uses LLM-based security review to produce structured SARIF reports, applies fixes directly to source files, and generates traceable pull requests. It also integrates "LLM-as-a-Judge" validation to assess detection and repair quality using structured context and before-after code comparisons.
Our experience confirms the feasibility of fully LLM-driven Autofix workflows and highlights both their benefits and their limitations; the limitations include correlated model errors and the difficulty of keeping the judge independent of the models it evaluates. We discuss implications for reproducible security Autofix research and outline directions for future systematic evaluation.
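
As an illustration of the structured SARIF reports that the pipeline passes from the review step to the repair step, the following is a minimal sketch, not the authors' implementation. It wraps one finding in a SARIF 2.1.0 log and then reads back the file, line, and rule identifier that a repair step would need. The function names (`make_sarif`, `repair_targets`), the tool name, the finding fields, and the example path and CWE are all hypothetical; only the SARIF property names follow the standard.

```python
import json


def make_sarif(findings, tool_name="llm-security-review"):
    """Wrap a list of finding dicts in a minimal SARIF 2.1.0 log.

    Each finding is assumed (for this sketch) to carry the keys
    "cwe", "message", "path", and "line".
    """
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["cwe"],           # e.g. "CWE-89"
            "level": "error",
            "message": {"text": finding["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": finding["path"]},
                    "region": {"startLine": finding["line"]},
                }
            }],
        })
    return {
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": tool_name}},
            "results": results,
        }],
    }


def repair_targets(sarif_log):
    """Return (path, start_line, rule_id) for every result in the log."""
    targets = []
    for run in sarif_log["runs"]:
        for result in run["results"]:
            location = result["locations"][0]["physicalLocation"]
            targets.append((
                location["artifactLocation"]["uri"],
                location["region"]["startLine"],
                result["ruleId"],
            ))
    return targets


# Hypothetical finding, used only to exercise the two functions.
example_finding = {
    "cwe": "CWE-89",
    "message": "SQL query built by string concatenation.",
    "path": "src/db.py",
    "line": 42,
}

sarif_log = make_sarif([example_finding])
print(json.dumps(sarif_log, indent=2))
print(repair_targets(sarif_log))
```

Keeping the report in SARIF rather than free text means that each repair, and the pull request generated from it, can be traced back to a specific rule and source location.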
