Presentation Information

[2L1-GS-10t-06] Towards End-to-End CWE Autofix: A SARIF-Driven LLM Pipeline for Repository-level Security Repair

〇Shigeki Nagaya1 (1. Neural Group Inc.)
[[online]]

Keywords:

Static Application Security Testing, LLM-as-a-Judge, Common Weakness Enumeration, AI Security, OWASP Top 10 for LLM

Recent advances in large language models (LLMs) enable automated security vulnerability detection and repair. However, existing academic evaluation focuses on isolated code snippets, while the systems that operate at repository scale are proprietary industrial products. As a result, the literature lacks an open, reproducible end-to-end Autofix architecture.
This paper reports on a SARIF-driven LLM Autofix pipeline that performs repository-scale vulnerability analysis, automated repair, and pull-request generation. The system uses LLM-based security review to produce structured SARIF reports, applies fixes directly to source files, and generates traceable pull requests. It also integrates "LLM-as-a-Judge" validation to assess detection and repair quality using structured context and before-after code comparisons.
Our experience confirms the feasibility of fully LLM-driven Autofix workflows and highlights both their benefits and their limitations; the limitations include correlated model errors and the difficulty of keeping the judge independent of the model that produced the fix. We discuss implications for reproducible security Autofix research and outline directions for future systematic evaluation.
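
As an illustration of the SARIF-driven flow described above, the following is a minimal sketch and not the system presented in the talk. It assumes findings arrive as plain dictionaries and wraps them in a minimal SARIF 2.1.0 log, then applies a replacement to the line region a result points at. The names `make_sarif`, `apply_fix`, and the tool name `hypothetical-llm-reviewer` are hypothetical; only the SARIF property names (`runs`, `tool.driver.name`, `results`, `ruleId`, `locations`, `physicalLocation`, `region`) come from the SARIF format itself. Pull-request generation and judge validation are not covered.

```python
import json


def make_sarif(findings, tool_name="hypothetical-llm-reviewer"):
    """Wrap findings (plain dicts, an assumed input shape) in a minimal SARIF 2.1.0 log."""
    results = []
    for f in findings:
        results.append({
            "ruleId": f["cwe"],
            "level": "error",
            "message": {"text": f["message"]},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": f["path"]},
                    "region": {
                        "startLine": f["start_line"],
                        "endLine": f["end_line"],
                    },
                }
            }],
        })
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


def apply_fix(source, result, replacement):
    """Replace the 1-based, inclusive line region of a SARIF result with new text."""
    region = result["locations"][0]["physicalLocation"]["region"]
    start = region["startLine"]
    end = region.get("endLine", start)
    lines = source.splitlines()
    lines[start - 1:end] = replacement.splitlines()
    return "\n".join(lines) + "\n"


# Hypothetical example: a string-concatenated SQL query flagged as CWE-89.
source = (
    'query = "SELECT * FROM users WHERE id = " + user_id\n'
    "cursor.execute(query)\n"
)
log = make_sarif([{
    "cwe": "CWE-89",
    "message": "SQL query built by string concatenation.",
    "path": "app/db.py",
    "start_line": 1,
    "end_line": 2,
}])
patched = 'cursor.execute("SELECT * FROM users WHERE id = %s", (user_id,))'
fixed = apply_fix(source, log["runs"][0]["results"][0], patched)

print(json.dumps(log, indent=2))
print(fixed)
```

Keeping the SARIF result as the only link between detection and repair means the same record can later be attached to a pull request or handed to a judge model together with the before and after code, which is one plausible way to obtain the traceability the abstract mentions.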
