The 40th Annual Conference of the Japanese Society for Artificial Intelligence, 2026

Presentation Information

10:15 AM - 10:30 AM JST (1:15 AM - 1:30 AM UTC)

[2E1-GS-5b-06] Experimental Protocol Design for Desktop AI Agents: Auditability via Grounded Perception-Decision-Action Evidence Chains and Error Injection

〇Yuya Sasaki¹,³, Akifumi Ito², Satoshi Kurihara² (1. ITOCHU Techno-Solutions Corporation, 2. Keio University, 3. Keio AI Center)

Keywords:

AI Agent, Proactive Support, Human-AI cooperation

Desktop AI agents that operate GUIs across browser, mail, documents, and files are rapidly emerging, but evaluation remains fragile because of model nondeterminism, environment drift, and poor instrumentation. We propose a protocol template for controlled studies that combines (i) task cards with risk tags and executable or semi-executable success checks, (ii) an environment manifest with reset procedures, and (iii) an auditability-oriented evidence chain that links grounded observation traces to decisions and UI actions through explicit IDs. We define tiered traceability metrics (T1-T4) and a comparable error-injection procedure with detection definitions and stop rules, enabling computable detection-latency analysis. As reusable artifacts, we release a checklist, JSON templates, an event schema, and reference scripts for validation and metric computation, providing scaffolding for reproducible comparisons of desktop agent designs.
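
As a rough illustration of the evidence-chain idea and of detection-latency computation, the Python sketch below links observation, decision, and action events through explicit IDs and measures the time from an injected error to its first detection. It is not the released event schema or reference script: the `Event` fields, the event kinds, and the helpers `broken_links` and `detection_latency` are hypothetical names assumed here for illustration, and the tier definitions T1-T4 are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Event:
    # Hypothetical record layout, assumed for this sketch only.
    event_id: str
    kind: str                        # "observation", "decision", "action", "injection", "detection"
    timestamp: float                 # seconds since the start of the run
    parent_id: Optional[str] = None  # ID of the event this one is grounded in


# Assumed linking rule: a decision cites an observation, an action cites a decision.
EXPECTED_PARENT = {"decision": "observation", "action": "decision"}


def broken_links(events: List[Event]) -> List[str]:
    """Return IDs of decisions/actions whose parent is missing or of the wrong kind."""
    by_id = {e.event_id: e for e in events}
    broken = []
    for e in events:
        wanted = EXPECTED_PARENT.get(e.kind)
        if wanted is None:
            continue  # observations, injections, detections need no parent here
        parent = by_id.get(e.parent_id)
        if parent is None or parent.kind != wanted:
            broken.append(e.event_id)
    return broken


def detection_latency(events: List[Event]) -> Optional[float]:
    """Seconds from the first injection to the first detection at or after it."""
    injected = next((e.timestamp for e in events if e.kind == "injection"), None)
    if injected is None:
        return None
    later = [e.timestamp for e in events
             if e.kind == "detection" and e.timestamp >= injected]
    return min(later) - injected if later else None


# Made-up example log, not data from the study.
log = [
    Event("o1", "observation", 0.0),
    Event("d1", "decision", 0.5, parent_id="o1"),
    Event("a1", "action", 1.0, parent_id="d1"),
    Event("i1", "injection", 2.0),
    Event("a2", "action", 2.5, parent_id="d9"),  # cites a decision that was never logged
    Event("x1", "detection", 4.5, parent_id="i1"),
]

print(broken_links(log))       # IDs with a broken evidence chain
print(detection_latency(log))  # seconds between injection and detection
```

In this toy log the action `a2` is flagged because its cited decision is absent, and the latency is the gap between the injection and detection timestamps. A stop rule could be layered on top by returning `None` when no detection occurs before an assumed cutoff time.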
