Presentation Information
[D-8-04] Cross-Model Comparison of SAE Feature Contributions to Output Probabilities
〇Yuta Yuzurihara¹, Tomofumi Matsuzawa¹, Kaiyu Suzuki¹ (1. TUS)
Keywords:
Mechanistic Interpretability, LLM, Sparse Autoencoders