Presentation Information

[5Yin-A-27] Corporate Feature Extraction from Annual Securities Reports Using Clustering and LLM Evaluation

〇Yuichiro Okugawa1, Toshimitsu Tanaka1, Tomomi Nagao1, Akihiro Katsuno1, Hiroaki Kawata1 (1. NTT, Inc.)

Keywords:

Extraction of Corporate Characteristics, Securities Reports, Clustering

This study proposes a framework for extracting sentences that are both distinctive and important to individual companies from the “Business Overview” section of annual securities reports. Distinctiveness is quantified with a kNN-based distance, and importance is assessed separately by a large language model (LLM). The two measures are then integrated through a multiplicative model. In a comparison with human evaluations, importance shows a statistically significant positive correlation, whereas distinctiveness shows a negative correlation. As a result, simple multiplicative integration does not align with human judgment. These findings suggest that distinctiveness and importance have fundamentally different structural properties.
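
The scoring scheme described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the embedding model, the value of k, the distance metric, or any normalization, so Euclidean distance, the mean over the k nearest neighbors, min-max scaling, and the function names `knn_distinctiveness`, `min_max`, and `combined_score` are all assumptions made for illustration. The importance values stand in for LLM ratings and are hypothetical placeholders.

```python
import numpy as np


def knn_distinctiveness(embeddings, k=3):
    """Mean Euclidean distance from each sentence to its k nearest neighbors.

    Assumption: a larger mean distance means the sentence is more distinctive.
    """
    x = np.asarray(embeddings, dtype=float)
    n = x.shape[0]
    if n < 2:
        raise ValueError("need at least two sentences")
    k = min(k, n - 1)
    # Pairwise Euclidean distances, shape (n, n).
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Exclude each sentence's zero distance to itself.
    np.fill_diagonal(dist, np.inf)
    nearest = np.sort(dist, axis=1)[:, :k]
    return nearest.mean(axis=1)


def min_max(values):
    """Scale values to [0, 1]; return all zeros if they are constant."""
    v = np.asarray(values, dtype=float)
    span = v.max() - v.min()
    if span == 0:
        return np.zeros_like(v)
    return (v - v.min()) / span


def combined_score(embeddings, importance, k=3):
    """Multiplicative integration: scaled distinctiveness times importance."""
    d = min_max(knn_distinctiveness(embeddings, k))
    return d * np.asarray(importance, dtype=float)


# Toy 2-D "embeddings": three similar sentences and one outlier.
toy_embeddings = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0]]
# Hypothetical importance ratings in [0, 1], standing in for LLM output.
toy_importance = [0.9, 0.5, 0.5, 0.2]

toy_distinctiveness = knn_distinctiveness(toy_embeddings, k=2)
toy_scores = combined_score(toy_embeddings, toy_importance, k=2)
```

In this toy input the outlier sentence receives the highest distinctiveness but the lowest importance, so the product is dominated by whichever factor is more extreme. That is one way the two measures can pull in opposite directions, which is the kind of tension the abstract reports between multiplicative integration and human judgment.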
