[3Q1-IS-2a-01] Exploring Challenges in Extracting Structured Knowledge from Financial Documents
〇Rungsiman Nararatwong1, Natthawut Kertkeidkachorn2, Ryutaro Ichise3,1 (1. National Institute of Advanced Industrial Science and Technology, 2. Japan Advanced Institute of Science and Technology, 3. Tokyo Institute of Technology)
In 2018, the U.S. Securities and Exchange Commission adopted amendments requiring the use of Inline XBRL, a structured data language that makes financial documents both human-readable and machine-readable. However, the requirement does not cover older filings, which were written by and for humans, so a large amount of information is missing from the structured data. This paper discusses the challenges in extracting facts from these documents, followed by experiments and analyses on entity-linking approaches. The results highlight the complexity of the problem and warrant future research on the topic.
