Presentation Information
[1Yin-A-52] Word2Vec in Hyperbolic Space: Word Co-occurrence and Semantic Hierarchies
〇Tatsuki Ebisawa1, Mahito Sugiyama1 (1. National Institute of Informatics)
Keywords:
Machine learning, Natural language processing, Manifold
It is well known that hyperbolic space exhibits exponential volume growth, which makes it suitable for embedding hierarchical structures into a continuous space. In this study, to map hierarchical structures in word semantics onto the structure of an embedding space, we implement Word2Vec on the Poincaré ball, a model of hyperbolic space, and compare it with an implementation in Euclidean space. The model is trained on textual data only, without any external structured information. As evaluation metrics, we use the norm of each embedding and the distance between the embedding of a hypernym and the mean vector of the embeddings of its hyponyms, which sit lower in the semantic abstraction hierarchy. These metrics are used to verify the effectiveness of the proposed method. In addition, training is conducted while varying the word frequency. The results show that words with similar frequencies tend to have smaller norms on the Poincaré ball, regardless of their level of semantic abstraction. This finding suggests that when semantic abstraction is correlated with word frequency, embeddings of more abstract words tend to be located closer to the origin.
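The two evaluation metrics described in the abstract can be illustrated with a minimal sketch. It uses the standard geodesic distance on the unit Poincaré ball, d(u, v) = arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))). The abstract does not say how the mean vector of the hyponyms is computed, so the coordinate-wise arithmetic mean used here is an assumption, as are the function names and the toy two-dimensional vectors; this is not the authors' implementation.

```python
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points inside the unit Poincaré ball."""
    sq_diff = np.sum((u - v) ** 2)
    # Conformal factors shrink toward 0 as points approach the boundary.
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_diff / max(denom, eps))

def hypernym_to_hyponym_mean_distance(hypernym, hyponyms):
    """Distance from a hypernym embedding to the mean of its hyponym embeddings.

    Assumption: the mean is the coordinate-wise arithmetic mean, which stays
    inside the ball because the ball is convex. The source does not specify
    the averaging method.
    """
    mean_vec = np.mean(hyponyms, axis=0)
    return poincare_distance(hypernym, mean_vec)

# Hypothetical toy embeddings (not results from the study).
hypernym = np.array([0.1, 0.0])                      # e.g. a more abstract word
hyponyms = np.array([[0.6, 0.2], [0.6, -0.2]])       # e.g. two more specific words

print("norm of hypernym:", np.linalg.norm(hypernym))
print("norms of hyponyms:", np.linalg.norm(hyponyms, axis=1))
print("distance to hyponym mean:",
      hypernym_to_hyponym_mean_distance(hypernym, hyponyms))
```

In this setting, a smaller norm places a word closer to the origin of the ball, which is the position the abstract associates with more abstract words when abstraction correlates with frequency.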
