Presentation Information

[1G5-OS-13c-06] Web Article Popularity Prediction Using Wikipedia Pageviews as a Proxy for Public Interest

Kento Haneda1, 〇Takaki Otake2, Kazuaki Hanawa2, Taichi Higuchi2, Manabu Yamaguchi2 (1. Tohoku University, 2. KODANSHA LTD. IT Strategy Planning Division)

Keywords:

Natural Language Processing, Wikipedia, Industrial Applications, Publishing

Article headlines are a key driver of pageviews (PV) in web media. Predicting PV from candidate headlines using machine learning can reduce editors’ workload in headline writing and support editorial decisions such as identifying themes that are likely to attract high traffic. While prior approaches rely primarily on static textual information, we focus on the fact that PV is strongly shaped by dynamic societal trends (e.g., entertainment and international affairs) and aim to improve PV prediction by incorporating external signals of public interest at the time of publication. Concretely, we use daily Wikipedia pageviews as a proxy for societal attention and augment the predictor’s input by appending the titles of the most-viewed Wikipedia pages to the candidate headline, so that the model’s representation reflects contemporaneous public interest. Experiments on real-world web media data show that the proposed method consistently improves average prediction accuracy over baselines.
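
The input augmentation described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the in-memory table `DAILY_VIEWS`, the helper names `top_titles` and `augment_headline`, the number of appended titles `k`, and the `[SEP]` separator are all assumptions made for illustration, and the pageview counts are placeholder values rather than real data.

```python
from datetime import date

# Hypothetical pageview table: {publication_date: {wikipedia_title: daily_views}}.
# Placeholder numbers only; a real pipeline would load these from a pageview dump.
DAILY_VIEWS = {
    date(2024, 1, 1): {"Page A": 900, "Page B": 1500, "Page C": 300},
}


def top_titles(daily_views, day, k):
    """Return the k most-viewed titles for `day`, highest count first."""
    views = daily_views.get(day, {})
    # Sort by descending view count; break ties alphabetically for determinism.
    ranked = sorted(views.items(), key=lambda item: (-item[1], item[0]))
    return [title for title, _ in ranked[:k]]


def augment_headline(headline, daily_views, day, k=2, sep=" [SEP] "):
    """Append the day's top-k Wikipedia titles to a candidate headline.

    The separator and k are assumed values; the abstract does not specify them.
    """
    titles = top_titles(daily_views, day, k)
    if not titles:
        # No pageview data for this date: fall back to the headline alone.
        return headline
    return headline + sep + sep.join(titles)


if __name__ == "__main__":
    print(augment_headline("Example headline", DAILY_VIEWS, date(2024, 1, 1)))
```

The augmented string would then be passed to whatever text-based PV predictor is in use, in place of the bare headline. Falling back to the unmodified headline when no pageview data exists keeps the predictor usable on dates outside the table.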
