Presentation Information

[2H1-OS-28-04]Spatial Tokenization: A Transformer-Based Approach to Predicting Pedestrian Trips from Building-Use Composition around Railway Stations

〇Shun Nakayama1,3, Takahiro Kanamori2,3, Wanglin Yan3 (1. Senshu University, 2. PASCO Corporation, 3. Keio University)

Keywords:

Spatial Tokenization, Self-Attention, Transformer, People Flow, Urban Form

Representing irregular geospatial data as input to deep learning models remains a fundamental challenge in GeoAI. This study proposes spatial tokenization, a method that converts the surroundings of any point of interest into a token sequence using concentric ring buffers. As a case study, we predict pedestrian trip counts from building-use composition around railway stations. Ring buffers at 100 m intervals partition each station's surroundings into eight distance bands, each represented by an 18-dimensional building-use feature vector serving as a token for a SpatialTransformer model. Distance-based positional encoding and multi-head self-attention learn nonlinear inter-band dependencies without pre-specified interaction terms. Applied to 507 Tokyo stations with GPS-derived pedestrian trips as the target, SpatialTransformer achieved a test R² of 0.7883, outperforming OLS (0.5781), Ridge regression (0.6084), and geographically weighted regression (0.6595). Spatial tokenization preserves the ordinal structure of distance bands while enabling data-driven discovery of inter-band relationships, offering a generalizable framework for distance-centered spatial analysis.
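The tokenization step described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: function and variable names are hypothetical, and it assumes buildings are given as planar coordinates with an integer use-category label. Each 100 m ring buffer becomes one token holding the share of each of the 18 building-use categories within that band.

```python
import numpy as np

def tokenize_station(building_xy, building_use, station_xy,
                     n_bands=8, band_width=100.0, n_uses=18):
    """Hypothetical sketch of spatial tokenization.

    Converts buildings around one station into an (n_bands, n_uses)
    token matrix: each concentric ring buffer (band_width metres wide)
    is represented by the composition of building-use categories
    observed inside it.
    """
    # Euclidean distance from each building to the station (planar CRS assumed)
    d = np.linalg.norm(building_xy - station_xy, axis=1)
    # Assign each building to a distance band: 0 for 0-100 m, 1 for 100-200 m, ...
    band = (d // band_width).astype(int)

    tokens = np.zeros((n_bands, n_uses))
    for b in range(n_bands):
        uses = building_use[band == b]
        if uses.size:
            counts = np.bincount(uses, minlength=n_uses)
            tokens[b] = counts / counts.sum()  # shares within the band sum to 1
    return tokens

# Usage with synthetic data: 200 random buildings within 800 m of a station
rng = np.random.default_rng(0)
xy = rng.uniform(-800, 800, size=(200, 2))
use = rng.integers(0, 18, size=200)
token_sequence = tokenize_station(xy, use, np.zeros(2))  # shape (8, 18)
```

The resulting 8-token sequence preserves the ordinal structure of the distance bands, so a Transformer with distance-based positional encoding can learn inter-band dependencies directly from the data.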