Awesome Spatial VLMs

Spatial Intelligence in Vision–Language Models

Official site for the Awesome Spatial VLMs project, a curated resource and evaluation toolkit for spatial intelligence in Vision–Language Models (VLMs).

Highlights
  • Structured taxonomy of spatial intelligence in VLMs
  • 20+ datasets, 50+ benchmarks, and 120+ method papers
  • Comprehensive evaluation across 37 methods
BibTeX
@article{Liu_2025,
  title={Spatial Intelligence in Vision-Language Models: A Comprehensive Survey},
  url={http://dx.doi.org/10.36227/techrxiv.176231405.57942913/v2},
  DOI={10.36227/techrxiv.176231405.57942913/v2},
  publisher={Institute of Electrical and Electronics Engineers (IEEE)},
  author={Liu, Disheng and Liang, Tuo and Hu, Zhe and Peng, Jierui and Lu, Yiren and Xu, Yi and Fu, Yun and Yin, Yu},
  year={2025},
  month=nov
}