arXiv · 2025-12-30 · High-Level Structured Planning | Low-Level Learning-Based Action Modelling / Multimodal Reasoning / Robot Reasoning | Input Modelling / 2D LLM-based Vision Language Action Models