8 papers selected today from the arXiv subscription.
⚡ HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System

⚡ SpatialEvo: Self-Evolving Spatial Intelligence via Deterministic Geometric Environments

⚡ POINTS-Seeker: Towards Training a Multimodal Agentic Search Model from Scratch

⚡ Training-Free Semantic Multi-Object Tracking with Vision-Language Models

⚡ Beyond State Consistency: Behavior Consistency in Text-Based World Models

⚡ One Token per Highly Selective Frame: Towards Extreme Compression for Long Video Understanding

⚡ Exploration and Exploitation Errors Are Measurable for Language Model Agents

Auto-generated on 2026-04-17 · Based on arXiv Daily Digest
