Today's selection: 8 papers from the arXiv subscription feed.
⚡ OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation

⚡ Waking Up Blind: Cold-Start Optimization of Supervision-Free Agentic Trajectories for Grounded Visual Perception

⚡ OneDrive: Unified Multi-Paradigm Driving with Vision-Language-Action Models
⚡ Infrastructure-Centric World Models: Bridging Temporal Depth and Spatial Breadth for Roadside Perception
⚡ Dual-Anchoring: Addressing State Drift in Vision-Language Navigation

⚡ OASIS: On-Demand Hierarchical Event Memory for Streaming Video Reasoning

⚡ The Global Neural World Model: Spatially Grounded Discrete Topologies for Action-Conditioned Planning
⚡ AutoVQA-G: Self-Improving Agentic Framework for Automated Visual Question Answering and Grounding Annotation

Auto-generated on 2026-04-21 · Based on arXiv Daily Digest