Today's selection: 8 papers filtered from the arXiv subscription feed.

⚡ OneVL One-Step Latent Reasoning and Planning with Vision-Language Explanation

⚡ Waking Up Blind Cold-Start Optimization of Supervision-Free Agentic Trajectories for Grounded Visual Perception

⚡ OneDrive Unified Multi-Paradigm Driving with Vision-Language-Action Models

⚡ Infrastructure-Centric World Models Bridging Temporal Depth and Spatial Breadth for Roadside Perception

⚡ Dual-Anchoring Addressing State Drift in Vision-Language Navigation

⚡ OASIS On-Demand Hierarchical Event Memory for Streaming Video Reasoning

⚡ The Global Neural World Model Spatially Grounded Discrete Topologies for Action-Conditioned Planning

⚡ AutoVQA-G Self-Improving Agentic Framework for Automated Visual Question Answering and Grounding Annotation

Auto-generated on 2026-04-21 · Based on arXiv Daily Digest