Today's picks: 8 papers selected from the arXiv subscription feed.

⚡ Emergent Semantic Representations in World Models through Physical Interaction without Linguistic Supervision

⚡ VisualThink-VLA: Visual Intermediate Reasoning for Effective and Low-Latency VLA Policies

⚡ minWM: A Full-Stack Open-Source Framework for Real-Time Interactive Video World Models

⚡ V2XCrafter: Learning to Generate Driving Scene Across Agents

⚡ Mitigating State Aliasing in VLA Models via Inverse Dynamics Learning

⚡ BitTP: The Lightweight Trajectory Prediction Model with BitLLM for Edge-Devices

⚡ Planning with the Views via Scene Self-Exploration

⚡ YoCausal: How Far is Video Generation from World Model? A Causality Perspective

Auto-generated on 2026-05-30 · Based on arXiv Daily Digest