Today's selection: 8 papers filtered from the arXiv subscription feed.

⚡ Frozen LLMs as Map-Aware Spatio-Temporal Reasoners for Vehicle Trajectory Prediction

⚡ Grounding Video Reasoning in Physical Signals

⚡ HiCrew: Hierarchical Reasoning for Long-Form Video Understanding via Question-Aware Multi-Agent Collaboration

⚡ Thinking Like a Botanist: Challenging Multimodal Language Models with Intent-Driven Chain-of-Inquiry

⚡ Sink-Token-Aware Pruning for Fine-Grained Video Understanding in Efficient Video LLMs

⚡ Reinforcing 3D Understanding in Point-VLMs via Geometric Reward Credit Assignment

⚡ Do MLLMs Understand Pointing? Benchmarking and Enhancing Referential Reasoning in Egocentric Vision

⚡ Encoder-Free Human Motion Understanding via Structured Motion Descriptions

Auto-generated on 2026-04-24 · Based on arXiv Daily Digest