2025-07-02 | Resolving Individual Stars in Nearby Large Galaxies with the Habitable Worlds Observatory | Adam Smercina et al. | 2507.01960v1 | 2025-07-02 | null
2025-07-02 | AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation | Sixiang Chen et al. | 2507.01961v2 | 2025-07-03 | null
2025-07-02 | Parallel-in-Time Preconditioning for Time-Dependent Variational Mean Field Games | Heidi Wolles Ljósheim et al. | 2507.01958v1 | 2025-07-02 | null
2025-07-02 | Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation | Zhuoyang Zhang et al. | 2507.01957v1 | 2025-07-02 | null
2025-07-02 | How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks | Rahul Ramachandran et al. | 2507.01955v1 | 2025-07-02 | null
2025-07-02 | FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model | Yukang Cao et al. | 2507.01953v1 | 2025-07-02 | null
2025-07-02 | Kwai Keye-VL Technical Report | Kwai Keye Team et al. | 2507.01949v1 | 2025-07-02 | null
2025-07-02 | LongAnimation: Long Animation Generation with Dynamic Global-Local Memory | Nan Chen et al. | 2507.01945v1 | 2025-07-02 | null
2025-07-02 | SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars | Xiaosheng Zhao et al. | 2507.01939v1 | 2025-07-02 | null
2025-07-02 | CI-VID: A Coherent Interleaved Text-Video Dataset | Yiming Ju et al. | 2507.01938v1 | 2025-07-02 | null
2025-07-02 | Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations | Wenhao Wang et al. | 2507.01930v2 | 2025-07-03 | null
2025-07-02 | evMLP: An Efficient Event-Driven MLP Architecture for Vision | Zhentan Zheng et al. | 2507.01927v1 | 2025-07-02 | null
2025-07-02 | IC-Custom: Diverse Image Customization via In-Context Learning | Yaowei Li et al. | 2507.01926v1 | 2025-07-02 | null
2025-07-02 | A Survey on Vision-Language-Action Models: An Action Tokenization Perspective | Yifan Zhong et al. | 2507.01925v1 | 2025-07-02 | null
2025-07-02 | Exploring a Hybrid Deep Learning Approach for Anomaly Detection in Mental Healthcare Provider Billing: Addressing Label Scarcity through Semi-Supervised Anomaly Detection | Samirah Bakker et al. | 2507.01924v1 | 2025-07-02 | null
2025-07-02 | Initial boundary value problem for a system derived from Eulerian droplet model for air particle flow | Kayyunnapara Divya Joseph et al. | 2507.01920v1 | 2025-07-02 | null
2025-07-02 | End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning | Christian Bongiorno et al. | 2507.01918v1 | 2025-07-02 | null
2025-07-02 | 3D Reconstruction and Information Fusion between Dormant and Canopy Seasons in Commercial Orchards Using Deep Learning and Fast GICP | Ranjan Sapkota et al. | 2507.01912v1 | 2025-07-02 | null
2025-07-02 | Modality-agnostic, patient-specific digital twins modeling temporally varying digestive motion | Jorge Tapias Gomez et al. | 2507.01909v2 | 2025-07-03 | null
2025-07-02 | Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning | Qingdong He et al. | 2507.01908v1 | 2025-07-02 | null
2025-07-02 | STEM Diffraction Pattern Analysis with Deep Learning Networks | Sebastian Wissel et al. | 2507.01889v1 | 2025-07-02 | null
2025-07-02 | Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification | Kunlun Xu et al. | 2507.01884v1 | 2025-07-02 | null
2025-07-02 | Future Slot Prediction for Unsupervised Object Discovery in Surgical Video | Guiqiu Liao et al. | 2507.01882v1 | 2025-07-02 | null
2025-07-02 | A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs | Niccolò McConnell et al. | 2507.01881v1 | 2025-07-02 | null
2025-07-02 | Generalized ODE reduction algorithm for bounded degree transformation | Shaoxuan Huang et al. | 2507.01878v2 | 2025-07-03 | null
2025-07-02 | MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics | Dmytro Kuzmenko et al. | 2507.01843v1 | 2025-07-02 | null
2025-07-02 | Modeling the Deterioration of Pavement Skid Resistance and Surface Texture After Preventive Maintenance | Lu Gao et al. | 2507.01842v1 | 2025-07-02 | null
2025-07-02 | MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices | Hailong Yan et al. | 2507.01838v1 | 2025-07-02 | null
2025-07-02 | Modulate and Reconstruct: Learning Hyperspectral Imaging from Misaligned Smartphone Views | Daniil Reutsky et al. | 2507.01835v1 | 2025-07-02 | null
2025-07-02 | mGRADE: Minimal Recurrent Gating Meets Delay Convolutions for Lightweight Sequence Modeling | Tristan Torchet et al. | 2507.01829v1 | 2025-07-02 | null