Vision Transformer

| Publish Date | Title | Authors | PDF | Last Updated | Code |
| --- | --- | --- | --- | --- | --- |
| 2025-07-02 | Resolving Individual Stars in Nearby Large Galaxies with the Habitable Worlds Observatory | Adam Smercina et al. | 2507.01960v1 | 2025-07-02 | null |
| 2025-07-02 | AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation | Sixiang Chen et al. | 2507.01961v2 | 2025-07-03 | null |
| 2025-07-02 | Parallel-in-Time Preconditioning for Time-Dependent Variational Mean Field Games | Heidi Wolles Ljósheim et al. | 2507.01958v1 | 2025-07-02 | null |
| 2025-07-02 | Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation | Zhuoyang Zhang et al. | 2507.01957v1 | 2025-07-02 | null |
| 2025-07-02 | How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks | Rahul Ramachandran et al. | 2507.01955v1 | 2025-07-02 | null |
| 2025-07-02 | FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model | Yukang Cao et al. | 2507.01953v1 | 2025-07-02 | null |
| 2025-07-02 | Kwai Keye-VL Technical Report | Kwai Keye Team et al. | 2507.01949v1 | 2025-07-02 | null |
| 2025-07-02 | LongAnimation: Long Animation Generation with Dynamic Global-Local Memory | Nan Chen et al. | 2507.01945v1 | 2025-07-02 | null |
| 2025-07-02 | SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars | Xiaosheng Zhao et al. | 2507.01939v1 | 2025-07-02 | null |
| 2025-07-02 | CI-VID: A Coherent Interleaved Text-Video Dataset | Yiming Ju et al. | 2507.01938v1 | 2025-07-02 | null |
| 2025-07-02 | Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations | Wenhao Wang et al. | 2507.01930v2 | 2025-07-03 | null |
| 2025-07-02 | evMLP: An Efficient Event-Driven MLP Architecture for Vision | Zhentan Zheng et al. | 2507.01927v1 | 2025-07-02 | null |
| 2025-07-02 | IC-Custom: Diverse Image Customization via In-Context Learning | Yaowei Li et al. | 2507.01926v1 | 2025-07-02 | null |
| 2025-07-02 | A Survey on Vision-Language-Action Models: An Action Tokenization Perspective | Yifan Zhong et al. | 2507.01925v1 | 2025-07-02 | null |
| 2025-07-02 | Exploring a Hybrid Deep Learning Approach for Anomaly Detection in Mental Healthcare Provider Billing: Addressing Label Scarcity through Semi-Supervised Anomaly Detection | Samirah Bakker et al. | 2507.01924v1 | 2025-07-02 | null |
| 2025-07-02 | Initial boundary value problem for a system derived from Eulerian droplet model for air particle flow | Kayyunnapara Divya Joseph et al. | 2507.01920v1 | 2025-07-02 | null |
| 2025-07-02 | End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning | Christian Bongiorno et al. | 2507.01918v1 | 2025-07-02 | null |
| 2025-07-02 | 3D Reconstruction and Information Fusion between Dormant and Canopy Seasons in Commercial Orchards Using Deep Learning and Fast GICP | Ranjan Sapkota et al. | 2507.01912v1 | 2025-07-02 | null |
| 2025-07-02 | Modality-agnostic, patient-specific digital twins modeling temporally varying digestive motion | Jorge Tapias Gomez et al. | 2507.01909v2 | 2025-07-03 | null |
| 2025-07-02 | Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning | Qingdong He et al. | 2507.01908v1 | 2025-07-02 | null |
| 2025-07-02 | STEM Diffraction Pattern Analysis with Deep Learning Networks | Sebastian Wissel et al. | 2507.01889v1 | 2025-07-02 | null |
| 2025-07-02 | Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification | Kunlun Xu et al. | 2507.01884v1 | 2025-07-02 | null |
| 2025-07-02 | Future Slot Prediction for Unsupervised Object Discovery in Surgical Video | Guiqiu Liao et al. | 2507.01882v1 | 2025-07-02 | null |
| 2025-07-02 | A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs | Niccolò McConnell et al. | 2507.01881v1 | 2025-07-02 | null |
| 2025-07-02 | Generalized ODE reduction algorithm for bounded degree transformation | Shaoxuan Huang et al. | 2507.01878v2 | 2025-07-03 | null |
| 2025-07-02 | MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics | Dmytro Kuzmenko et al. | 2507.01843v1 | 2025-07-02 | null |
| 2025-07-02 | Modeling the Deterioration of Pavement Skid Resistance and Surface Texture After Preventive Maintenance | Lu Gao et al. | 2507.01842v1 | 2025-07-02 | null |
| 2025-07-02 | MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices | Hailong Yan et al. | 2507.01838v1 | 2025-07-02 | null |
| 2025-07-02 | Modulate and Reconstruct: Learning Hyperspectral Imaging from Misaligned Smartphone Views | Daniil Reutsky et al. | 2507.01835v1 | 2025-07-02 | null |
| 2025-07-02 | mGRADE: Minimal Recurrent Gating Meets Delay Convolutions for Lightweight Sequence Modeling | Tristan Torchet et al. | 2507.01829v1 | 2025-07-02 | null |