Vision Transformer

Publish Date	Title	Authors	PDF	Last Updated	Code
2025-07-02	Resolving Individual Stars in Nearby Large Galaxies with the Habitable Worlds Observatory	Adam Smercina et al.	2507.01960v1	2025-07-02	null
2025-07-02	AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation	Sixiang Chen et al.	2507.01961v2	2025-07-03	null
2025-07-02	Parallel-in-Time Preconditioning for Time-Dependent Variational Mean Field Games	Heidi Wolles Ljósheim et al.	2507.01958v1	2025-07-02	null
2025-07-02	Locality-aware Parallel Decoding for Efficient Autoregressive Image Generation	Zhuoyang Zhang et al.	2507.01957v1	2025-07-02	null
2025-07-02	How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks	Rahul Ramachandran et al.	2507.01955v1	2025-07-02	null
2025-07-02	FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model	Yukang Cao et al.	2507.01953v1	2025-07-02	null
2025-07-02	Kwai Keye-VL Technical Report	Kwai Keye Team et al.	2507.01949v1	2025-07-02	null
2025-07-02	LongAnimation: Long Animation Generation with Dynamic Global-Local Memory	Nan Chen et al.	2507.01945v1	2025-07-02	null
2025-07-02	SpecCLIP: Aligning and Translating Spectroscopic Measurements for Stars	Xiaosheng Zhao et al.	2507.01939v1	2025-07-02	null
2025-07-02	CI-VID: A Coherent Interleaved Text-Video Dataset	Yiming Ju et al.	2507.01938v1	2025-07-02	null
2025-07-02	Large Language Model-Driven Closed-Loop UAV Operation with Semantic Observations	Wenhao Wang et al.	2507.01930v2	2025-07-03	null
2025-07-02	evMLP: An Efficient Event-Driven MLP Architecture for Vision	Zhentan Zheng et al.	2507.01927v1	2025-07-02	null
2025-07-02	IC-Custom: Diverse Image Customization via In-Context Learning	Yaowei Li et al.	2507.01926v1	2025-07-02	null
2025-07-02	A Survey on Vision-Language-Action Models: An Action Tokenization Perspective	Yifan Zhong et al.	2507.01925v1	2025-07-02	null
2025-07-02	Exploring a Hybrid Deep Learning Approach for Anomaly Detection in Mental Healthcare Provider Billing: Addressing Label Scarcity through Semi-Supervised Anomaly Detection	Samirah Bakker et al.	2507.01924v1	2025-07-02	null
2025-07-02	Initial boundary value problem for a system derived from Eulerian droplet model for air particle flow	Kayyunnapara Divya Joseph et al.	2507.01920v1	2025-07-02	null
2025-07-02	End-to-End Large Portfolio Optimization for Variance Minimization with Neural Networks through Covariance Cleaning	Christian Bongiorno et al.	2507.01918v1	2025-07-02	null
2025-07-02	3D Reconstruction and Information Fusion between Dormant and Canopy Seasons in Commercial Orchards Using Deep Learning and Fast GICP	Ranjan Sapkota et al.	2507.01912v1	2025-07-02	null
2025-07-02	Modality-agnostic, patient-specific digital twins modeling temporally varying digestive motion	Jorge Tapias Gomez et al.	2507.01909v2	2025-07-03	null
2025-07-02	Reasoning to Edit: Hypothetical Instruction-Based Image Editing with Visual Reasoning	Qingdong He et al.	2507.01908v1	2025-07-02	null
2025-07-02	STEM Diffraction Pattern Analysis with Deep Learning Networks	Sebastian Wissel et al.	2507.01889v1	2025-07-02	null
2025-07-02	Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification	Kunlun Xu et al.	2507.01884v1	2025-07-02	null
2025-07-02	Future Slot Prediction for Unsupervised Object Discovery in Surgical Video	Guiqiu Liao et al.	2507.01882v1	2025-07-02	null
2025-07-02	A computationally frugal open-source foundation model for thoracic disease detection in lung cancer screening programs	Niccolò McConnell et al.	2507.01881v1	2025-07-02	null
2025-07-02	Generalized ODE reduction algorithm for bounded degree transformation	Shaoxuan Huang et al.	2507.01878v2	2025-07-03	null
2025-07-02	MoIRA: Modular Instruction Routing Architecture for Multi-Task Robotics	Dmytro Kuzmenko et al.	2507.01843v1	2025-07-02	null
2025-07-02	Modeling the Deterioration of Pavement Skid Resistance and Surface Texture After Preventive Maintenance	Lu Gao et al.	2507.01842v1	2025-07-02	null
2025-07-02	MobileIE: An Extremely Lightweight and Effective ConvNet for Real-Time Image Enhancement on Mobile Devices	Hailong Yan et al.	2507.01838v1	2025-07-02	null
2025-07-02	Modulate and Reconstruct: Learning Hyperspectral Imaging from Misaligned Smartphone Views	Daniil Reutsky et al.	2507.01835v1	2025-07-02	null
2025-07-02	mGRADE: Minimal Recurrent Gating Meets Delay Convolutions for Lightweight Sequence Modeling	Tristan Torchet et al.	2507.01829v1	2025-07-02	null