Recently, unsupervised image denoising methods learning from paired noisy samples have received increasing attention. These methods build on the idea that the mean of multiple noisy images of the same scene is the ide...
详细信息
Unmanned Aerial Vehicle (UAV) detection in the wild is a challenging task due to the presence of background noise and the varying size of the object. To address these obstacles, we propose a novel learning framework f...
详细信息
The growing interest in generating recipes from food images has drawn substantial research attention in recent years. Existing works for recipe generation primarily utilize a two-stage training method - first predicti...
详细信息
The Blocky Volume Package (BVP) format is a distributed, platform-independent and API-independent format for storing static and temporal volumetric data. It is designed for efficient transfer over a network by support...
详细信息
Large language models (LLMs) can perform complex reasoning by generating intermediate thoughts under zero-shot or few-shot settings. However, zero-shot prompting always encounters low performance, and the superior per...
详细信息
Video diffusion models are able to generate high-quality videos by learning strong spatial-temporal priors on large-scale datasets. In this paper, we aim to investigate whether such priors derived from a generative pr...
Diplomacy is one of the most sophisticated activities in human society, involving complex interactions among multiple parties that require skills in social reasoning, negotiation, and long-term strategic planning. Pre...
The Large Vision-Language Models (LVLMs) have demonstrated great abilities in image perception and language understanding. However, existing datasets either focus solely on primary perception abilities and commonsense...
详细信息
This paper presents an integrated solution for 3D object detection, recognition, and presentation to increase accessibility for various user groups in indoor areas through a mobile application. The system has three ma...
详细信息
Simultaneous Localization and Mapping (SLAM) plays a pivotal role in autonomous driving and robotics. Existing methods often rely on hand-craft feature extraction and cross-modal fusion techniques, resulting in limite...
详细信息
暂无评论