Emotion recognition via electroencephalography (EEG) has emerged as a pivotal domain in biomedical signal processing, offering valuable insights into affective states. This paper presents a novel approach utilizing a ...
Path optimization is one of the crucial problems in unmanned ground vehicles. It involves concerns such as smooth trajectories, collision-free planning, and low time and space complexity. Rapid Exploring...
With the continuous advancement of artificial intelligence technology, intelligent robots are increasingly used in industrial production, daily life, and detection and security. This paper p...
ISBN: 9781665405409 (print)
We present Fast-Slow Transformer for Visually Grounding Speech, or FaST-VGS. FaST-VGS is a Transformer-based model for learning the associations between raw speech waveforms and visual images. The model unifies dual-encoder and cross-attention architectures into a single model, reaping the superior retrieval speed of the former along with the accuracy of the latter. FaST-VGS achieves state-of-the-art speech-image retrieval accuracy on benchmark datasets, and its learned representations exhibit strong performance on the ZeroSpeech 2021 phonetic and semantic tasks.
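One common way to get the speed of a dual encoder together with the accuracy of a cross-attention scorer is two-stage retrieval: rank every candidate with a cheap dot product, then re-score only a short list with the expensive scorer. The abstract does not spell out the mechanism, so the sketch below is an illustration under that assumption, not the authors' implementation. The names `coarse_to_fine_retrieve`, `slow_score`, and `k` are hypothetical, and `slow_score` is a stand-in for a cross-attention model.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def coarse_to_fine_retrieve(query_vec, image_vecs, slow_score, k):
    """Return candidate indices ordered by a two-stage ranking.

    Fast stage: rank all candidates by dot-product similarity (dual-encoder style).
    Slow stage: re-rank only the top-k with `slow_score`, a hypothetical
    placeholder for an expensive cross-attention scorer.
    """
    ranked = sorted(
        range(len(image_vecs)),
        key=lambda i: dot(query_vec, image_vecs[i]),
        reverse=True,
    )
    shortlist = ranked[:k]
    return sorted(shortlist, key=slow_score, reverse=True)


# Toy data: three image embeddings and one speech-query embedding.
image_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
query_vec = [1.0, 0.0]
# Made-up slow scores; a real system would run the cross-attention model here.
toy_slow_scores = {0: 0.2, 1: 0.9, 2: 1.0}
result = coarse_to_fine_retrieve(query_vec, image_vecs, toy_slow_scores.get, k=2)
```

In this sketch the slow scorer runs on only `k` candidates, so its cost does not grow with the size of the collection. The trade-off is that a candidate dropped by the fast stage can never be recovered, however well the slow scorer would have rated it.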
In addition to easy access to audio on the Internet, developments in deep learning methods have made it possible to produce deep fake audio. Deep fake audio spoofing is carried out with the aim of producing audio file...
To address the slow convergence of the traditional ant colony algorithm and its tendency to fall into local optima in mobile robot path planning, a strategy to update the ...
This study proposes an unmanned aerial vehicle (UAV) system that combines computer vision object identification technology with laser positioning technology to increase the automation level and lower the expenses asso...
ISBN: 9798400709647 (print)
An image edge is a region of the image that exhibits a clear discontinuity or change, and it reflects the most fundamental attributes of the image. Edge detection, which recovers the contour details between the objects and regions of an image, is therefore a research focus in image processing and computer vision. This paper first examines the principles and methodologies of traditional edge detection algorithms, succinctly outlining the advantages and disadvantages of the Roberts, Prewitt, Sobel, and Canny operators. It then introduces representative algorithms for image edge detection based on fractional-order differentiation. Finally, it compares traditional edge detection algorithms with those based on fractional-order differentiation through experimental analysis, demonstrating that the latter exhibit superior performance in contour continuity and detail integrity. The application of fractional-order differential theory in image processing shows significant potential.
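As an illustration of the traditional operators named above, the following is a minimal sketch of Sobel edge detection in plain Python. It uses the standard 3x3 Sobel kernels. The function names, the toy image, and the threshold value are assumptions chosen for the example and do not come from the paper.

```python
import math

# Standard 3x3 Sobel kernels: SOBEL_X responds to horizontal intensity
# changes (vertical edges), SOBEL_Y to vertical changes (horizontal edges).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers. Border pixels,
    which lack a full 3x3 neighbourhood, are left at 0.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * pixel
                    gy += SOBEL_Y[i][j] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


def edge_map(image, threshold):
    """Mark pixels whose gradient magnitude reaches `threshold` with 1."""
    return [[1 if m >= threshold else 0 for m in row]
            for row in sobel_magnitude(image)]


# Toy image (illustrative): dark left half, bright right half.
toy = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
edges = edge_map(toy, threshold=20)
```

On the toy image the response is concentrated in the two columns that straddle the intensity step, and flat regions give zero gradient. The fractional-order methods the abstract compares against replace these integer-order difference kernels with fractional-order differential masks, which are not sketched here.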
Diabetic retinopathy (DR) is a chronic progressive disease that affects eyesight and can even cause blindness. It is important to identify DR and diagnose its severity; timely diagnosis and treatment...
Implicit Neural Representations (INRs) are a powerful way to parameterize continuous signals in computer vision. However, almost all INR methods are limited to low-level tasks, e.g., image/video compression, super-resolutio...