ISBN: (print) 9783031878961
The proceedings contain 10 papers. The special focus in this conference is on Design and Architectures for Signal and Image Processing. The topics include: LiFT: Lightweight, FPGA-Tailored 3D Object Detection Based on LiDAR Data; A Practical HW-Aware NAS Flow for AI Vision Applications on Embedded Heterogeneous SoCs; Endoscopy Image Classification for Wireless Capsules with CNNs on Microcontroller-Based Platforms; Joint Underwater Depth Estimation and Dehazing from a Single Image Using Attention U-Net; KD-AHOSVD: Neural Network Compression via Knowledge Distillation and Tensor Decomposition; Novel Scheduling and Shifter Networks for 5G LDPC Decoders; Comparison Between In-Core Hardware IDS, Off-Core Hardware IDS and Software IDS; Comparative Study of Memory Optimization Techniques for Dataflow-Modeled Applications.
The Mamba-based model has demonstrated outstanding performance across tasks in computer vision, natural language processing, and speech processing. However, in the realm of speech processing, the Mamba-based model'...
Current visual captioning technologies typically transform 3D/2D visual information into one-dimensional sequential data and employ language models to generate corresponding descriptions. This approach, however, compr...
It is a new trend to fine-tune Large Multimodal Models (LMMs) to adapt to specific visual tasks through task-related conversation data. This approach provides a new paradigm for solving various vision-language tasks, ...
Video-based Person Re-identification (ReID) is crucial in visual surveillance, focusing on matching video snippets of individuals across multiple non-overlapping cameras. Existing methods either conduct ReID at the im...
As object detectors are increasingly deployed as black-box cloud services or pre-trained models with restricted access to the original training data, the challenge of zero-shot object-level out-of-distribution (OOD) d...
This paper addresses few-shot semantic segmentation (FSS) guided by text, where we classify unseen novel classes using image and text references as in-context examples, without the need for training. We enhance the qu...
Despite the rapid advancement in the field of image recognition, the processing of high-resolution imagery remains a computational challenge. However, this processing is pivotal for extracting detailed object insights...
Spiking Neural Networks (SNNs) have emerged as a popular spatio-temporal computing paradigm for complex vision tasks. Recently proposed SNN training algorithms have significantly reduced the number of time steps (down...
Recent research highlights the potential of multimodal foundation models in tackling complex decision-making challenges. However, their large parameters make real-world deployment resource-intensive and often impracti...