Mental health is an important aspect of life that affects overall well-being. Mental health is defined as a state of mental well-being that helps people better understand situations and make better *** illnes...
As one of the key technologies of intelligent vehicles, traffic sign detection remains a challenging task because its target objects are tiny. To address this challenge, we present a novel detection network, improved from yolo-v3, that detects tiny traffic signs with high precision in real time. First, a visual multi-scale attention module (MSAM), a lightweight yet effective module, is devised to fuse the multi-scale feature maps with channel weights and spatial masks. It increases the representation power of the network by emphasizing useful features and suppressing unnecessary ones. Second, we effectively exploit fine-grained features of tiny objects from the shallower layers by modifying the Darknet-53 backbone and adding one prediction head to yolo-v3. Finally, a receptive field block is added to the neck of the network to broaden the receptive field. Experiments demonstrate the effectiveness of our network both quantitatively and qualitatively. The mAP@0.5 of our network reaches 0.965, and its detection speed is 55.56 FPS for 512 × 512 images on the challenging Tsinghua-Tencent 100K (TT100K) dataset.
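To make the mAP@0.5 figure above more concrete: under the usual convention, a predicted box counts as a correct detection only if its intersection over union (IoU) with the ground-truth box is at least 0.5. The minimal sketch below is not taken from the paper. The function names and box coordinates are hypothetical, chosen only to illustrate why the same localization error hurts tiny objects far more than large ones.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """True if the prediction overlaps the ground truth enough to count at this IoU threshold."""
    return iou(pred, truth) >= threshold


# Hypothetical boxes: a 16 x 16 px sign with a prediction shifted by 4 px in x and y.
tiny_truth = (100, 100, 116, 116)
tiny_pred = (104, 104, 120, 120)

# The same 4 px shift applied to a 128 x 128 px object.
large_truth = (100, 100, 228, 228)
large_pred = (104, 104, 232, 232)

print(round(iou(tiny_pred, tiny_truth), 3), is_match(tiny_pred, tiny_truth))
print(round(iou(large_pred, large_truth), 3), is_match(large_pred, large_truth))
```

With these illustrative numbers, the 4 px offset drops the tiny box below the 0.5 threshold while the large box still matches comfortably, which is one way to read the abstract's claim that tiny targets make the task hard.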
Neural networks are increasingly used in data-driven control as an approximate model of the true system dynamics. Model Predictive Control (MPC) adopts this practice leading to neur...
This paper presents a novel segmentation model to address the limitations of existing zero-shot semantic segmentation models in handling global semantic understanding and small object segmentation. Zero-shot semantic ...
Code search aims to retrieve relevant code snippets from large code repositories based on a query, promoting code reuse and enhancing software development efficiency. Deep Learning is a powerful approach for code search...
This paper proposes a novel Artificial Intelligence (AI) framework designed to enhance human-machine synergy through an intuitive and personalized approach. The framework integrates Large Action Models (LAMs), Large L...
The market debut of ChatGPT gave rise to the development and deployment of various other Large Language Models (LLMs) that achieve state-of-the-art performance across various tasks. The growing popularity of these mod...
This research presents and compares multiple approaches to automate the generation of literature reviews using several Natural Language Processing (NLP) techniques and retrieval-augmented generation (RAG) with a Large...
ISBN: (print) 9798350382723
Egocentric activity recognition, also known as first-person vision, captures human actions and activities from the perspective of a wearable camera, providing a personalized and contextualized viewpoint. This unique perspective is crucial for various real-world applications, including healthcare monitoring, sports analysis, augmented reality, and assistive technologies. In this paper, we present a comparative analysis of various state-of-the-art algorithms for egocentric action recognition using the recently curated dataset, 'Coer-Egovision.' This dataset captures human actions and activities from the unique perspective of a wearable camera, providing a personalized and context-aware viewpoint. Such a perspective holds significant value for real-world applications, including healthcare monitoring, sports analysis, augmented reality, and assistive *** 'Coer-Egovision' dataset encompasses diverse actions, such as 'Going-Downstairs', 'Going-Upstairs', 'Texting', 'Walking', and 'Writing', providing a challenging benchmark for evaluating algorithms in this domain. By leveraging the wearable camera perspective, the dataset addresses the need for a more comprehensive understanding of human interactions with the *** the evaluated algorithms, the Long-term Recurrent Convolutional Network (LRCN) emerges as the top performer, achieving an accuracy of 88% by effectively modeling long-term temporal dependencies. Additionally, the Histogram of Oriented Gradients (HOG) combined with a Support Vector Machine (SVM) demonstrates robust performance with 82% accuracy, highlighting the efficacy of feature-based methods. The study also reveals promising results for the 2-Stream Approach with 3D CNN and LSTM, which achieves an accuracy of 74%. Furthermore, pretrained models such as VGG16 and DenseNet201 combined with LSTM exhibit competitive performance, with accuracies of 80% *** results offer valuable insights into the performance of diverse algo
This study explores the application of Machine Learning (ML) techniques for predicting wine quality, aiming to enhance the understanding and precision in assessing the characteristics of different wine varieties. The d...