Search Results - Inner Mongolia University Library

27th International Conference on Pattern Recognition, ICPR 2024

Authors: Liang, Yiming Ishikawa, Hiroshi Department of Computer Science and Communications Engineering Waseda University Tokyo Japan

ISBN: (print) 9783031801358

Glass, though ubiquitous, is difficult to recognize in an image due to its transparency. Fine-grained low-level features indicating the presence of glass, such as refraction and reflection, are weak and subtle. This makes it difficult for existing glass detection models to learn those features, pushing them to rely on more overt cues, especially the frame surrounding the glass. Consequently, they can be fooled easily by frame-like objects. Here, we propose a simple data augmentation scheme called Random Frame to address this problem. Random Frame inserts a frame into an image to create an area with a frame but no glass. The model receives a penalty if it relies only on the frame. The performance of existing models on various datasets improves when Random Frame is applied during training. Our comprehensive experiments demonstrate that our data augmentation can make models utilize more low-level features with more confidence in their predictions. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.
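As a rough illustration of the augmentation idea in this abstract, the sketch below paints a rectangular frame outline at a random position and marks the painted pixels as non-glass in the label mask. The function name `random_frame`, the frame colour, the thickness, and the size range are assumptions made for illustration only; the abstract does not say how the authors generate, place, or label frames.

```python
import random


def random_frame(image, mask, rng, thickness=2, color=(40, 40, 40)):
    """Draw a rectangular frame outline at a random position (illustrative only).

    image: H x W list of (r, g, b) tuples; mask: H x W list of 0/1 glass labels.
    Returns a new (image, mask, box); the inputs are not modified.
    """
    h, w = len(image), len(image[0])
    # Hypothetical size range: the frame spans a quarter to a half of each side.
    fh = rng.randint(h // 4, h // 2)
    fw = rng.randint(w // 4, w // 2)
    top = rng.randint(0, h - fh)
    left = rng.randint(0, w - fw)
    out_img = [row[:] for row in image]
    out_mask = [row[:] for row in mask]
    for y in range(top, top + fh):
        for x in range(left, left + fw):
            on_border = (
                y - top < thickness
                or top + fh - 1 - y < thickness
                or x - left < thickness
                or left + fw - 1 - x < thickness
            )
            if on_border:
                out_img[y][x] = color  # paint the frame pixel
                out_mask[y][x] = 0     # assumption: frame pixels are labelled non-glass
    return out_img, out_mask, (top, left, fh, fw)
```

A detector trained on such samples is penalized whenever it predicts glass on the painted frame, which is the effect the abstract describes. How the interior of the inserted frame should be labelled is left open here because the abstract does not specify it.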

Keywords: Image recognition

Signbridge-Audio to Sign Language Translator

2025 IEEE International Students' Conference on Electrical, Electronics and Computer Science, SCEECS 2025

Authors: Shirisha, Kammadanam Deeksith, E. Mani Madhava, M. S. S Sri Surendhar, S. Karthikeyan, R. Vardhaman College of Engineering Department of Computer Science and Engineering Hyderabad India Hyderabad India

ISBN: (print) 9798331529833

The lack of communication options between Deaf and hearing people arguably creates a significant social disadvantage in accessing even basic essential services. In contrast to spoken language, which is conveyed through sound patterns, sign language communicates ideas through manual and body language. It can be used by those who have trouble speaking, by those who can hear but cannot talk, and by hearing people to interact with those with hearing impairments. By automatically converting spoken words into hand gestures, this project creates a web-based interface that allows hearing-impaired individuals to communicate with hearing people in real time through sign language interpretation. The system works in phases. First, speech-to-text technology converts oral input into textual output. Next, the text is run through Natural Language Processing (NLP) algorithms leveraging the Natural Language Toolkit (NLTK) to be syntactically parsed according to the rules of sign language. The last phase translates the parsed text into sign language gestures, which involve hand shapes, orientation, and body movements, to convey the visual meaning of a message. By using Machine Learning (ML) to continuously improve accuracy, this system could greatly reduce the communication barriers experienced by people with hearing loss and deafness, enhance quality of life, and foster a more inclusive society for the deaf community in daily interactions. © 2025 IEEE.

Keywords: Natural language processing systems

Novel ACGAN-Based Framework Fault Detection Technique for Unbalanced Data

7th International Conference on Advanced Algorithms and Control Engineering, ICAACE 2024

Authors: Wang, Haodong Lyu, Jianhua Chen, Linchao Zhang, Baili College Of Computer Science And Engineering Southeast University Nanjing China

ISBN: (print) 9798350361445

Fault detection is of great significance in ensuring the operational safety of industrial equipment. In practice, however, fault data is difficult to obtain because industrial equipment operates normally most of the time, and large differences in the probability of occurrence of different faults lead to imbalanced fault data. These problems affect the generalization ability and robustness of deep learning models. To solve them, this paper proposes a multi-generator ACGAN network. First, the number of generators is determined according to the number of fault categories, to ensure that each generator generates a single type of fault data. On this basis, the paper proposes the concept of a 'gene' to represent category fault information, and uses the category genes to modify the noise input to obtain noise with different mean and variance, so that it contains more effective features. Finally, the proposed imbalanced-data fault detection method is applied to a fault data set of a rail vehicle door system for experimental verification and analysis. The experimental results show that the method can effectively improve the performance of fault detection and achieve the expected goal. © 2024 IEEE.
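One way to picture noise "with different mean and variance" per fault category is to shift and scale standard Gaussian noise with a per-category parameter pair. The sketch below does only that; the `GENES` table, its values, and the function name `gene_noise` are hypothetical, and the abstract does not describe how the authors actually encode or apply category genes, nor how the per-category generators consume the noise.

```python
import random

# Hypothetical per-category "genes": one (mean, standard deviation) pair per fault class.
GENES = {0: (0.0, 1.0), 1: (2.0, 0.5), 2: (-2.0, 1.5)}


def gene_noise(category, dim, rng):
    """Sample a noise vector whose mean and spread depend on the fault category."""
    mean, std = GENES[category]
    # Shift and scale standard Gaussian noise with the category's gene.
    return [mean + std * rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
z = gene_noise(1, 8, rng)  # an 8-dimensional noise vector for fault category 1
```

In a multi-generator setup like the one the abstract outlines, each such vector would be fed to the generator assigned to the same category, so every generator sees noise drawn from its own distribution.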

Keywords: Fault detection

GAN-Based Image Restoration and Colorization

8th International Conference on Emerging Research in Computing, Information, Communication and Applications, ERCICA 2023

Authors: Kabeer, Aliyah Tanna, Manali Milinda, K.N. Rizwan, Mohammed Uzair Agarwal, Pooja Department of Computer Science and Engineering PES University Bangalore India

ISBN: (print) 9789819976324

The importance of images in today’s society has made it essential for them to be of the highest quality and visually indicative of their essential traits and attributes. Significant research has been done individually on colorizing and restoring degraded images. Separate studies of Generative Adversarial Networks (GANs) have also been conducted in each of these fields. However, it is rare to find GAN architectures that can focus on both tasks at once. With an emphasis on nature photographs, this study proposes a unique GAN architecture that was trained on a customized image dataset including images of landscapes, flowers, and mountains combined with the GoPro Light dataset. The proposed methodology makes use of a combination of different loss functions that enable the model to focus on both tasks simultaneously. Alongside the L1 loss and adversarial loss traditionally used in GANs, the proposed model includes a perceptual loss that performs feature-wise comparisons between images to restore their inherent features. To prove that the GAN can perform both restoration and colorization, its performance has been compared with other models that perform each of the two tasks separately. The model is tested on the curated dataset and evaluated on image-specific metrics like peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). The model gives results that compare well with existing models, and it can colorize and restore images that have been degraded with motion blur or camera misfocus, striking a good balance between the two tasks. The paper concludes by providing insight into the future work that can be carried out. © 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
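For readers unfamiliar with the metric named in this abstract, PSNR follows the standard definition 10 * log10(MAX^2 / MSE). The sketch below computes it for flat lists of pixel values and also shows a weighted sum of the three loss terms the abstract lists. The function names and the weights `w_l1`, `w_adv`, and `w_perc` are hypothetical; the abstract does not give the authors' weighting or their exact loss formulation.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


def total_loss(l1, adversarial, perceptual, w_l1=100.0, w_adv=1.0, w_perc=10.0):
    """Weighted sum of the three loss terms; the weights are illustrative assumptions."""
    return w_l1 * l1 + w_adv * adversarial + w_perc * perceptual
```

Higher PSNR means the restored image is numerically closer to the reference. Because PSNR does not track perceived structure, it is commonly reported alongside SSIM, as it is in this abstract.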

Keywords: Restoration

A Comparative Analysis of Alzheimer's Disease Detection using Deep Learning

3rd International Conference on Communication, Control, and Intelligent Systems, CCIS 2024

Authors: Sharma, Garvit Gupta, Dinki Bhardwaj, Priyanshu Ramneet Verma, Pawan Sharda University Computer Science & Engineering Greater Noida India

ISBN: (print) 9798331528201

Alzheimer's disease is a progressively debilitating condition that currently affects millions of people worldwide and has no definitive treatment. Understanding the nature of the disease is important for early intervention and management. Recent advances in machine learning and deep learning algorithms have made it feasible to develop models that recognize individuals with Alzheimer's from neuroimaging data, particularly MRI scans. This paper compares several leading algorithms: Support Vector Machines, Convolutional Neural Networks, 3D Convolutional Neural Networks, and hybrid pre-trained models. The comparison points out the gaps and weaknesses of each approach. In addition, the concept of Explainable AI (XAI), which offers insight into the predictions of the algorithms, has recently been emphasized. XAI can improve the interpretability of AI systems, and this insight could lead to better diagnosis and responses in clinical environments. The paper aims to contribute to further refinement of diagnostic methodologies for AD and to inform future studies on the subject. © 2024 IEEE.

Keywords: Convolutional neural networks

A Comprehensive Survey on Traffic Light Detection Sensors for Vehicle Safety and Automatic Emergency Braking Systems Using Deep Learning Techniques

3rd International Conference on Intelligent Systems and Sustainable Computing, ICISSC 2023

Authors: Gowri, B. Shyamala Priya, G. Vishnu Anjana, S. Arjun, V. Kodal, Devendraa Department of Computer Science and Engineering Easwari Engineering College Tamil Nadu Chennai India

ISBN: (print) 9789819783540

The integration of Traffic Light Detection (TLD) systems with Advanced Emergency Braking Systems (AEBS) marks a critical milestone in enhancing road safety and paving the way for advanced autonomous driving. This survey paper provides a panoramic and extensive overview of the state-of-the-art TLD solutions leveraging sensors and deep learning techniques. With an increasing emphasis on accident prevention and traffic management, the intersection of TLD and AEBS has become a focal point of research and development. This survey begins by elucidating the fundamental challenges associated with TLD, including varying environmental conditions, occlusions, and complex traffic scenarios. We explore the pivotal role of sensors such as cameras, LiDAR, and radar in providing the requisite data for TLD, and delve into the intricacies of sensor fusion techniques for enhanced perception. Deep Learning has emerged as a cornerstone technology in TLD, enabling robust object detection, classification, and real-time decision-making. We meticulously analyze a spectrum of deep learning architectures including Single-Shot Detectors (SSD), Faster R-CNN, YOLO, and custom-designed networks tailored for TLD applications. Furthermore, the survey examines critical components of the TLD pipeline, encompassing data collection, preprocessing, model training, real-time inference, and integration with AEBS. Emphasis is placed on real-time constraints, multi-modal sensor fusion, and adaptability to diverse traffic light configurations. The paper also delves into the significance of accurate traffic light state prediction, going beyond mere detection to anticipate traffic light changes and optimize vehicle control actions. Human-centric interaction and privacy concerns are addressed, encompassing driver warnings, user interfaces, and data anonymization strategies. Moreover, the survey discusses the importance of safety, validation, and collaboration within the TLD and AEBS ecosystem, emphasizing compl

Keywords: Road and street markings

Visual Topic Semantic Enhanced Machine Translation for Multi-Modal Data Efficiency

Journal of Computer Science & Technology, 2023, Vol. 38, No. 6, pp. 1223-1236

Authors: 王超 蔡思佳 史北祥 崇志宏 School of Computer Science and Engineering, Southeast University, Nanjing 210096, China; School of Architecture, Southeast University, Nanjing 210096, China

The scarcity of bilingual parallel corpus imposes limitations on exploiting the state-of-the-art supervised translation *** of the research directions is employing relations among multi-modal data to enhance ***, the reliance on manually annotated multi-modal datasets results in a high cost of data *** this paper, the topic semantics of images is proposed to alleviate the above ***, topic-related images can be automatically collected from the Internet by search ***, topic semantics is sufficient to encode the relations between multi-modal data such as texts and ***, we propose a visual topic semantic enhanced translation (VTSE) model that utilizes topic-related images to construct a cross-lingual and cross-modal semantic space, allowing the VTSE model to simultaneously integrate the syntactic structure and semantic *** the above process, topic similar texts and images are wrapped into groups so that the model can extract more robust topic semantics from a set of similar images and then further optimize the feature *** results show that our model outperforms competitive baselines by a large margin on the Multi30k and the Ambiguous COCO *** model can use external images to bring gains to translation, improving data efficiency.

Keywords: multi-modal machine translation; visual topic semantics; data efficiency

A Hybrid Approach to Sign Language Recognition Using MediaPipe and Machine Learning

2025 IEEE International Students' Conference on Electrical, Electronics and Computer Science, SCEECS 2025

Authors: Bansal, Vidhi Sinha, Sandali Astya, Rani Sagar, Anil Kumar Sahu, Kalicharan Sharda University Department of Computer Science and Engineering Greater Noida India

ISBN: (print) 9798331529833

This paper presents an efficient and accurate real-time sign language recognition system that recognizes American Sign Language (ASL) gestures using MediaPipe and Random Forest. The system captures and processes hand gestures in real time, allowing for optimized responses on the trained dataset, with an additional feature that lets users create their own database in just a few clicks. The MediaPipe framework provides powerful hand tracking abilities and is used to capture and refine landmarks, which in turn contributes to improving gesture recognition accuracy. The processed data, captured with the OpenCV library, is fed into the model trained on the dataset created by the authors, resulting in 99.81% accuracy in recognizing hand gestures. The findings highlight the system's potential applications in communication, education, and accessibility, and its importance for people who have hearing impairments. The combination of MediaPipe with Random Forest and CNN provides a practical solution for precise real-time sign language detection with a user-friendly model. © 2025 IEEE.

Keywords: Random forests

Self-Driving Car Using Neural Networks

11th International Conference on Intelligent Computing and Applications, ICRTC 2023

Authors: Kumawat, Pushpak Pandey, Nidhi Sharma, Oshin Department of Computer Science and Engineering SRM IST Trichy India

ISBN: (print) 9789819717231

Without much assistance from a person, a driverless automobile can sense its surroundings and navigate challenges like traffic. After years of discussion and development, many companies have delivered this new technology to the automotive industry. These vehicles have just started appearing as private and public transportation (taxis, etc.) in global markets, and numerous businesses are involved in their development. With this type of vehicle, motor transportation becomes more efficient, secure, and safe, and human mistakes may be avoided while driving performance is at its best. This project implements obstacle overtaking, response to moving vehicles, and lane detection using Canny edges, features that are missing from present models, making it possible to achieve the aforementioned benefits more readily and inexpensively. This type of device can revolutionise transportation for people with disabilities and enable blind people, or anyone, to move about independently. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.

Keywords: Autonomous vehicles

Application and Optimization of Multi-agent Reinforcement Learning in Collaborative Decision-Making

8th International Conference on Cognitive Computing, ICCC 2024, Held as Part of the Services Conference Federation, SCF 2024

Authors: Sun, Qi Chen, Zhihao Liu, Han College of Computer Science and Software Engineering Hohai University Nanjing China

ISBN: (print) 9783031779534

With the rapid development of intelligent systems, Multi-Agent Systems (MAS) have shown unique advantages in solving complex decision-making problems. Particularly in the field of Multi-Agent Reinforcement Learning (MARL), multiple agents can decompose complex tasks, process information and make decisions in parallel, share experiences, accelerate the learning process, and significantly improve decision quality and efficiency. This paper explores the theoretical underpinnings of MARL and its application to collaborative decision-making and analyzes practical cases in areas such as transportation system management, automated manufacturing, and smart grids. Additionally, it addresses challenges in strategy coordination, handling dynamic environments, and improving learning efficiency. This paper proposes several optimization strategies and introduces reservoir group optimization experiments. A comparison with single-agent algorithms verifies that multi-agent systems can coordinate multiple reservoirs, enhance convergence speed, and achieve higher power generation efficiency, demonstrating better practical application prospects. Furthermore, the future trends of MARL, including technological advancements, potential applications, and challenges, are discussed. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Reinforcement learning
