Search Results - Inner Mongolia University Library

LeXNet++: Layer-wise eXplainable ResUNet++ framework for segmentation of colorectal polyp cancer images

Neural Computing and Applications, 2025, vol. 37, no. 1, pp. 213-229

Authors: Das, Surajit; Khan, Soumya Suvra; Sengupta, Diganta; De, Debashis. Affiliations: Dept. of Information Technology, Meghnad Saha Institute of Technology, Behind Urbana Complex, Near Ruby General Hospital, Anandapur Rd, Uchhepota, Kolkata 700150, India; Dept. of Computer Science & Engineering, Meghnad Saha Institute of Technology, Behind Urbana Complex, Near Ruby General Hospital, Anandapur Rd, Uchhepota, Kolkata 700150, India; Dept. of Computer Science & Engineering, Heritage Institute of Technology, Chowbaga Road, West Bengal, Anandapur, Kolkata 700107, India; Dept. of Computer Science & Engineering, Maulana Abul Kalam Azad University of Technology West Bengal, NH-12, Simhat, Haringhata, West Bengal, Nadia 741249, India

Colorectal polyps are benign lesions that develop in the colon and can progress to cancer if left untreated. Clinical observations from medical images are often preferred over computational results because of a lack of trust in machine learning models, which poses a serious challenge for the explainability of the results. To computationally diagnose colorectal polyps from cancerous images and explain the results, we propose a Layer-wise eXplainable ResUNet++ (LeXNet++) framework for segmentation of the cancerous images, followed by layer-wise explanation of the results. We utilize a publicly accessible dataset that consists of 612 raw images with a resolution of 256×256×3 and an additional 612 clinically annotated and labeled images with a resolution of 256×256×1, which include the infected region. The LeXNet++ framework comprises three components: the encoder, the decoder, and the bridge. The encoder and the decoder each comprise four layers. Each of the four layers in the encoder and the decoder comprises 14 and 11 internal sub-layers, respectively. Among the sub-layers of the encoder and the decoder, there are three 3×3 convolutional layers, with an additional 3×3 convolution-transpose layer in the decoder. The output of each sub-layer is explained through heatmaps generated after each iteration, which are then explained further. The encoder and the decoder are connected by the bridge, which comprises three sub-layers. The results obtained from these three sub-layers are also explained to build trust in the findings. In this study, we used three models to segment the images, namely UNet, ResUNet, and the proposed LeXNet++. LeXNet++ exhibited the best performance among the three models; hence, only LeXNet++ was explained layer-wise. Apart from explanation of the results fetched in this study, the performance of the proposed explainable model has been observed to be 2% greater than the existing poly

Keywords: Image segmentation

Deepfake Detection Using Multi-Modal Fusion Combined with Attention Mechanism

4th International Conference on Sustainable Expert Systems, ICSES 2024

Authors: Shirley, C.P.; Berin Jeba Jingle, I.; Abisha, M.B.; Venkatesan, R.; Yashvanth Ram, R.V.; Elango, Elakkiya. Affiliations: Karunya Institute of Technology and Sciences, Dept. of Computer Science and Engineering, Coimbatore, India; Government Arts College for Woman, Dept. of Computer Science, Sivaganga, India

ISBN (digital): 9798331540364

ISBN (print): 9798331540364

The proliferation of deepfake technology poses a significant challenge to the authenticity of digital content. This research explores the application of multimodal fusion techniques to enhance deepfake detection accuracy. By combining visual and audio features, the proposed method leverages the complementary nature of different data types to detect discrepancies introduced by deepfake manipulation. An attention mechanism is incorporated to focus on salient regions within each modality, further improving detection accuracy. Convolutional Neural Networks (CNNs) and Mel-Frequency Cepstral Coefficients (MFCCs) are employed for feature extraction, followed by feature fusion for deepfake detection. This approach demonstrates the effectiveness of multimodal fusion in combating the evolving threat of deepfake technology. By advancing deepfake detection techniques, this research contributes to safeguarding the integrity of digital content and preserving trust in media. © 2024 IEEE.
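
The attend-then-fuse idea in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the feature vectors, the attention scores, and the helper names `attention_pool` and `fuse` are hypothetical, and in a real system the scores would be learned and the features would come from a CNN and from MFCC extraction.

```python
import math

def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(features, scores):
    """Weighted sum of per-region (or per-frame) feature vectors."""
    weights = softmax(scores)
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)]

def fuse(visual_vector, audio_vector):
    """Feature-level fusion by simple concatenation (one of several options)."""
    return visual_vector + audio_vector

# Hypothetical toy inputs: two visual regions and two audio frames.
visual_regions = [[1.0, 0.0], [0.0, 1.0]]
audio_frames = [[2.0], [4.0]]
fused = fuse(
    attention_pool(visual_regions, [0.0, 0.0]),  # equal scores give equal weights
    attention_pool(audio_frames, [0.0, 0.0]),
)
print(fused)
```

The fused vector would then be passed to a classifier that labels the clip as real or manipulated.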

Keywords: Convolutional neural networks

Real-Time Object Detection from Surveillance using Deep Learning

3rd International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics, IITCEE 2025

Authors: Tanisha, A.; Tanisha, N.; Chaitra, M.; Tejasri, P.R. Affiliation: BNM Institute of Technology, VTU, Dept. of Computer Science and Engineering, Bangalore, India

ISBN (digital): 9798331515911

ISBN (print): 9798331515911

Object detection in surveillance systems leverages advanced deep learning techniques to enhance security measures through real-time analysis of dynamic video feeds. This project integrates the YOLOv5 model for detecting weapons in both pre-recorded videos and live camera feeds. The model, trained on a dataset of 4000 images labeled for handguns and knives, uses image preprocessing steps such as resizing, normalization, and augmentation to improve detection accuracy. Implemented with OpenCV and Tkinter, the application processes video streams and identifies potential threats with a precision of 0.85 and a recall of 0.90. The integration of a graphical user interface ensures user-friendly operation for security personnel. Key outcomes demonstrate the efficacy of deep learning models in real-time object detection, significantly contributing to proactive surveillance and enhanced situational awareness. © 2025 IEEE.
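
The abstract reports precision and recall but no combined score. As a reading aid, the sketch below derives the F1 score (the harmonic mean of the two) from the reported values; the resulting figure is a calculation from those two numbers, not a result stated by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported in the abstract.
f1 = f1_score(0.85, 0.90)
print(round(f1, 3))  # 0.874
```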

Keywords: Graphical user interfaces

Automated Gait Event Detection in Sports: A Novel Approach Using Ant Colony and XGBoost

2024 International Conference on Frontiers of Information Technology, FIT 2024

Authors: Wahid, Wasim; Hanzla, Muhammad; Rahman, Hameed Ur; Jalal, Ahmad. Affiliations: Dept. of Creative Technology, Air University, Islamabad, Pakistan; Dept. of Computer Science, Air University, Islamabad, Pakistan; Dept. of Computer Gaming Development, Air University, Islamabad, Pakistan

ISBN (print): 9798331510503

Advanced techniques for body part detection and marker-less sensor-based cue selection are needed for automated human posture estimation (A-HPE) systems to efficiently identify complicated activity movements. The complicated motions during sports and fitness activities and the differences in lighting conditions make it difficult to identify human activities using vision sensors. This study presents an innovative approach to automatically recognize distinct gait events (GEDs) in sporting activities. Our system integrates techniques for detecting individuals and extracting crucial body points to establish a skeletal framework. Subsequently, we analyze these key points and body features, using methods such as histograms of oriented gradients (HOG) and joint angle computations. To address the variability of sports movements, we utilize two robust algorithms: ant colony optimization and XGBoost. These algorithms facilitate the classification of movements and the assignment of labels to each event. Evaluation on a publicly available dataset of Olympic sports demonstrates that our GED system achieves an accuracy of 88.81%, surpassing existing methodologies. Applications of the suggested approach in man-machine interaction include augmented reality, service bots, e-health fitness, and security surveillance. © 2024 IEEE.
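
The joint angle computation mentioned in this abstract can be sketched as the angle between two limb segments that meet at a joint. The 2D keypoint coordinates and the function name `joint_angle` are illustrative assumptions; the abstract does not specify the authors' exact formulation.

```python
import math

def joint_angle(a, b, c):
    """Angle in degrees at keypoint b, formed by segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must be distinct from the joint")
    cosine = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding drift
    return math.degrees(math.acos(cosine))

# Hypothetical hip, knee, and ankle keypoints (image coordinates).
hip, knee, ankle = (0.0, 0.0), (0.0, 1.0), (1.0, 1.0)
print(joint_angle(hip, knee, ankle))  # a right angle at the knee
```

Angles of this kind, computed per frame for several joints, could form part of the feature vector that a classifier such as XGBoost receives.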

Keywords: Ant colony optimization

TurfIt: An Instant Turf Booking Application

2025 IEEE International Conference on Computational, Communication and Information Technology, ICCCIT 2025

Authors: Rajeswari, A.M.; Tamilselvan, V.; Abishek, M.R.P. Affiliation: Dept. of Computer Science and Engineering, Velammal College of Engineering and Technology, Madurai, India

ISBN (print): 9798331512965

As sports grow in popularity, the need for effective online turf booking applications has also increased. However, some existing applications face numerous challenges, such as access restrictions, ineffective scheduling, and overbooking, which lead to frustration and operational problems. This paper addresses those problems by presenting an online turf booking application that streamlines the booking process. The solution offers real-time updates, automatic payment, and secure transactions based on QR codes. The more user-friendly the application is, the more efficient it will be. Firebase is used for real-time data synchronization, and the responsive React interface ensures smooth, conflict-free bookings. This solution is expected to increase customer satisfaction, booking rates, and the effective utilization of sports facilities. © 2025 IEEE.
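
Conflict-free booking ultimately rests on rejecting any request whose time slot overlaps an existing one. The sketch below shows that check in isolation; it is an assumption about how such a rule could look, not the application's actual Firebase or React code, and `try_book` is a hypothetical helper.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """Two half-open slots [start, end) overlap if each starts before the other ends."""
    return a_start < b_end and b_start < a_end

def try_book(bookings, start, end):
    """Append the slot and return True only if it clashes with no existing booking."""
    if start >= end:
        raise ValueError("slot must end after it starts")
    if any(overlaps(start, end, s, e) for s, e in bookings):
        return False
    bookings.append((start, end))
    return True

# Hours of the day as plain numbers, for illustration only.
bookings = []
print(try_book(bookings, 9, 10))       # True: first booking
print(try_book(bookings, 9.5, 10.5))   # False: overlaps 9-10
print(try_book(bookings, 10, 11))      # True: back-to-back slots do not clash
```

In a multi-user deployment the check and the write would need to happen atomically (for example inside a database transaction) so that two simultaneous requests cannot both succeed.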

Keywords: Customer satisfaction

Pre-Harvest to Post-Harvest: A Review of AI and IoT Applications in Smart Agriculture and the Prospects of 6G-Enabled IoT Framework

27th International Symposium on Wireless Personal Multimedia Communications, WPMC 2024

Authors: Bhola, Amit; Sharma, Himanshu; Sagar, Anil Kumar; Kumar, Prabhat. Affiliations: Sharda University, Dept. of Computer Science and Engg, Greater Noida, India; National Institute of Technology, Dept. of Computer Science and Engg, Bihar, Patna, India

ISBN (print): 9798350392319

Farmers are increasingly adopting smart farming worldwide, leveraging various advanced technologies. Artificial Intelligence (AI) is instrumental in driving the evolution of smart agriculture applications. Internet of Things (IoT), edge computing, cloud computing, and big data are among the forefront technologies employed in this field. Agricultural activities typically encompass three phases: pre-harvest, harvest, and post-harvest. The pre-harvest phase involves choosing seeds, preparing the farm, and selecting the crops. Harvesting tasks include crop classification, disease analysis, and pathogen detection. Post-harvesting activities involve storage, cooling, and reaping, among others. This study conducted an in-depth review of activities in each phase, targeting shortcomings related to AI-based machine and deep learning, IoT, security, datasets, and methodologies employed in existing research. This work also proposes a smart farming framework based on 6G technology, informed by these identified research gaps. Additionally, the work conducted a comparative analysis between this survey and existing studies, revealing that the survey offers a more comprehensive overview in various aspects. © 2024 IEEE.

Keywords: Smart agriculture

Real-Time Gaze Tracking for Online Examination

2025 International Conference on Pervasive Computational Technologies, ICPCT 2025

Authors: Bisht, Preyanshu; Kumar, Suresh. Affiliation: Netaji Subhas University of Technology, Dept. of Computer Science and Engineering, Delhi, India

ISBN (print): 9798331508685

The exponential growth in online education has increased the demand for automated systems to ensure academic integrity during online examinations. A real-time proctoring system addresses this need by monitoring a student's eye gaze and head movements during an exam. This system leverages Dlib's pre-trained frontal face detector and 68-point facial landmark predictor to detect facial features and track the direction of eye movement and head position. By analyzing these metrics, the system can flag behaviors associated with potential cheating, such as looking away from the screen for extended periods. This paper discusses the design and implementation of the proctoring system, details the detection algorithms, and evaluates its effectiveness for real-time monitoring. © 2025 IEEE.
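
The flagging rule described in this abstract (looking away from the screen for extended periods) can be sketched independently of the face-tracking step. The sketch assumes a per-frame boolean that says whether the gaze is on screen, which in the described system would be derived from Dlib landmarks; the function name, frame rate, and threshold are hypothetical.

```python
def flag_look_away(on_screen, fps, max_away_seconds):
    """Return (start, end) frame spans, end exclusive, where the gaze stays
    off screen for more than max_away_seconds."""
    limit = int(max_away_seconds * fps)
    spans = []
    start = None
    for i, ok in enumerate(on_screen):
        if not ok and start is None:
            start = i                      # an off-screen run begins
        elif ok and start is not None:
            if i - start > limit:
                spans.append((start, i))   # run was long enough to flag
            start = None
    if start is not None and len(on_screen) - start > limit:
        spans.append((start, len(on_screen)))  # run lasts until the final frame
    return spans

# Hypothetical 5 fps stream: a brief glance away, then a long one.
frames = [True] * 5 + [False] * 3 + [True] * 2 + [False] * 12 + [True]
print(flag_look_away(frames, fps=5, max_away_seconds=2))  # [(10, 22)]
```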

Keywords: Online systems

Towards Automated Lip Reading: Developing Marathi Lip Reading Datasets and Neural Network Frameworks

4th International Conference on Intelligent Technologies, CONIT 2024

Authors: Kulkarni, Apurva; Kirange, Dnyaneshwar Khemachandra. Affiliation: SSBT College of Engineering and Technology, Dept. of Computer Engineering, Jalgaon, India

ISBN (print): 9798350349900

This paper introduces an innovative method for automating lip-reading, with a specific focus on the Marathi language. Lip-reading plays a crucial role in aiding those with hearing impairments, but automating it presents significant challenges, especially for languages like Marathi that lack sufficient datasets. To tackle this, we propose a novel approach to automatic lip-reading, accompanied by the development of a specialized Marathi dataset. Leveraging advancements in computer vision and deep learning, our model, trained on this dataset, deciphers linguistic content from lip movements. We employ various neural network architectures, including feed-forward, recurrent, and convolutional networks, to extract the visual features crucial for accurate language interpretation. Our primary goal is to provide a robust solution for automated lip-reading tailored specifically to regional languages like Marathi, aiming to enhance accessibility for individuals with hearing impairments, particularly in linguistically diverse contexts. This paper presents a detailed exploration of dataset creation and neural network architectures for lip-reading systems, and we demonstrate the feasibility and potential impact of our approach. Our study underscores the significance of giving priority to regional languages in technological advancements to promote inclusivity for all individuals. © 2024 IEEE.

Keywords: Convolutional neural networks

Scraping Data from Google Maps: Comparisons and Experimentations

15th International Conference on Intelligent Computing and Networking, IC-ICN 2024

Authors: Thakur, Shivam; Jha, Rajan; Dalawat, Monika; Jha, Prashant; Khandare, Anand. Affiliation: Dept. of Computer Engineering, Thakur College of Engineering and Technology (Autonomous), Mumbai, India

ISBN (print): 9789819786305

Web scraping, an automated process that extracts data from websites, is a powerful tool that faces several challenges. Existing scrapers often entail drawbacks, such as subscription fees and limited accessibility for small businesses. Concerns regarding the reliability and accuracy of data arise, especially with respect to real-time content. This study scrutinizes web scraping tools, delving into operational principles, advantages, and limitations, and addressing applications in data mining, research, and social media marketing. Two scraping approaches were explored: manual extraction and automated methods such as HTML parsing. Popular tools such as ParseHub, WebHarvy, and Octoparse are discussed, but concerns about accessibility and accuracy persist. Libraries such as BeautifulSoup, Selenium, and Scrapy offer alternatives to those with programming skills. Although web scraping accelerates data collection, its limitations and potential drawbacks must be carefully considered. This paper focuses on web scraping, a potent tool for extracting data from websites. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.
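
To make the HTML parsing approach concrete, the sketch below extracts text from tagged elements using only Python's standard `html.parser` module rather than the libraries named in the abstract. The sample markup, the `place-name` class, and the `NameExtractor` helper are invented for illustration and do not reflect the real structure of Google Maps pages.

```python
from html.parser import HTMLParser

class NameExtractor(HTMLParser):
    """Collect the text inside <h2 class="place-name"> elements."""

    def __init__(self):
        super().__init__()
        self._in_target = False
        self.names = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs
        if tag == "h2" and ("class", "place-name") in attrs:
            self._in_target = True

    def handle_endtag(self, tag):
        if tag == "h2":
            self._in_target = False

    def handle_data(self, data):
        if self._in_target and data.strip():
            self.names.append(data.strip())

def extract_names(html):
    parser = NameExtractor()
    parser.feed(html)
    return parser.names

# Hypothetical markup standing in for a downloaded results page.
SAMPLE_HTML = """
<div class="result"><h2 class="place-name">Cafe Aroma</h2><span>4.5</span></div>
<div class="result"><h2 class="place-name">Book Nook</h2><span>4.1</span></div>
"""
print(extract_names(SAMPLE_HTML))  # ['Cafe Aroma', 'Book Nook']
```

Pages that build their content with JavaScript generally need a browser-driving tool such as Selenium before any parsing of this kind can work.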

Keywords: Data accuracy

A Transformer-Based Model for Image Caption Generation with Memory Enhancement

2024 IEEE International Women in Engineering (WIE) Conference on Electrical and Computer Engineering, WIECON-ECE 2024

Authors: Ome, Shadika Afroze; Azhar, Tanvir; Asaduzzaman. Affiliations: East Delta University, Dept. of Computer Science & Engineering, Chattogram, Bangladesh; Chittagong University of Engineering & Technology, Dept. of Computer Science & Engineering, Chattogram, Bangladesh

ISBN (print): 9798331535476

This study proposes a convolution-free transformer-based method for generating accurate descriptions of images. A Vision Transformer is used as the primary encoder, replacing the traditional CNN, and a Meshed Memory Transformer is integrated for generating captions. The Vision Transformer splits images into fixed-size patches for extracting visual features, which are further processed by a memory-augmented encoder to integrate prior knowledge. This ensures that key features are preserved and reused, and the meshed decoder combines the text embedding with the outputs from each encoding layer to generate more contextually rich and accurate captions than conventional approaches. The model's performance is evaluated on the Flickr30K and MSCOCO datasets, demonstrating that the Meshed Memory architecture in our approach improves both the image encoding and caption generation steps. Our model outperforms the conventional 'CNN-LSTM' models and 'CNN-Transformer' methods, achieving a 1.02% improvement in METEOR and 1.31% in SPICE on the MSCOCO dataset. Additionally, on the Flickr30K dataset, our approach exhibits superior performance across most metrics, notably a 6.37% increase in the METEOR score. © 2024 IEEE.
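
The patch-splitting step that a Vision Transformer applies before encoding can be sketched on a tiny single-channel image held in nested lists. This shows only the splitting; the linear projection, positional embeddings, and the memory-augmented encoder are omitted, and the 4×4 image and 2×2 patch size are arbitrary illustrative values.

```python
def split_into_patches(image, patch_size):
    """Split an H x W image (list of rows) into flattened, non-overlapping
    patch_size x patch_size patches, returned in row-major order."""
    height, width = len(image), len(image[0])
    if height % patch_size or width % patch_size:
        raise ValueError("image dimensions must be divisible by patch_size")
    patches = []
    for top in range(0, height, patch_size):
        for left in range(0, width, patch_size):
            patches.append([
                image[r][c]
                for r in range(top, top + patch_size)
                for c in range(left, left + patch_size)
            ])
    return patches

# Toy 4x4 image whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
patches = split_into_patches(image, 2)
print(len(patches), patches[0])  # 4 [0, 1, 4, 5]
```

Each flattened patch plays the role of one input token for the transformer encoder.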

Keywords: Encoding (symbols)
