Search Results - Inner Mongolia University Library

Semantic deep learning and adaptive clustering for handling multimodal multimedia information retrieval

Multimedia Tools and Applications, 2025, Vol. 84, No. 13, pp. 11795-11831

Authors: Sattari, Saeid; Yazici, Adnan. Affiliations: Computer Engineering Department, Middle East Technical University, Ankara 06531, Turkey; Computer Science Department, Nazarbayev University, Astana 010000, Kazakhstan

Multimedia data encompasses various modalities, including audio, visual, and text, necessitating the development of robust retrieval methods capable of harnessing these modalities to extract and retrieve semantic information from multimedia sources. This paper presents a highly scalable and versatile end-to-end framework for multimodal multimedia information retrieval. The core strength of this system lies in its capacity to learn semantic contexts within individual modalities and across different modalities, achieved through the utilization of deep neural models. These models are trained using combinations of queries and relevant shots obtained from query logs. One of the distinguishing features of this framework is its ability to create shot templates, representing videos that have not been encountered previously. To enhance retrieval performance, the system employs clustering techniques to retrieve shots similar to these templates. To address the inherent uncertainty in multimodal concepts, an improved variant of fuzzy clustering is applied. Additionally, a fusion method incorporating an OWA operator is introduced. This method employs various measures to aggregate ranked lists produced by multiple retrieval systems. The proposed approach leverages parallel processing and transfer learning to extract features from three distinct modalities, ensuring the adaptability and scalability of the framework. To assess its effectiveness and efficiency, the system is rigorously evaluated through experiments conducted on six widely recognized multimodal datasets. Remarkably, our approach outperforms previous studies in the literature on four of these datasets, achieving performance improvements ranging from 1.5% to 10.1% over the best reported results in those studies. The experimental findings, substantiated by statistical tests, conclusively establish the effectiveness of the proposed approach in the field of multimodal multimedia information retrieval. © The Author(s), und

Keywords: Fuzzy clustering
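The abstract above mentions a fusion method that uses an OWA (ordered weighted averaging) operator to aggregate ranked lists from multiple retrieval systems. The following is a minimal sketch of the general OWA idea only, not the authors' method. The function names `owa` and `fuse_ranked_lists`, the shot identifiers, the weights, and the convention of scoring a missing shot as 0.0 are all assumptions made for illustration.

```python
from typing import Dict, List


def owa(scores: List[float], weights: List[float]) -> float:
    """Ordered weighted average: weights apply to scores sorted in descending order."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def fuse_ranked_lists(ranked_lists: List[Dict[str, float]],
                      weights: List[float]) -> List[str]:
    """Fuse per-system scores (shot id -> score in [0, 1]) into one ranking.

    Assumption: a shot missing from a system's list is scored 0.0 by that system.
    """
    shots = set().union(*ranked_lists)
    fused = {
        shot: owa([rl.get(shot, 0.0) for rl in ranked_lists], weights)
        for shot in shots
    }
    # Sort by fused score (descending), then by id for a deterministic order.
    return sorted(fused, key=lambda s: (-fused[s], s))


# Hypothetical scores from three retrieval systems.
systems = [
    {"a": 0.9, "b": 0.4},
    {"a": 0.2, "b": 0.8},
    {"b": 0.6, "c": 0.7},
]
print(fuse_ranked_lists(systems, [0.5, 0.3, 0.2]))  # ['b', 'a', 'c']
```

Because the weights attach to rank positions rather than to particular systems, a weight vector that favours the first positions behaves like an optimistic "max-like" fusion, while one that favours the last positions behaves like a cautious "min-like" fusion.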


Ensuring safety of vehicular cyber physical systems using machine learning and MQTT

International Journal of Vehicle Information and Communication Systems, 2025, Vol. 10, No. 2, pp. 165-183

Authors: Bagga, Neha; Kalra, Sheetal; Kaur, Parminder. Affiliations: Department of Computer Science, Guru Nanak Dev University, Punjab, Amritsar, India; School of Computer Science Engineering, Lovely Professional University, Punjab, Phagwara, India; Department of Computer Science and Engineering, G.N.D.U. Regional Campus, Punjab, Jalandhar, India

One of the pressing concerns for emerging nations is maintenance of roads, including identification and repair of pavement distress. Previous research has focused on pothole detection and lane identification, with the distress details being shared with drivers via database or an Android application. However, this approach is battery-intensive for sensors in smart vehicles and requires a regular internet connection. To address these issues, we have proposed a model trained using Python and TensorFlow to identify road distress and steep curves with an accuracy of 85.2% and 83.1%, respectively. The simulation uses Geocoder to capture the geographical coordinates of the distress, and the collected data is transferred to other CPS devices in cars using MQTT which outperforms databases and Android applications in terms of efficiency, sensor load, and internet connectivity. Drivers receive alerts within 10 seconds, allowing them to make informed decisions which helps prevent accidents and fatalities on the road. Copyright © 2025 Inderscience Enterprises Ltd.

Keywords: Highway accidents

Predicting heart disease based on an intelligent healthcare monitoring system using HPM-NIA

Multimedia Tools and Applications, 2025, Vol. 84, No. 13, pp. 11475-11501

Author: Alharbi, Meshal. Affiliation: Department of Computer Science, College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al-Kharj, Saudi Arabia

Treating cardiac patients effectively before a heart attack happens depends on precise heart disease prediction. This paper develops a heart disease prediction system that determines whether a patient has a heart disease condition. The system employs a diverse array of models, each serving a specific purpose to enhance accuracy and efficiency. Data pre-processing techniques, including normalization, standardization, and missing value removal, ensure the quality and consistency of the dataset. Feature extraction methods such as mean, median, standard deviation, and higher-order statistical features contribute to extracting informative features crucial for prediction. The Multi-Objective Forest Particle Swarm Optimization (MOFPSO) algorithm (a hybrid of Forest Optimization (FO) and Particle Swarm Optimization (PSO)) is introduced for efficient feature selection, balancing predictive accuracy and model complexity. Finally, the prediction model, Naïve Bayes trained with the Meliorated Ant Colony Optimization algorithm (NB-MACO), is implemented for its simplicity and effectiveness in handling medical datasets. This integration fine-tunes the Naïve Bayes classifier’s hyperparameters, optimizing its performance and resulting in an accuracy of 91.974%. The collective utilization of these models and techniques ensures the development of a robust and accurate heart disease prediction system. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Forecasting
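The abstract above lists mean, median, standard deviation, and higher-order statistics as extracted features. The following is a rough sketch of what such a feature vector could look like for one numeric signal, using only the Python standard library. The function name `statistical_features`, the choice of skewness and kurtosis as the higher-order features, and the use of population (not sample) moments are assumptions; the abstract does not specify them.

```python
import math
import statistics
from typing import Dict, Sequence


def statistical_features(values: Sequence[float]) -> Dict[str, float]:
    """Summary statistics of one signal, based on population central moments."""
    n = len(values)
    if n == 0:
        raise ValueError("values must not be empty")
    mean = statistics.fmean(values)
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std": math.sqrt(m2),
        # For a constant signal (m2 == 0) both ratios are undefined; report 0.0.
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }


features = statistical_features([1, 2, 3, 4, 5])
print(features)
```

For the symmetric input `[1, 2, 3, 4, 5]` the skewness is zero and the (non-excess) kurtosis is 1.7, which is a quick sanity check on the moment formulas.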


OmniPlane: A Recolorable Representation for Dynamic Scenes in Omnidirectional Videos

IEEE Transactions on Visualization and Computer Graphics, 2025, Vol. 31, No. 5, pp. 3387-3396

Authors: Kou, Simin; Zhang, Fang-Lue; Nazarenus, Jakob; Koch, Reinhard; Dodgson, Neil A. Affiliations: Victoria University of Wellington, School of Engineering and Computer Science, Wellington 6012, New Zealand; Kiel University, Department of Computer Science, Kiel 24118, Germany

Consumer-level omnidirectional video offers an economically viable means to create virtual reality (VR) assets, enabling users to explore and interact within a fully immersive visual environment. However, editing such videos, particularly those with 360° views and dynamic objects, poses significant challenges. Existing approaches to representing and manipulating omnidirectional content - whether designed for typical 2D perspective imagery or panoramas - often fail to adequately capture the complex spatiotemporal relationships crucial for producing high-quality, editable outputs in dynamic, panoramic settings. To overcome these challenges, we introduce OmniPlane, a novel method that leverages spherical spatiotemporal feature grids to empower the representation and editability of real-world dynamic omnidirectional environments casually captured by commodity omnidirectional cameras. OmniPlane computes spatiotemporal features by fusing vectors or matrices from each learnable spatial and spatiotemporal feature plane within a spherical coordinate system, complemented by a specifically designed weighted sampling strategy respecting the inherent spherical distribution of omnidirectional content. These learned feature planes can be flexibly decomposed into palette-based color bases. This innovative method not only enhances the representation capability of omnidirectional content and dynamics but also enables the recoloring of omnidirectional videos. Extensive experiments and a dedicated user study validate the superior performance of our proposed method in facilitating recolorable representations of dynamic omnidirectional environments. © 1995-2012 IEEE.

Keywords: Virtual environments

Genetic Algorithm-optimized k-nearest Neighbors and Support Vector Machines for Breast Cancer Detection in Resource-constrained Environments

IAENG International Journal of Computer Science, 2025, Vol. 52, No. 3, pp. 848-861

Authors: Alemu, Abebe; Girma, Anteneh; Abebe, Mesfin; Srinivasagan, Ramasamy. Affiliations: Adama Science and Technology University, Adama, Ethiopia; Computer Science/Cyber-Security department, University of the District of Columbia, Washington DC, United States; Computer Science and Engineering department, Adama Science and Technology University, Adama, Ethiopia; Computer Engineering department, CCSIT, King Faisal University, Al Hufuf, Saudi Arabia

Breast cancer poses a significant global threat, highlighting the urgent need for early detection to reduce mortality rates. Researchers are working to minimize the occurrence of false positives and false negatives, thereby improving the efficiency of breast cancer detection models. To achieve this, they employ advanced techniques such as artificial intelligence, machine learning, deep learning, and computational intelligence. Support vector machines (SVM) and k-nearest neighbors (KNN) are two popular lightweight machine-learning techniques; however, their effectiveness depends on proper feature selection and parameter tuning. Genetic algorithm optimization provides a solution by intelligently selecting relevant features and fine-tuning parameters, which enhances classification accuracy for early diagnosis. This study demonstrates the effectiveness of a hybrid computational intelligence model that utilizes genetic algorithms for feature selection. The proposed GAKNN-SVM model shows superior performance in detecting breast tumors, utilizing the Wisconsin Breast Cancer Diagnostic Dataset. The results indicate significant improvements, with accuracy, sensitivity, and specificity rates reaching 98.25%, 98.15%, and 98.41%, respectively, based on 171 test samples. Overall, genetic algorithms and machine learning approaches hold great promise for improving breast cancer detection accuracy, ultimately leading to better diagnostic outcomes and reduced mortality rates, especially in resource-constrained environments. © (2025), (International Association of Engineers). All rights reserved.

Keywords: Support vector machines
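The accuracy, sensitivity, and specificity figures quoted in the abstract above all derive from a confusion matrix. As a worked sketch: the counts below are hypothetical, chosen only because they sum to 171 test samples and reproduce the reported percentages. The abstract does not give the actual confusion matrix or state which class is treated as positive, and the function name `classification_rates` is likewise an assumption.

```python
from typing import Dict


def classification_rates(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard rates computed from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # correct predictions over all samples
        "sensitivity": tp / (tp + fn),      # true positive rate (recall)
        "specificity": tn / (tn + fp),      # true negative rate
    }


# Hypothetical counts (NOT taken from the paper): 106 + 2 + 62 + 1 = 171 samples.
rates = classification_rates(tp=106, fn=2, tn=62, fp=1)
for name, value in rates.items():
    print(f"{name}: {value:.2%}")
# accuracy: 98.25%
# sensitivity: 98.15%
# specificity: 98.41%
```

With only 171 test samples, each misclassification moves accuracy by roughly 0.58 percentage points, so differences of that size between models on this test set rest on a single sample.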


Enhancing User Experience in AI-Powered Human-computer Communication with Vocal Emotions Identification Using a Novel Deep Learning Method

Computers, Materials & Continua, 2025, Vol. 82, No. 2, pp. 2909-2929

Authors: Ahmed Alhussen; Arshiya Sajid Ansari; Mohammad Sajid Mohammadi. Affiliations: Department of Computer Engineering, College of Computer and Information Sciences, Majmaah University, Al-Majmaah 11952, Saudi Arabia; Department of Information Technology, College of Computer and Information Sciences, Majmaah University, Al-Majmaah 11952, Saudi Arabia; Department of Computer Science, College of Engineering and Information Technology, Onaizah Colleges, Qassim 51911, Saudi Arabia

Voice, motion, and mimicry are naturalistic control modalities that have replaced text or display-driven control in human-computer communication (HCC). Specifically, the vocals contain a lot of knowledge, revealing details about the speaker’s goals and desires, as well as their internal condition. Certain vocal characteristics reveal the speaker’s mood, intention, and motivation, while word study assists the speaker’s demand to be understood. Voice emotion recognition has become an essential component of modern HCC networks. Integrating findings from the various disciplines involved in identifying vocal emotions is also challenging. Many sound analysis techniques were developed in the past. With the development of artificial intelligence (AI), and especially Deep Learning (DL) technology, research incorporating real data is becoming increasingly common. Thus, this research presents a novel selfish herd optimization-tuned long/short-term memory (SHO-LSTM) strategy to identify vocal emotions in human communication. The RAVDESS public dataset is used to train the suggested SHO-LSTM technique. Wiener filter (WF) and Mel-frequency cepstral coefficient (MFCC) techniques are used, respectively, to remove noise and extract features from the data. LSTM and SHO are applied to the extracted data to optimize the LSTM network’s parameters for effective emotion recognition. Python software was used to execute our proposed framework. In the finding assessment phase, numerous metrics are used to evaluate the proposed model’s detection capability, such as F1-score (95%), precision (95%), recall (96%), and accuracy (97%). The suggested approach is tested on a Python platform, and the SHO-LSTM’s outcomes are contrasted with those of other previously conducted research. Based on comparative assessments, our suggested approach outperforms the current approaches in vocal emotion recognition.

Keywords: Human-computer communication (HCC); vocal emotions; live vocal; artificial intelligence (AI); deep learning (DL); selfish herd optimization-tuned long/short-term memory (SHO-LSTM)

White shark optimizer via support vector machine for video-based gender classification system

Multimedia Tools and Applications, 2025, pp. 1-17

Authors: Oyediran, Mayowa Oyedepo; Ajagbe, Sunday Adeola; Ojo, Olufemi Samuel; Alshahrani, Reem; Awodoye, Olufemi O.; Adigun, Matthew O. Affiliations: Department of Computer Engineering, Ajayi Crowther University, Oyo, Nigeria; Department of Computer Science, University of Zululand, Kwadlangezwa 3886, South Africa; Department of Computer Engineering, Abiola Ajimobi Technical University, Ibadan 200255, Nigeria; Department of Computer Science, College of Computers and IT, Taif University, P.O. Box 11099, Taif 21944, Saudi Arabia; Department of Computer Engineering, Ladoke Akintola University of Technology, Ogbomoso, Nigeria

Gender identification from videos is a challenging task with significant real-world applications, such as video content analysis and social behavior research. In this study, we propose a novel approach, the White Shark Optimizer-Support Vector Machine (WSO-SVM), tailored specifically for gender identification from video data. The WSO-SVM integrates the White Shark Optimizer, a bio-inspired optimization algorithm mimicking the hunting behavior of white sharks, with the Support Vector Machine, a powerful machine learning technique for classification. By combining these two methods, we aim to exploit the advantages of both algorithms and enhance gender identification accuracy. To evaluate the performance of the WSO-SVM in gender identification, the work conducted extensive experiments using a diverse dataset of video clips containing individuals of various genders and backgrounds. The work compared the results with conventional SVM-based gender identification and state-of-the-art methods. The findings demonstrate that the WSO-SVM achieves superior accuracy in gender identification compared to traditional SVM-based approaches. The WSO-SVM's ability to efficiently explore the solution space and select optimal SVM parameters contributes to its improved performance. Moreover, the WSO-SVM exhibits robustness in handling variations in lighting conditions, poses, and facial expressions, making it well-suited for real-world video-based gender identification tasks. The outcomes derived from the SVM approach demonstrate that WSO-SVM produced an average FPR of 7.14%, sensitivity of 93.06%, specificity of 92.86%, precision of 91.0%, and overall accuracy of 93.00%, with a recognition time of 45.83 s. © The Author(s) 2025.

Keywords: Support vector machines

SkinMultiNet: Advancements in Skin Cancer Prediction Using Deep Learning with Web Interface

Biomedical Materials and Devices, 2025, Vol. 3, No. 1, pp. 621-637

Authors: Likhon, Md Nur Hosain; Rana, Sahab Uddin; Akter, Sadeka; Ahmed, Md. Shorup; Tanha, Khadiza Akter; Rahman, Md. Mahbubur; Nayeem, Md Emran Hussain. Affiliations: Department of Computer Science and Engineering, Dhaka International University, Dhaka, Bangladesh; Department of Computer Science and Engineering, Dhaka University of Engineering and Technology, Gazipur, Bangladesh; Department of Computer Science and Engineering, Bangladesh University of Business and Technology, Dhaka, Bangladesh

Cancer remains the leading cause of death worldwide, significantly impacting individuals and healthcare systems alike. In recent decades, skin cancer has surged in prevalence compared to other major cancer types. Various factors such as texture, color, morphological characteristics, and structure are employed in categorizing different forms of skin cancer. However, traditional methods of identification often prove time-consuming and costly. Skin cancer classification predominantly relies on machine learning, with the primary method being convolutional neural networks (CNNs). Our 'SkinMultiNet' framework, presented in this study and based on transfer learning principles, integrates the InceptionV3 and Xception CNN models for predicting skin cancer using image data. While other machine learning models such as ResNet50, NasNet, and MobileNet were explored, the 'SkinMultiNet' framework demonstrated the most promising outcomes. Utilizing a publicly available dataset comprising 6086 skin images, we trained, tested, and evaluated our models extensively. The proposed system employed a train generator to feed image data into our deep learning CNN models, followed by implementing a learning rate reducer on the datasets within the model. Through rigorous testing and validation procedures, our models successfully processed a substantial volume of skin image data. In contrast to conventional approaches, our proposed architecture offers the potential for more reliable diagnoses, achieving an optimal accuracy rate of 94% in skin cancer prediction. This advancement holds promise for early detection and improved patient outcomes following therapy. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Convolutional neural network; Deep learning; Django; Skin cancer; Web interface

Intrumer: A Multi Module Distributed Explainable IDS/IPS for Securing Cloud Environment

Computers, Materials & Continua, 2025, Vol. 82, No. 1, pp. 579-607

Authors: Nazreen Banu A; S.K.B. Sangeetha. Affiliation: Department of Computer Science and Engineering, SRM Institute of Science and Technology, Vadapalani Campus, Chennai 600026, Tamil Nadu, India

The increasing use of cloud-based devices has reached the critical point of cybersecurity and unwanted network *** environments pose significant challenges in maintaining privacy and *** approaches, such as IDS, have been developed to tackle these ***, most conventional Intrusion Detection System (IDS) models struggle with unseen cyberattacks and complex high-dimensional *** fact, this paper introduces the idea of a novel distributed explainable and heterogeneous transformer-based intrusion detection system, named INTRUMER, which offers balanced accuracy, reliability, and security in cloud settings by multiple modules working together within *** traffic captured from cloud devices is first passed to the TC&TM module in which the Falcon Optimization Algorithm optimizes the feature selection process, and Naive Bayes algorithm performs the classification of *** selected features are classified further and are forwarded to the Heterogeneous Attention Transformer (HAT) *** this module, the contextual interactions of the network traffic are taken into account to classify them as normal or malicious *** classified results are further analyzed by the Explainable Prevention Module (XPM) to ensure trustworthiness by providing interpretable *** the explanations from the classifier, emergency alarms are transmitted to nearby IDS modules, servers, and underlying cloud devices for the enhancement of preventive *** experiments on benchmark IDS datasets CICIDS 2017, Honeypots, and NSL-KDD were conducted to demonstrate the efficiency of the INTRUMER model in detecting network traffic with high accuracy for different *** outperforms state-of-the-art approaches, obtaining better performance metrics: 98.7% accuracy, 97.5% precision, 96.3% recall, and 97.8% *** results validate the robustness and effectiveness of INTRUMER in securing diverse cloud environments against sophisticated cyber threats.

Keywords: Cloud computing; intrusion detection system; transformers; explainable artificial intelligence (XAI)

Enhancing the efficiency of lung cancer screening: predictive models utilizing deep learning from CT scans

Neural Computing and Applications, 2025, pp. 1-19

Authors: Tawfeek, Medhat A.; Alrashdi, Ibrahim; Alruwaili, Madallah; Shaban, Warda M.; Talaat, Fatma M. Affiliations: Department of Computer Science, Faculty of Computers and Information, Menoufia University, Shebin Elkom 32511, Egypt; Department of Computer Science, College of Computer and Information Sciences, Jouf University, Sakakah, Saudi Arabia; Department of Computer Engineering and Networks, College of Computer and Information Sciences, Jouf University, Sakakah, Saudi Arabia; Department of Communication and Electronics Engineering, Nile Higher Institute for Engineering and Technology, Mansoura, Egypt; Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, Egypt; Faculty of Computer Science and Engineering, New Mansoura University, Gamasa 35712, Egypt

Lung cancer is the most lethal form of cancer. This paper introduces a novel framework, called the lung cancer risk prediction (LCRP) model, to discern and classify pulmonary disorders such as pneumonia, tuberculosis, and lung cancer by analyzing conventional X-ray and CT scan images. LCRP has four modules, namely data collection and preprocessing, data augmentation, image segmentation, and prediction. LCRP employs three deep learning models (a sequential model, a functional model, and a transfer model) on publicly available training datasets. Convolutional neural networks (CNNs) have emerged as a highly effective technique in machine learning, particularly for image datasets in biomedical applications. The primary goal is to validate these models by comparing their performance with other models in order to determine their effectiveness in addressing challenging datasets. Our research has revealed a noteworthy enhancement in the efficiency of binary and multi-class classification using mask R-CNN image segmentation. During the model training process, a combination of Adam and stochastic gradient descent dual optimizers has been used to improve performance. LCRP has outperformed current pre-trained models by minimizing training parameters, computational costs, and overhead. It achieves 98.5% accuracy, 88.7% specificity, 89% sensitivity, 89.2% precision, and 89.09% F-measure. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.

Keywords: Lung cancer
