Search results - Inner Mongolia University Library

2024 IEEE International Conference on Robotics and Biomimetics, ROBIO 2024

Authors: Zhao, Lei; Zhao, Hanyuan; Feng, Zhiyu; Wang, Sicheng; Liu, Huaping; Sun, Fuchun; Fang, Bin; Yin, Jianqin. Tsinghua University Department of Computer Science and Technology Beijing 100083 China School of Artificial Intelligence Beijing University of Posts and Telecommunications Beijing 1000876 China School of Software Beijing 100083 China Tandon School of Engineering New York University China

ISBN: (print) 9781665481090

In this article, we design a novel variable-stiffness soft gripper that combines the advantages of flexible and rigid grippers. It can perform two different tasks: pinching and enveloping. The soft gripper comprises spring steel sheets, memory alloy sheets, tendons, and servo motors, offering the advantages of low cost and simple structure. It can freely switch between pinching and enveloping by heating the memory alloy sheets and controlling the tendons with servo motors. Finally, to validate the effectiveness of the soft gripper and the algorithm, we constructed experimental platforms for autonomous grasping in virtual and real environments. Experimental results demonstrate that our variable-stiffness soft gripper exhibits excellent grasping performance, stably pinching and enveloping objects of various shapes and materials, and achieves a grasping success rate of 90%, surpassing the performance of using only precision pinching or enveloping. © 2024 IEEE.

Keywords: Grippers

Efficient Region Proposal Extraction of Small Lung Nodules Using Enhanced VGG16 Network Model

Annual IEEE Symposium on Computer-Based Medical Systems

Authors: Yadollah Zamanidoost; Nada Alami-Chentoufi; Tarek Ould-Bachir; Sylvain Martel. Department of Computer and Software Engineering MOTCE Laboratory Polytechnique Montréal Montréal QC Canada Department of Computer and Software Engineering Nanorobotics Laboratory Polytechnique Montréal Montréal QC Canada

The efficiency of state-of-the-art convolutional networks trained to detect lung cancer nodules depends on their feature extraction model. Various feature extraction models based on convolutional networks have been proposed, such as VGG-Net and ResNet. It has been demonstrated that such models effectively extract features from objects in an image. However, their efficacy is limited when the objects of interest are very small, such as lung nodules. One of the widely used feature extraction models for detecting small objects is the VGG16 network. The model, which has a small kernel of $3 \times 3$ and optimal layers, can extract the features of small objects with reasonable accuracy. In this article, feature maps are created by combining the last three layers of the VGG16 network to extract features of nodules of various sizes. This study utilizes a Region Proposal Network (RPN) to compare the accuracy of the feature map created by the proposed method with that of the original VGG16. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. RPNs are trained end-to-end to generate high-quality region proposals, which Faster R-CNN uses for detection. In this article, we select 300, 1,000, and 2,000 regions chosen by the RPN for each method; then, we calculate the recall for different Intersection over Union (IoU) ratios with ground-truth boxes. The results show that the feature map of the proposed method performs better than the feature maps of different layers of VGG16 for extracting nodules of various sizes. Also, when the number of selected region proposals is reduced, the recall of the proposed method changes less than that of the other methods.
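
The recall-at-IoU evaluation described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the box format `(x1, y1, x2, y2)`, the function names `iou` and `recall_at_iou`, and the rule that a ground-truth box counts as recalled when at least one proposal reaches the IoU threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(proposals: List[Box], ground_truth: List[Box],
                  threshold: float) -> float:
    """Fraction of ground-truth boxes matched by at least one proposal
    whose IoU with that box is >= threshold (hypothetical matching rule)."""
    if not ground_truth:
        return 0.0
    hits = sum(
        1 for gt in ground_truth
        if any(iou(p, gt) >= threshold for p in proposals)
    )
    return hits / len(ground_truth)
```

To mimic the comparison in the abstract, one would keep the top 300, 1,000, or 2,000 proposals ranked by objectness score and call `recall_at_iou` at several thresholds.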

Search and Recommendation Systems with Metadata Extensions

International Conference on Advanced Communication Technology (ICACT)

Authors: Woo-Hyeon Kim; Joo-Chang Kim. Division of AI Computer Science and Computer Engineering Kyonggi University South Korea Contents Convergence Software Research Institute Kyonggi University South Korea

This paper proposes an AI-based video metadata extension model to overcome the limitations of video search and recommendation systems in the multimedia industry. Current video searches and recommendations utilize pre-added metadata, which includes filenames, keywords, tags, genres, etc. This makes it impossible to make direct predictions about the content of a video without pre-added metadata. These platforms also analyze your previous search history, viewing history, etc. to understand your interests in order to serve you personalized videos. This may not reflect the actual content and may raise privacy concerns. In addition, recommendation systems suffer from a cold start problem, which is the lack of an initial target, as well as a bubble effect. Therefore, this study proposes a search and recommendation system that expands the metadata of videos using techniques such as shot boundary detection, speech recognition, and text mining. The proposed method selects the main objects required by the recommendation system based on object frequency and extracts the corresponding objects from the video frame by frame. In addition, we extract the speech from the video separately, convert the speech to text to obtain the script, and apply text mining techniques to the extracted script to quantify it. Then, we synchronize the object frequency and the transcript to create a single set of contextual data. After that, we group videos and clips based on the contextual data and index them. Finally, we utilize shot boundary detection to segment videos based on their content. To ensure that the generated contextual data is appropriate for the video, the proposed model compares the extracted script with the video's subtitle data to check and calibrate its accuracy. The model can then be fine-tuned by tuning and cross-validating the hyperparameters to improve its performance. These models can be incorporated into a variety of content discovery and recommendation platforms. By using expanded

DurFlex-EVC: Duration-Flexible Emotional Voice Conversion Leveraging Discrete Representations Without Text Alignment

IEEE Transactions on Affective Computing, 2025

Authors: Oh, Hyung-Seok; Lee, Sang-Hoon; Cho, Deok-Hyeon; Lee, Seong-Whan. Korea University Department of Artificial Intelligence Seongbuk-gu Seoul 02841 Korea Republic of Ajou University Department of Software and Computer Engineering Korea Republic of

Emotional voice conversion (EVC) involves modifying various acoustic characteristics, such as pitch and spectral envelope, to match a desired emotional state while preserving the speaker's identity. Existing EVC methods often rely on text transcriptions or time-alignment information and struggle to handle varying speech durations effectively. In this paper, we propose DurFlex-EVC, a duration-flexible EVC framework that operates without the need for text or alignment information. We introduce a unit aligner that models contextual information by aligning speech with discrete units representing content, eliminating the need for text or speech-text alignment. Additionally, we design a style autoencoder that effectively disentangles content and emotional style, allowing precise manipulation of the emotional characteristics of the speech. We further enhance emotional expressiveness through a hierarchical stylize encoder that applies the target emotional style at multiple hierarchical levels, refining the stylization process to improve the naturalness and expressiveness of the converted speech. Experimental results from subjective and objective evaluations demonstrate that our approach outperforms baseline models, effectively handling duration variability and enhancing emotional expressiveness in the converted speech. © 2010-2012 IEEE.

Keywords: Speech enhancement

Comparative Analysis of Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU) and Transformer Models in Predicting Stock Prices

IEEE International Conference on Emerging & Sustainable Technologies for Power & ICT in a Developing Society (NIGERCON)

Authors: Chinecherem Umezuruike Deborah Olaniyan Julius Olaniyan Abidemi Emmanuel Adeniyi Adedoyin Oyebade David Abaneme Department of Software Engineering Bowen University Iwo Department of Computer Science Bowen University Iwo Department of Computer Science University of Ilorin Ilorin Nigeria

ISBN: (digital) 9798331542559

ISBN: (print) 9798331542566

This research explores the application of LSTM, GRU and Transformer models for predicting stock prices, aiming to enhance accuracy in financial forecasting. Stock price prediction is crucial for investment decision-making, yet challenging due to market volatility and complex patterns. The objectives are to evaluate the performance of LSTM, GRU and Transformer models using key metrics such as test loss, MAE, and MSE, and to compare their predictive capabilities. The LSTM model demonstrates robust performance with low test loss and MAE, indicating precise predictions and effective pattern recognition in financial data. The Transformer model also shows promising results with relatively low test loss and MAE, albeit with larger errors in the MSE and MAE metrics. Both models highlight the potential for accurate stock price prediction, suggesting avenues for future research to optimize model performance and reliability in financial forecasting applications. The experimental results show that GRU outperformed LSTM and Transformer with an MSE of 0.0008, an MAE of 0.0023, and a high test accuracy of 0.9833.
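
For reference, the two error metrics named in this abstract can be computed as in the sketch below. The function names and the plain-list inputs are assumptions for illustration; the abstract does not say how the authors scaled prices or implemented the metrics.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average of squared differences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error: average of absolute differences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Because MSE squares each error, a few large misses raise it much more than they raise MAE, which is one reason the two metrics can rank models differently.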

Keywords: Measurement; Analytical models; Accuracy; Refining; Predictive models; Transformers; Pattern recognition; Reliability; Forecasting; Long short term memory

PROPER: Personality Recognition based on Public Speaking using Electroencephalography Recordings

IEEE Access, 2024, pp. 1-1

Authors: Majid, Muhammad; Butt, Amna Rauf; Nizami, Imran Fareed; Arsalan, Aamir; Ryu, Jihyoung. Department of Computer Engineering University of Engineering and Technology Taxila Taxila Pakistan Department of Electrical Engineering Bahria University Islamabad Pakistan Department of Software Engineering Fatima Jinnah Women University Rawalpindi Pakistan Gwangju Korea

Personality trait recognition is an important psychological paradigm for understanding the differences in people’s behavior. This paper presents a new dataset, which we dub PROPER (Personality Recognition based On Public Speaking using Electroencephalography Recordings), that connects the personality traits of an individual with public speaking activity via electroencephalography (EEG) signals. EEG data of 40 healthy individuals is recorded before, during, and after public speaking activity using a Muse headband. A score from the Big Five Personality Trait questionnaire is used to label the participant’s EEG data. A statistical analysis of EEG signals for each personality trait during different phases of the experiment is performed. The personality recognition process involves data acquisition, pre-processing, feature extraction and selection, and classification. Five feature groups are extracted from the frequency bands of EEG data of each channel. Feature selection is applied to the extracted features via the wrapper method. Support vector machine, Naive Bayes, and multilayer perceptron (MLP) classifiers are used to classify the personality traits. An average F1-score of 0.95 for extroversion, 0.94 for openness to experience, 0.90 for conscientiousness, 0.84 for neuroticism, and 0.85 for agreeableness is achieved using the MLP classifier using pre-stimulus, during activity, and post-stimulus EEG data respectively.

Keywords: Feature extraction

Enhancing Choroidal Nevus Position Identification through CNN-Based Segmentation of Eye Fundus Images

46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBC 2024

Authors: Eshragh, Mohammadmahdi; A. Mohammed, Emad; Far, Behrouz; Crump, Trafford; Weis, Ezekiel. University of Calgary Dept of Electrical & Software Engineering Calgary Canada Wilfrid Laurier University Dept of Physics and Computer Science Waterloo Canada University of Calgary Dept of Surgery Calgary Canada University of Alberta Dept of Ophthalmology & Visual Science Edmonton Canada

ISBN: (print) 9798350371499

Diagnosing choroidal nevus in color fundus images is challenging for clinicians not regularly practicing it. Machine learning (ML) has proven effective in detecting and analyzing such abnormalities with high accuracy and efficiency. This research is part of a larger project to develop a decision support system for choroidal nevus diagnosis, focusing on creating a segmentation algorithm to identify key areas in color fundus images. The study evaluates and compares the efficacy of various convolutional neural network (CNN) segmentation models, a crucial step for improved image analysis accuracy. Fundus images from the Alberta Ocular Brachytherapy Program, including healthy and choroidal nevus-affected eyes, were used. An ocular oncologist provided a ground truth mask dataset for training the models. Preprocessing improved image features, and multiple CNN models segmented the images to detect lesions. Model performance was compared to find the most accurate and efficient approach, with external validation using a separate test set and ophthalmology experts. Four CNN models - U-net, Residual U-net, Attention U-net, and a voting-based Ensemble - were developed for segmentation. Their effectiveness was measured by accuracy metrics, achieving Dice Coefficient scores of 85.02%, 85.66%, 86.89%, and 87.7% respectively. © 2024 IEEE.
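
The Dice Coefficient reported in this abstract measures overlap between a predicted mask and the ground-truth mask. A minimal sketch of the standard formula, 2|A ∩ B| / (|A| + |B|), follows. The nested-list mask format, the function name, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, not details taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # assumed format: rows of 0/1 pixel labels


def dice_coefficient(pred: Mask, truth: Mask) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks."""
    if len(pred) != len(truth) or any(
        len(p_row) != len(t_row) for p_row, t_row in zip(pred, truth)
    ):
        raise ValueError("masks must have the same shape")
    intersection = sum(
        p & t for p_row, t_row in zip(pred, truth)
        for p, t in zip(p_row, t_row)
    )
    total = sum(map(sum, pred)) + sum(map(sum, truth))
    # Assumed convention: two empty masks count as a perfect match.
    return 2.0 * intersection / total if total > 0 else 1.0
```

A score of 87.7% on this scale corresponds to a return value of 0.877.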

Keywords: Convolutional neural networks

Optimizing Variable-Strength Combinatorial Test Suite Generation via Fuzzy Adaptive Sine Cosine Algorithm (FASCA)

1st International Conference on Cyber Security and Computing, CyberComp 2024

Authors: Zabidi, Nur Syabila Ibrahim, Noraini Rejab, Mazidah Mat Mamat, MasrulEhsan Nazir, Sumaira Mat Tuselim, Nurul Hawani Faculty of Computer Science and Information Technology Parit Raja Batu Pahat Malaysia National University of Modern Languages Department of Software Engineering Islamabad Pakistan Industrial Research Management Centre SIRIM Berhad Business Intelligence Section Shah Alam Malaysia

ISBN: (print) 9798350387728

Combinatorial test suite generation is a critical aspect of software testing, particularly for systems with variable-strength interactions. Traditional optimization algorithms often struggle to efficiently generate minimal test suites while maintaining coverage. This research introduces the Fuzzy Adaptive Sine Cosine Algorithm (FASCA), a variant of the Sine Cosine Algorithm (SCA), tailored specifically for optimizing variable-strength combinatorial test suite generation. FASCA incorporates fuzzy adaptation mechanisms to dynamically balance exploration and exploitation, addressing the limitations of standard SCA in handling diverse interaction strengths. Experimental evaluations demonstrate that FASCA significantly reduces test suite size and execution time compared to the original SCA, offering a more efficient solution for combinatorial test optimization. These results highlight FASCA's potential as a robust tool for enhancing the efficiency of software testing processes. © 2024 IEEE.

Keywords: Software testing

F-IKOS: An Abstract Interpretation-based Static Analyzer for Fortran Programs

12th International Workshop on Quantitative Approaches to Software Quality, QuASoQ 2024

Authors: Zou, Sheng; Chen, Liqian; Fan, Guangsheng; Huang, Renjie; Yin, Banghu. College of Computer Science and Technology National University of Defense Technology Changsha 410073 China State Key Laboratory of Complex & Critical Software Environment Changsha 410073 China College of Systems Engineering National University of Defense Technology Changsha 410073 China

The Fortran programming language is widely utilized in numerical computation and scientific computing. Fortran programs are prone to potential runtime errors related to numerical properties due to the large number of numerical operations. In this paper, we present F-IKOS, an abstract interpretation-based static analyzer for Fortran programs on top of IKOS, which soundly handles floating-point types in Fortran programs. Firstly, we translate Fortran programs to LLVM IR using compiler front-end Flang. After that, we extend IKOS to support sound floating-point analysis and then employ it to analyze the translated LLVM IR. Particularly, when analyzing floating-point types in programs, we first abstract floating-point expressions into real-number expressions with interval coefficients, and then linearize these expressions into real-number expressions with scalar coefficients. These linear expressions are subsequently handled by abstract domains originally designed for real-number types to produce sound analysis results. We have conducted experiments on representative Fortran programs to show the efficiency and effectiveness of F-IKOS. The experimental results are encouraging: F-IKOS soundly analyzes runtime errors in complex programs, outperforming other analyzers. © 2024 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0).

Keywords: FORTRAN (programming language)

Medical Transformer with Mix Mask Generation for Thorax Disease Classification

IEEE Transactions on Multimedia, 2025

Authors: Liu, Ziyi; Wang, Zengmao; Du, Bo. Wuhan University National Engineering Research Center for Multimedia Software School of Computer Science Artificial Intelligence Institute of Wuhan University Wuhan 430072 China

Chest X-ray images are widely used in clinical diagnosis and treatment planning for thoracic disease. The processing of medical images has attracted great attention in the machine learning community. However, labeled medical images are limited and the regions of lesions are usually much smaller than the image. Most existing methods are prone to learning spurious correlations for classification, resulting in poor generalization. In this paper, we propose a medical generation transformer network based on self-supervised learning and an adversarial strategy to capture the discriminative label-relevant regions with lesions in the images by extending the chest X-ray images. In the proposed method, we first localize the label-relevant regions in each transformer layer. Then we keep the label-relevant regions to mask the image and reconstruct the masked image with self-supervised learning. Thus we can generate more images to fine-tune the classification network with masked images that keep the label-relevant regions. Since the generated images are usually noisy for fine-tuning the classification network, we adopt adversarial probabilities to weight the importance of each generated image for training. Experimental results on two large-scale and popular chest X-ray datasets show that the proposed method can efficiently leverage the location of lesions to improve classification performance. © 2025 IEEE.

Keywords: Self-supervised learning
