Internet of Things (IoT)-enabled Wireless Sensor Networks (WSNs) not only constitute an encouraging research domain but also represent a promising industrial trend that permits the development of various IoT-based ...
In recent years, mental health issues have profoundly impacted individuals’ well-being, necessitating prompt identification and intervention. Existing approaches grapple with the complex nature of mental health, facing challenges such as task interference, limited adaptability, and difficulty in capturing the nuanced linguistic expressions indicative of various conditions. In response to these challenges, our research presents three novel models that employ multi-task learning (MTL) to understand mental health behaviors comprehensively. These models are soft-parameter sharing-based long short-term memory with attention mechanism (SPS-LSTM-AM), SPS-based bidirectional gated neural networks with self-head attention mechanism (SPS-BiGRU-SAM), and SPS-based bidirectional neural network with multi-head attention mechanism (SPS-BNN-MHAM). Our models address diverse tasks. The auxiliary tasks are detecting disorders such as bipolar disorder, insomnia, obsessive-compulsive disorder, and panic in psychiatric texts, and classifying social media texts as suicide-related or non-suicide-related. Emotion detection in suicide notes, covering emotions of abuse, blame, and sorrow, serves as the main task. We observe a significant performance enhancement in the primary task when the auxiliary tasks are incorporated. Advanced encoder-building techniques, including auto-regressive-based permutation and enhanced permutation language modeling, are recommended for effectively capturing the subtleties, semantic nuances, and syntactic structures of mental health contexts. We present a shared feature extractor, called shared auto-regressive for language modeling (S-ARLM), to capture high-level representations that are useful across tasks. Additionally, we recommend three soft-parameter sharing (SPS) subtypes (fully sharing, partial sharing, and independent layer) to minimize tight coupling and enhance adaptability. Our models exhibit outstanding performance across various datasets, achieving accuracies of 96.9%, 97.
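The abstract above does not state how its soft-parameter sharing is formulated. As a rough illustration only, the sketch below shows one common form of soft sharing, in which each task keeps its own parameters and a penalty on the squared L2 distance between them is added to the summed task losses. The function names, the `strength` coefficient, and the toy numbers are all hypothetical and are not taken from the paper.

```python
def soft_sharing_penalty(task_weights, strength=0.01):
    """Penalise the squared L2 distance between every pair of task weight vectors.

    Assumption: this is a generic soft-sharing regulariser, not the paper's
    SPS formulation, which the abstract does not specify.
    """
    penalty = 0.0
    for i in range(len(task_weights)):
        for j in range(i + 1, len(task_weights)):
            penalty += sum(
                (a - b) ** 2 for a, b in zip(task_weights[i], task_weights[j])
            )
    return strength * penalty


def total_loss(task_losses, task_weights, strength=0.01):
    """Sum the per-task losses and add the soft-sharing penalty."""
    return sum(task_losses) + soft_sharing_penalty(task_weights, strength)


if __name__ == "__main__":
    # Hypothetical toy values: two tasks, two parameters each.
    weights = [[0.0, 0.0], [3.0, 4.0]]
    print(total_loss([0.5, 0.25], weights, strength=0.1))
```

With a larger `strength`, the task-specific parameters are pulled closer together, which approaches hard sharing; with `strength` set to zero, the tasks are trained independently.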
Accidents caused by drivers who exhibit unusual behavior are putting road safety at ever-greater risk. When one or more vehicle nodes behave in this way, it can put other nodes in danger and result in potentially cata...
Large models have recently played a dominant role in natural language processing and multimodal vision-language learning. However, their effectiveness in text-related visual tasks remains relatively unexplored. In this paper, we conducted a comprehensive evaluation of large multimodal models, such as GPT4V and Gemini, in various text-related visual tasks including text recognition, scene text-centric visual question answering (VQA), document-oriented VQA, key information extraction (KIE), and handwritten mathematical expression recognition (HMER). To facilitate the assessment of optical character recognition (OCR) capabilities in large multimodal models, we propose OCRBench, a comprehensive evaluation benchmark. OCRBench contains 29 datasets, making it the most comprehensive OCR evaluation benchmark available. Furthermore, our study reveals both the strengths and weaknesses of these models, particularly in handling multilingual text, handwritten text, non-semantic text, and mathematical expression recognition. Most importantly, the baseline results presented in this study could provide a foundational framework for the conception and assessment of innovative strategies targeted at enhancing zero-shot multimodal [...] evaluation pipeline and benchmark are available at https://***/Yuliang-Liu/Multimodal OCR.
People who have trouble communicating verbally are often dependent on sign language, which can be difficult for most people to understand, making interaction with them a difficult [...] Sign Language Recognition (SLR) system takes an input expression from a hearing or speaking-impaired person and outputs it in the form of text or voice to a normal [...] existing study related to the Sign Language Recognition system has some drawbacks, such as a lack of large datasets and datasets with a range of backgrounds, skin tones, and [...] research efficiently focuses on Sign Language Recognition to overcome previous [...] importantly, we use our proposed Convolutional Neural Network (CNN) model, “ConvNeural”, in order to train our [...], we develop our own datasets, “BdSL_OPSA22_STATIC1” and “BdSL_OPSA22_STATIC2”, both of which have ambiguous backgrounds. “BdSL_OPSA22_STATIC1” and “BdSL_OPSA22_STATIC2” both include images of Bangla characters and numerals, a total of 24,615 and 8,437 images, respectively. The “ConvNeural” model outperforms the pre-trained models with an accuracy of 98.38% for “BdSL_OPSA22_STATIC1” and 92.78% for “BdSL_OPSA22_STATIC2”. For the “BdSL_OPSA22_STATIC1” dataset, we get precision, recall, F1-score, sensitivity, and specificity of 96%, 95%, 95%, 99.31%, and 95.78%, respectively. In the case of the “BdSL_OPSA22_STATIC2” dataset, we achieve precision, recall, F1-score, sensitivity, and specificity of 90%, 88%, 88%, 100%, and 100%, respectively.
Effective management of electricity consumption (EC) in smart buildings (SBs) is crucial for optimizing operational efficiency, saving costs, and ensuring sustainable resource utilization. Accurate EC prediction enabl...
Deep learning methods have played a prominent role in the development of computer visualization in recent years. Hyperspectral imaging (HSI) is a popular analytical technique based on spectroscopy and visible imaging ...
In the era of advanced machine learning techniques, the development of accurate predictive models for complex medical conditions, such as thyroid cancer, has shown remarkable [...] predictive models for thyroid cancer enhance early detection, improve resource allocation, and reduce [...], the widespread adoption of these models in clinical practice demands predictive performance along with interpretability and [...] paper proposes a novel association-rule based feature-integrated machine learning model which shows better classification and prediction accuracy than present [...] study also focuses on the application of SHapley Additive exPlanations (SHAP) values as a powerful tool for explaining thyroid cancer prediction [...] the proposed method, the association-rule based feature integration framework identifies frequently occurring attribute combinations in the [...] original dataset is used in training machine learning models, and further used in generating SHAP values [...] the next phase, the dataset is integrated with the dominant feature sets identified through association-rule based [...] new integrated dataset is used in re-training the machine learning [...] new SHAP values generated from these models help in validating the contributions of feature sets in predicting [...] conventional machine learning models lack interpretability, which can hinder their integration into clinical decision-making [...] this study, the SHAP values are introduced along with association-rule based feature integration as a comprehensive framework for understanding the contributions of feature sets in modelling the [...] study discusses the importance of reliable predictive models for early diagnosis of thyroid cancer, and a validation framework of [...] proposed model shows an accuracy of 93.48%. Performance metrics such as precision, recall, F1-score, and the area un
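The abstract above mentions identifying frequently occurring attribute combinations through association rules but gives no algorithmic detail. The sketch below is only a generic illustration of the two standard quantities involved, support and confidence, computed by brute force over a toy set of records. The function names, the `max_size` limit, and the attribute labels `a`, `b`, `c` are hypothetical placeholders, not features or code from the study.

```python
from itertools import combinations


def frequent_itemsets(records, min_support, max_size=2):
    """Return {itemset: support} for attribute combinations meeting min_support.

    Brute-force enumeration for illustration; practical miners such as
    Apriori prune the search space instead of testing every combination.
    """
    n = len(records)
    items = sorted({item for record in records for item in record})
    result = {}
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            count = sum(1 for record in records if set(combo) <= record)
            support = count / n
            if support >= min_support:
                result[combo] = support
    return result


def confidence(records, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    ante = set(antecedent)
    both = ante | set(consequent)
    n_ante = sum(1 for record in records if ante <= record)
    n_both = sum(1 for record in records if both <= record)
    return n_both / n_ante if n_ante else 0.0


if __name__ == "__main__":
    # Hypothetical toy records; each record is a set of attribute labels.
    toy = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
    print(frequent_itemsets(toy, min_support=0.5))
    print(confidence(toy, {"a"}, {"b"}))
```

Combinations that pass the support threshold are the kind of "dominant feature sets" that could then be appended to a dataset as new columns before re-training, although how the study performs that integration is not described in the abstract.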
The manual process of evaluating answer scripts is strenuous. Evaluators use the answer key to assess the answers in the answer scripts. Advancements in technology and the introduction of new learning paradigms need a...
Eye health has become a global health concern and attracted broad [...] the years, researchers have proposed many state-of-the-art convolutional neural networks (CNNs) to assist ophthalmologists in diagnosing ocular diseases efficiently and [...], most existing methods were dedicated to constructing sophisticated CNNs, inevitably ignoring the trade-off between performance and model [...] alleviate this paradox, this paper proposes a lightweight yet efficient network architecture, mixed-decomposed convolutional network (MDNet), to recognise ocular [...] MDNet, we introduce a novel mixed-decomposed depthwise convolution method, which takes advantage of depthwise convolution and depthwise dilated convolution operations to capture low-resolution and high-resolution patterns by using fewer computations and fewer [...] conduct extensive experiments on the clinical anterior segment optical coherence tomography (AS-OCT), LAG, University of California San Diego, and CIFAR-100 [...] results show our MDNet achieves a better trade-off between performance and model complexity than efficient CNNs including MobileNets and [...], our MDNet outperforms MobileNets by 2.5% in accuracy while using 22% fewer parameters and 30% fewer computations on the AS-OCT dataset.
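The abstract above attributes MDNet's savings to depthwise and depthwise dilated convolutions without giving the layer design. The sketch below does not reproduce MDNet; it only works through the general arithmetic behind such layers: the weight count of a standard convolution versus a depthwise separable one, and the receptive field covered by a dilated kernel. Bias terms are ignored, and the function names and channel sizes are hypothetical.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer: 3 x 3 kernel, 64 input channels, 128 output channels.
    print(standard_conv_params(3, 64, 128))        # 73728
    print(depthwise_separable_params(3, 64, 128))  # 8768
    print(effective_kernel_size(3, 2))             # 5
```

A dilated depthwise kernel keeps the same number of weights as its undilated counterpart while covering a wider area, which is one general reason such operations are attractive in lightweight networks.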