Anonymisation techniques have been extensively studied and widely applied to privacy-preserving data publishing. However, most existing methods ignore personal anonymity requirements. In these approaches, the microdata...
Sleep staging is the basis of sleep quality assessment. In the process of scoring each sleep stage, some automatic sleep staging models often fail to effectively capture the more accurate long-range correlation coupli...
ISBN (digital): 9798350385557
ISBN (print): 9798350385564
This study focuses on applying large models to imbalanced-data problems in text classification. Given the central role of text in web data and the negative impact of class imbalance on classifier performance, researchers have explored using large models to generate high-quality minority-class samples to improve model performance. This paper reviews technical progress in machine learning, deep learning, and large language models and their applications to text classification tasks. Although large models perform well on complex tasks thanks to their strong language understanding, traditional machine learning and deep learning methods remain popular in text classification scenarios that require fast response, owing to their simpler structure and higher computational efficiency. This study proposes a data augmentation technique inspired by SMOTE, which uses a large language model combined with a simple prompt engineering strategy to generate high-quality minority samples. The experimental results show that the proposed method significantly improves macro-averaged precision, recall, and F1 score across multiple text classification models and effectively alleviates class imbalance.
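The macro-averaged metrics mentioned above weight every class equally, which is why they are commonly reported for imbalanced data. The following is a minimal sketch of how they can be computed; the function name `macro_scores` and the toy labels are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1).

    Each class contributes equally to the average, regardless of how
    many samples it has. Hypothetical helper, not the paper's code.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        rec = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n = len(labels)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy imbalanced example: three samples of class "a", one of class "b".
precision, recall, f1 = macro_scores(["a", "a", "a", "b"], ["a", "a", "b", "b"])
print(round(precision, 4), round(recall, 4), round(f1, 4))
```

Because the minority class counts as much as the majority class in the average, a classifier that ignores the minority class is penalized heavily under these metrics.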
The collection and labeling of data is a labor-intensive task, and this has given rise to a large market for data crowdsourcing transactions. While there are many publicly available video datasets, task-specific data is still scarce and requires customized annotation services. Even with many excellent auxiliary models and tools, video annotation remains a lengthy and time-consuming task. To address these challenges, this paper provides a new and effective annotation method in which the annotator no longer just provides annotations but also acts as a reviewer of other annotators' results. The method focuses on surveillance video data and also supports additional custom tasks (e.g., action tagging, person relationship recognition, video summarization). In this paper we mainly consider the additional custom temporal action annotation task. We develop rules for filtering frames or segments that need to be re-labeled based on the temporal information of the model inference results, and rely on the correlation between target and time to determine task relevance. Tasks are assigned asynchronously to different annotators, and the annotators' abilities are profiled dynamically while annotation is in progress, so that task allocation achieves both annotation and mutual review among annotators. Our experiments show that this method can reduce costs and improve labeling accuracy.
Object detection is a hot topic in computer vision and is widely used in intelligent video surveillance, medical image analysis, and military strategy. Previous object detecti...
With the gradual development of the Internet of Things (IoT), people rely more and more on IoT devices to assist their daily lives. The permissions granted to IoT devices make it easy for them to either save or destroy our lives. This paper proposes a new scalable IoT security framework (ISDF), which uses the unforgeable physical-layer characteristics of IoT devices as the verification basis to identify illegal devices. The framework is combined with an identifier generation algorithm to generate a unique trusted identifier. In verification on more than a dozen devices covering Bluetooth, Wi-Fi, and voice, the identifiers we generate show high robustness and stability, with verification accuracy above 93%. Compared with existing solutions, our framework has the advantages of supporting multiple modalities and a high verification success rate.
Diffusion models are powerful generative models, and this capability can also be applied to discrimination. The inner activations of a pre-trained diffusion model can serve as features for discriminative tasks, namely...
In this paper, we consider the joint beamforming design for simultaneous sensing and communication in a wireless multi-user system. Different from existing works, which mostly consider a single target, we consider sensing the channel parameters of multiple targets while communicating with multiple users. The design goal is to minimize a weighted sum of the Cramér-Rao bounds (CRBs) of the target parameters subject to communication sum-rate and transmission power constraints. While the classical weighted minimum mean square error (WMMSE) and semidefinite relaxation (SDR) approaches can be used to handle the problem, we propose to reformulate it into a max-min form by leveraging the tightness of SDR and to solve it with a low-complexity first-order method. Numerical results demonstrate not only the computational efficiency of the proposed algorithm but also its effectiveness in enhancing sensing performance in practice.
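The design goal stated above can be written schematically as a constrained optimization problem. The form below is only an illustrative sketch: all symbols (beamformers $\mathbf{w}_k$ for $K$ users, weights $\alpha_m$ for $M$ targets, rate threshold $R_{\min}$, power budget $P_{\max}$) are assumed notation, and the exact signal model, CRB expressions, and constraints in the paper may differ.

```latex
\begin{aligned}
\min_{\{\mathbf{w}_k\}_{k=1}^{K}} \quad
  & \sum_{m=1}^{M} \alpha_m \,\mathrm{CRB}_m\!\left(\{\mathbf{w}_k\}\right) \\
\text{s.t.} \quad
  & \sum_{k=1}^{K} \log_2\!\left(1 + \mathrm{SINR}_k\!\left(\{\mathbf{w}_k\}\right)\right) \ge R_{\min}, \\
  & \sum_{k=1}^{K} \lVert \mathbf{w}_k \rVert_2^2 \le P_{\max}.
\end{aligned}
```

Here $\mathrm{CRB}_m$ denotes the Cramér-Rao bound on the parameter estimate of target $m$ and $\mathrm{SINR}_k$ the signal-to-interference-plus-noise ratio of user $k$; both depend on the beamformers, which is what couples the sensing objective to the communication constraint.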
ISBN (digital): 9798331513054
ISBN (print): 9798331513061
Monitoring mechanical equipment is crucial for industrial efficiency and safety. Anomalous sound detection faces challenges due to limited labeled data and varying acoustic conditions. Supervised methods require extensive labeled data, which is often scarce. Unsupervised methods, which model the distributions of normal data, are more suitable when normal data is abundant but anomalies are rare. In this paper, we present SAM-AE-GAN, an unsupervised anomaly detection model that combines self-attention, autoencoders, GANs, and spectral perturbation for data augmentation. The self-attention mechanism captures long-range dependencies, autoencoders enable efficient data representation, and GANs enhance generative performance. Spectral perturbation enhances adaptability, improving robustness and accuracy. Experimental results demonstrate the effectiveness of SAM-AE-GAN across five machine types in the 2022 Challenge TASK 2 dataset, validating its capability in complex industrial environments. This model provides an effective solution for anomaly detection and predictive maintenance, even in scenarios with limited labeled data.
ISBN (digital): 9798331513054
ISBN (print): 9798331513061
The current image-text cross-modal retrieval faces challenges due to the heterogeneous nature of different modalities. This paper proposes an improved model based on multi-scale modal distance learning to enhance the performance of coarse-grained and fine-grained image-text matching. In the image encoding part, an improved ResNet architecture is adopted, utilizing upsampling techniques to better preserve key semantic information. For text encoding, BERT and Bi-LSTM models are combined to extract shallow and deep textual features, respectively. We design a coarse-fine granularity interaction mechanism, leveraging a bidirectional stacked Transformer to achieve local similarity interaction between image regions and text words, while also using Bi-LSTM-extracted contextual features to interact with global image features for global similarity matching. Additionally, this paper introduces a novel multi-scale modal contrastive loss, which dynamically adjusts the distance between samples of different modalities, thereby improving the model's sensitivity to semantic similarity. Experimental results on the COCO 5K and Flickr30K datasets demonstrate that the proposed method outperforms existing mainstream methods in both retrieval accuracy and efficiency, achieving complementary advantages between coarse- and fine-grained features. This provides an effective solution and new research direction for cross-modal retrieval.