Reversible data hiding in encrypted domain (RDH-ED) fortifies data security and privacy safeguards while upholding the original data’s integrity and accessibility. Current research on RDH-ED focuses on 2D images, whi...
Owing to advances in deep learning, state-of-the-art masking- and mapping-based speech enhancement methods have attracted growing attention. However, traditional speech enhan...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
This study is based on the Accelerometer-Based Person-in-Bed Detection Challenge of the ICASSP 2025 Signal Processing Grand Challenge, which aims to determine bed occupancy from accelerometer signals. The task has two tracks, segmented detection of "in bed" versus "not in bed" and streaming detection, and both must cope with individual differences, posture variations, and external disturbances. We propose a spectral-temporal fusion-based feature representation method with mixup data augmentation, and adopt an Intersection over Union (IoU) loss to optimize detection accuracy. Our method achieved detection scores of 100.00% and 95.55% in the two tracks, securing first and third place, respectively.
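As a rough illustration of two ingredients named in the abstract above, the sketch below shows mixup augmentation and a soft IoU loss computed over per-frame occupancy probabilities. The function names, the `alpha` default, and the per-frame formulation are assumptions made for illustration; the abstract does not say how the authors implement either step.

```python
import random


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two (signal, label) pairs with a Beta(alpha, alpha) weight."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


def soft_iou_loss(pred, target, eps=1e-7):
    """1 - IoU, where pred/target are per-frame values in [0, 1]."""
    inter = sum(p * t for p, t in zip(pred, target))
    union = sum(p + t - p * t for p, t in zip(pred, target))
    return 1.0 - (inter + eps) / (union + eps)
```

A perfect prediction gives a loss near 0 and a completely disjoint one gives a loss near 1, so minimizing this quantity pushes the predicted "in bed" frames to overlap the labeled ones.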
CNNs (Convolutional Neural Networks) perform well on most classification tasks, but they are vulnerable when meeting adversarial *** and design of highly aggressive adversarial examples can help enhance the security and robustness of *** transferability of adversarial examples is still low in black-box ***, an adversarial example method based on probability histogram equalization, namely HE-MI-FGSM (Histogram Equalization Momentum Iterative Fast Gradient Sign Method), is *** each iteration of the adversarial example generation process, the original input image is randomly histogram equalized, and then the gradient is calculated to generate adversarial perturbations to mitigate overfitting in the adversarial *** effectiveness of the method is verified on the ImageNet *** with the advanced methods I-FGSM (Iterative Fast Gradient Sign Method) and MI-FGSM (Momentum I-FGSM), the attack success rate against adversarially trained networks increased by 27.9% and 7.7% on average, respectively.
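The iteration described above can be sketched as follows, under several assumptions: the image is a flat list of 8-bit intensities, `grad_fn` is a hypothetical caller-supplied function returning the loss gradient with respect to the input, equalization is applied to the current iterate with probability `p_eq`, and the gradient is not propagated through the equalization step. The parameter names and defaults are illustrative and are not taken from the paper.

```python
import random


def hist_equalize(pixels, levels=256):
    """Standard histogram equalization of a flat list of integer intensities."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:  # constant image: nothing to spread out
        return list(pixels)
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1)) for p in pixels]


def he_mi_fgsm(x, grad_fn, eps=8.0, steps=4, mu=1.0, p_eq=0.5, rng=random):
    """Momentum iterative FGSM with random histogram equalization of the input."""
    alpha = eps / steps                      # per-step size
    adv = [float(v) for v in x]
    momentum = [0.0] * len(x)
    for _ in range(steps):
        model_input = adv
        if rng.random() < p_eq:              # assumed: equalize with probability p_eq
            quantized = [min(255, max(0, int(round(v)))) for v in adv]
            model_input = [float(v) for v in hist_equalize(quantized)]
        grad = grad_fn(model_input)
        l1 = sum(abs(g) for g in grad) or 1.0
        # MI-FGSM momentum: accumulate the L1-normalized gradient.
        momentum = [mu * m + g / l1 for m, g in zip(momentum, grad)]
        step = [alpha * ((m > 0) - (m < 0)) for m in momentum]
        # Keep the result inside the eps-ball around x and the valid pixel range.
        adv = [min(255.0, max(0.0, min(v0 + eps, max(v0 - eps, a + s))))
               for v0, a, s in zip(x, adv, step)]
    return adv
```

Setting `p_eq=0.0` reduces this sketch to plain MI-FGSM, which makes the effect of the equalization step easy to isolate in an experiment.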
Adding subtle perturbations to an image can cause a classification model to misclassify it; such images are called adversarial examples. Adversarial examples threaten the safe use of deep neural networks, but when combined with reversible data hiding (RDH) technology, they can prevent images from being correctly identified by unauthorized models while allowing authorized models to recover the original image losslessly. This has given rise to the reversible adversarial example (RAE). However, existing RAE techniques focus on feasibility, attack success rate, and image quality, and ignore transferability and time complexity. In this paper, we optimize the data hiding structure and combine it with a data augmentation technique that flips the input image with a given probability to avoid overfitting to the dataset. While maintaining a high white-box attack success rate and the image's visual quality, the proposed method improves the transferability of reversible adversarial examples by approximately 16% and reduces the computational cost by approximately 43% compared to the state-of-the-art method. In addition, an appropriate flip probability can be selected for different application scenarios.
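The probabilistic flip mentioned above might look like the following minimal sketch. It assumes a horizontal flip on an image stored as a list of rows; the abstract does not state the flip axis, and `random_hflip` and its default `p` are hypothetical names chosen for illustration.

```python
import random


def random_hflip(image, p=0.5, rng=random):
    """Return a horizontally flipped copy of the image with probability p."""
    if rng.random() < p:
        return [row[::-1] for row in image]
    return [row[:] for row in image]
```

In an attack loop, the gradient would be computed on the output of `random_hflip` at each iteration, with `p` chosen per application scenario as the abstract suggests.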
The Area Under the ROC Curve (AUC) is a well-known metric for evaluating instance-level long-tail learning problems. In the past two decades, many AUC optimization methods have been proposed to improve model performan...
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
This study focuses on the First VoicePrivacy Attacker Challenge within the ICASSP 2025 Signal Processing Grand Challenge, which aims to develop speaker verification systems capable of determining whether two anonymized speech signals are from the same speaker. However, differences between the feature distributions of original and anonymized speech complicate this task. To address this challenge, we propose an attacker system, termed DA-SID, that combines a Data Augmentation enhanced feature representation with a Speaker Identity Difference enhanced classifier to improve verification performance. Specifically, data augmentation strategies (i.e., data fusion and SpecAugment) are used to mitigate feature distribution gaps, while probabilistic linear discriminant analysis (PLDA) is employed to further enhance speaker identity differences. Our system significantly outperforms the baseline, demonstrating effectiveness and robustness against various voice anonymization systems, and secured a top-5 ranking in the challenge.
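For readers unfamiliar with SpecAugment, the sketch below applies one frequency mask and one time mask to a spectrogram. It assumes the spectrogram is a list of frequency rows (`spec[f][t]`) and that masked cells are set to zero; the mask widths and the function name are illustrative and do not reflect the settings used by the authors.

```python
import random


def spec_augment(spec, max_freq_width, max_time_width, rng=random):
    """Zero out one random band of frequency rows and one band of time columns."""
    n_freq, n_time = len(spec), len(spec[0])
    out = [row[:] for row in spec]           # leave the input untouched

    # Frequency mask: f consecutive rows starting at f0.
    f = rng.randint(0, min(max_freq_width, n_freq))
    f0 = rng.randint(0, n_freq - f)
    for i in range(f0, f0 + f):
        out[i] = [0.0] * n_time

    # Time mask: t consecutive columns starting at t0.
    t = rng.randint(0, min(max_time_width, n_time))
    t0 = rng.randint(0, n_time - t)
    for row in out:
        for j in range(t0, t0 + t):
            row[j] = 0.0
    return out
```

Because the masks are drawn afresh on every call, the same utterance yields a different masked view each epoch, which is what makes the technique useful as augmentation.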
The transformational and spatial proximities are important cues for identifying inliers from an appearance based match set because correct matches generally stay close in input images and share similar local transform...
Relation extraction is an important task in natural language processing. Existing relation extraction methods usually use data augmentation to construct positive and negative samples for contrastive learning. However, previous works neither select positive and negative samples accurately nor make full use of relation labels. To overcome these defects, we propose a Labels Contrastive Learning framework (LabelsCL) for supervised relation extraction that considers both global and local perspectives and makes full use of labels in its design. Specifically, from a global perspective, we use labels and cosine similarity to select the most easily misclassified samples as positive and negative examples for training. From a local perspective, positive and negative samples are selected according to batch and label, which increases the randomness of sample selection. By making full use of labels and combining the two perspectives, LabelsCL increases the tolerance of the framework. Finally, experimental results demonstrate the effectiveness of the proposed framework, which significantly improves on the baselines on the SemEval-2010 Task 8 dataset.
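One plausible reading of the global selection step is sketched below: for a given anchor, the hardest positive is the same-label sample with the lowest cosine similarity, and the hardest negative is the different-label sample with the highest cosine similarity. This interpretation, the function names, and the toy embeddings are assumptions; the abstract does not give the exact selection rule.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def select_hard_samples(anchor, embeddings, labels):
    """Return (hard_positive_index, hard_negative_index) for the anchor index."""
    hard_pos, hard_neg = None, None
    pos_sim, neg_sim = math.inf, -math.inf
    for i, (emb, lab) in enumerate(zip(embeddings, labels)):
        if i == anchor:
            continue
        sim = cosine(embeddings[anchor], emb)
        if lab == labels[anchor]:
            if sim < pos_sim:                # same label but far away
                hard_pos, pos_sim = i, sim
        elif sim > neg_sim:                  # different label but close
            hard_neg, neg_sim = i, sim
    return hard_pos, hard_neg


embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.2], [-1.0, 0.0]]
labels = [0, 0, 0, 1, 1]
pair = select_hard_samples(0, embeddings, labels)
```

Here `pair` picks sample 2 as the positive (same label, orthogonal to the anchor) and sample 3 as the negative (different label, nearly parallel to the anchor), which are the two cases a classifier would most likely confuse.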
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
Microphone array techniques are widely used in sound source localization and smart city acoustic-based traffic monitoring, but these applications face significant challenges due to the scarcity of labeled real-world traffic audio data and the complexity and diversity of application scenarios. The DCASE Challenge’s Task 10 focuses on using multi-channel audio signals to count vehicles (cars or commercial vehicles) and identify their directions (left-to-right or vice versa). In this paper, we propose a graph-enhanced dual-stream feature fusion network (GEDF-Net) for acoustic traffic monitoring, which simultaneously considers vehicle type and direction to improve detection. Its fusion strategy consists of a vehicle type feature extraction (VTFE) branch, a vehicle direction feature extraction (VDFE) branch, and a frame-level feature fusion module that combines the type and direction features. A pre-trained model (PANNs) is used in the VTFE branch to mitigate data scarcity and enhance the type features, followed by a graph attention mechanism that exploits temporal relationships and highlights important audio events within these features. The frame-level fusion of direction and type features enables fine-grained feature representation, resulting in better detection performance. Experiments demonstrate the effectiveness of the proposed method. Our GEDF-Net submission achieved 1st place in the DCASE 2024 Challenge Task 10.