Crowd counting is usually handled in a density map regression fashion, which is supervised via an L2 loss between the predicted density map and ground truth. To effectively regulate models, various improved L2 loss functions have been developed to find a better correspondence between predicted density and annotation positions. In this paper, we propose to predict the density map at one resolution but measure its quality via a derived log-formed loss at multiple resolutions. Unlike existing methods that assume density maps at different resolutions are independent, our loss is obtained by modeling the likelihood function inspired by the relationship of density maps across multi-resolutions. We find that the traditional single-resolution L2 loss is a particular case of our derived log-likelihood. We mathematically prove it is superior to a single-resolution L2 loss. Without bells and whistles, the proposed loss substantially improves several baselines and performs favorably compared to state-of-the-art methods on five crowd counting datasets: NWPU-Crowd, ShanghaiTech A & B, UCF-QNRF, and JHU-Crowd++. The source code and trained models are released at https://***/streamer-AP/PML_***.
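To make the idea of scoring one predicted density map at several resolutions concrete, here is a minimal sketch. It is not the paper's derived log-likelihood loss: it simply adds up plain L2 losses over sum-pooled copies of the map, which treats the resolutions as independent (the assumption the abstract says existing methods make). The function names `sum_pool`, `l2_loss`, and `multi_resolution_l2`, and the pooling factors, are hypothetical choices made for illustration only.

```python
def sum_pool(density, factor):
    """Downsample a 2-D density map by summing non-overlapping factor x factor blocks.

    Summing (rather than averaging) preserves the total count of the map.
    """
    rows, cols = len(density), len(density[0])
    if rows % factor or cols % factor:
        raise ValueError("map size must be divisible by the pooling factor")
    return [
        [
            sum(density[r + dr][c + dc] for dr in range(factor) for dc in range(factor))
            for c in range(0, cols, factor)
        ]
        for r in range(0, rows, factor)
    ]


def l2_loss(pred, target):
    """Sum of squared differences between two equally sized 2-D maps."""
    return sum((p - t) ** 2 for pr, tr in zip(pred, target) for p, t in zip(pr, tr))


def multi_resolution_l2(pred, target, factors=(1, 2, 4)):
    """Illustrative only: add up L2 losses measured at several resolutions.

    With factors=(1,) this is the ordinary single-resolution L2 loss.
    """
    return sum(l2_loss(sum_pool(pred, f), sum_pool(target, f)) for f in factors)


# Toy 4x4 maps: the prediction places one person one pixel away from the annotation.
pred = [[0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 1.0]]
target = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0, 1.0]]

single = multi_resolution_l2(pred, target, factors=(1,))
coarse_only = l2_loss(sum_pool(pred, 2), sum_pool(target, 2))
multi = multi_resolution_l2(pred, target, factors=(1, 2, 4))
```

In this toy case the one-pixel shift is penalized at full resolution but vanishes once the maps are pooled 2x2, which is the kind of cross-resolution relationship a multi-resolution loss can exploit.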
High-quality X-rays are now available to diagnose lung diseases with the help of radiologists. However, the diagnostic process is time consuming and depends on specialist availability in medical institutions. Patient ...
Authors: Shen, Haojing; Chen, Sihong; Wang, Ran; Wang, Xizhao
Affiliations:
Shenzhen Univ, Big Data Inst, Coll Comp Sci & Software Engn, Guangdong Key Lab Intelligent Informat Proc, Shenzhen 518060, Guangdong, Peoples R China
Shenzhen Univ, Coll Math & Stat, Shenzhen 518060, Peoples R China
Shenzhen Univ, Shenzhen Key Lab Adv Machine Learning & Applicat, Shenzhen 518060, Peoples R China
It is necessary to improve the performance of some special classes or to particularly protect them from attacks in adversarial learning. This article proposes a framework that combines cost-sensitive classification and adversarial learning to train a model that can distinguish between the protected and unprotected classes, such that the protected classes are less vulnerable to adversarial examples. Within this framework, we observe an interesting phenomenon during the training of deep neural networks, called the Min-Max property: the absolute values of most parameters in the convolutional layer approach 0, while the absolute values of a few parameters grow significantly larger. Based on this Min-Max property, which is formulated and analyzed from the viewpoint of random distributions, we further build a new defense model against adversarial examples to improve adversarial robustness. An advantage of the built model is that it performs better than the standard one and can be combined with adversarial training to achieve improved performance. Experiments confirm that, regarding the average accuracy over all classes, our model is almost the same as the existing models when no attack occurs and is better than the existing models when an attack occurs. Specifically, regarding the accuracy of protected classes, the proposed model is much better than the existing models when an attack occurs.
Due to the complex functioning of Smart Healthcare Systems (SHS), many security concerns have been raised in the past. This complexity allows attackers to hamper the working of SHS in a variety of ways, e.g., injecting false data to replace vital signs or tampering with medical devices to prevent them from reporting critical situations. In this work, a novel ML-based framework, SmartHealth, is proposed to secure IoMT devices in SHS. SmartHealth monitors the vital signs gathered through different IoMT devices and analyzes changes in various body activities to differentiate between normal activities and dangerous security attacks. The performance of SmartHealth is also analyzed for three different dangerous attacks. The performance analysis shows that SmartHealth identifies malicious activities in IoMT correctly 92% of the time, with an F1-score of 90%. (c) 2023 The Authors. Published by Elsevier B.V. on behalf of The Korean Institute of Communications and Information Sciences. This is an open access article under the CC BY license (http://***/licenses/by/4.0/).
In this article, we study the problem of embedding temporal attributed networks, the goal of which is to learn dynamic low-dimensional representations over time. Existing temporal network embedding methods only learn representations for nodes and are therefore unable to capture the dynamic affinities between nodes and attributes. Moreover, existing co-embedding methods that learn static embeddings of both nodes and attributes cannot be naturally used to obtain dynamic embeddings for temporal attributed networks. To address these issues, we propose the dynamic co-embedding model for temporal attributed networks (DCTANs), based on the dynamic stochastic state-space framework. Our model captures the dynamics of a temporal attributed network by modeling the abstract belief states that represent the condition of the nodes and attributes at the current time step, and by predicting the transitions between the temporal abstract states of two successive time steps. Our model is able to learn embeddings for both nodes and attributes based on their belief states at each time step of the temporal attributed network, while the state transition tendency for predicting the future network can be tracked and the affinities between nodes and attributes can be preserved. Experimental results on real-world networks demonstrate that our model achieves substantial performance gains in several static and dynamic graph mining applications compared with state-of-the-art static and dynamic models.
Textbook question answering (TQA) is a task in which one must answer non-diagram and diagram questions accurately, given a large context consisting of abundant diagrams and essays. Although many studies have made significant progress in natural image question answering (QA), they are not applicable to comprehending diagrams and reasoning over a long multimodal context. To address these issues, we propose a relation-aware fine-grained reasoning (RAFR) network that performs fine-grained reasoning over the nodes of relation-based diagram graphs. Our method uses semantic dependencies and relative positions between nodes in the diagram to construct relation graphs and applies graph attention networks to learn diagram representations. To extract and reason over the multimodal knowledge, we first extract the text most relevant to the questions and options and the instructional diagram most relevant to the question diagrams, at the word-sentence level and the node-diagram level, respectively. Then, we apply instructional-diagram-guided attention and question-guided attention to reason over the nodes of question diagrams. The experimental results show that our proposed method achieves the best performance on the TQA dataset compared with baselines. We also conduct extensive ablation studies to comprehensively analyze the proposed method.
Transformer-based and interaction point-based methods have demonstrated promising performance and potential in human-object interaction (HOI) detection. However, due to differences in structure and properties, direct integration of these two types of models is not feasible. Recent Transformer-based methods divide the decoder into two branches: an instance decoder for human-object pair detection and a classification decoder for interaction recognition. While the attention mechanism within the Transformer enhances the connection between localization and classification, this paper focuses on further improving HOI detection performance by increasing the intrinsic correlation between instance and action features. To address these challenges, this paper proposes a novel Transformer-based HOI detection framework. In the proposed method, the decoder contains three parts: a learnable query generator, an instance decoder, and an interaction classifier. The learnable query generator aims to build an effective query that guides the instance decoder and interaction classifier to learn more accurate instance and interaction features. These features are then used to update the query generator for the next layer. In particular, inspired by interaction point-based HOI and object detection methods, this paper introduces prior bounding boxes, keypoint detection, and spatial relation features to build the novel learnable query generator. Finally, the proposed method is verified on the HICO-DET and V-COCO datasets. The experimental results show that the proposed method performs better than the state-of-the-art methods.
The Internet of Things (IoT) is connecting more devices every day. Security is critical to ensure that the devices operate in a trusted environment. The lack of proper IoT security encourages cybercriminals to target many smart devices across the network and gain sensitive information. Distributed Denial of Service (DDoS) attacks are common in the IoT infrastructure and involve hijacking IoT devices to consume resources and interrupt services. Such attacks may specifically disrupt the application running the service that the end users are trying to access (application layer DDoS attacks) or flood the network bandwidth, leading to network failure (software defined network (SDN) DDoS attacks). This article proposes a hybrid attention-based bidirectional long short-term memory (LSTM) with convolutional neural networks (CNN) to identify DDoS attacks in the application layer and SDN. We deploy several other machine learning models, such as logistic regression, decision trees, random forests, support vector machines, K-nearest neighbors, extreme gradient boosting, artificial neural networks, CNN, LSTM, and CNN-LSTM, to evaluate the performance of our proposed model. The evaluation metrics considered for the study are accuracy, precision, recall, and F1 score. The experimental analysis on multiple datasets shows that the proposed model performs the classification efficiently, with accuracies of 99.74% and 99.98%.
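For readers unfamiliar with the four evaluation metrics named above, the following minimal sketch computes them for a binary attack/benign classifier. The labels and predictions are made-up toy values, not data from the study, and the helper name `binary_metrics` is hypothetical.

```python
def binary_metrics(y_true, y_pred):
    """Return accuracy, precision, recall, and F1 for binary labels (1 = attack, 0 = benign)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks correctly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign traffic correctly passed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # benign traffic wrongly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example: 8 flows, 4 of which are attacks.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Here the classifier makes one false negative and one false positive, so all four metrics come out to 0.75. On heavily imbalanced traffic, accuracy alone can look high while recall is poor, which is one reason to report all four.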
ISBN (digital): 9798331516826
ISBN (print): 9798331516833
Accent conversion (AC) aims to alter the accent of spoken language while preserving the original content and speaker characteristics. While any accent can be selected as a target, foreign accent conversion (FAC), which focuses on L2 speakers, is particularly noteworthy due to its wide-ranging applications. Compared to general voice conversion tasks, which focus on speaker conversion, research on accent conversion is relatively scarce, and the audio quality is often limited. In this article, we introduce a diffusion decoder into the conventional TTS-guided accent conversion framework and propose a phoneme-level acoustic-linguistic alignment strategy. Subjective evaluations on Chinese-accented source speech confirm that the proposed method outperforms the baseline in terms of speech naturalness, accentedness, and speaker similarity.
Audio samples are available at ***.
Despite the rapid development of sequencing technology, single-nucleotide polymorphism (SNP) arrays are still the most cost-effective genotyping solutions for large-scale genomic research and applications. Recent years have witnessed the rapid development of numerous genotyping platforms of different sizes and designs, but population-specific platforms are still lacking, especially for those in developing countries. SNP arrays designed for these countries should be cost-effective (small size), yet incorporate key information needed to associate genotypes with traits. A key design principle for most current platforms is to improve genome-wide imputation so that more SNPs not included in the array (imputed SNPs) can be predicted. However, current tag SNP selection methods mostly focus on imputation accuracy and coverage, but not the functional content of the array. It is those functional SNPs that are most likely associated with traits. Here, we propose LmTag, a novel method for tag SNP selection that not only improves imputation performance but also prioritizes highly functional SNP markers. We apply LmTag on a wide range of populations using both public and in-house whole-genome sequencing databases. Our results show that LmTag improved both functional marker prioritization and genome-wide imputation accuracy compared to existing methods. This novel approach could contribute to the next generation genotyping arrays that provide excellent imputation capability as well as facilitate array-based functional genetic studies. Such arrays are particularly suitable for under-represented populations in developing countries or non-model species, where little genomics data are available while investment in genome sequencing or high-density SNP arrays is limited. LmTag is available at: https://***/datngu/LmTag.