Object detection and recognition in unmanned aerial vehicle-based images is critical for various applications but is often challenged by complex backgrounds, diverse object scales, densely clustered small objects, and uneven object distributions. This paper introduces a novel deep learning-based framework that integrates the Multiscale Self-Attention Guidance and Feature Fusion Network with the You Only Look Once model, tailored explicitly for unmanned aerial vehicle-based infrared thermal image analysis. The proposed methodology makes four key changes to the You Only Look Once architecture to enhance object detection performance. First, the Multi-Head Self-Attention Transformer module combines global and local information, enabling precise object localization while mitigating the influence of complex backgrounds. Second, the Multiscale parallel Sampling Feature Fusion module optimizes the fusion of multiscale features. Third, fine-grained shallow feature maps are integrated into the fusion process to detect densely packed small objects accurately. Lastly, the Inverse-Residual Feature Enhancement module, positioned before the detection head, enhances feature extraction for small objects. Experimental evaluations on the High Altitude Infrared Thermal Unmanned Aerial Vehicle dataset demonstrate significant improvements, achieving a Mean Average Precision of 95.1%, Recall of 92.0%, and F1-Score of 91.0%. The framework's robustness is further validated on the Wildland-fire Infrared Thermal Unmanned Aerial System dataset, where it achieves a Mean Average Precision of 82.1%, Recall of 88.0%, and F1-Score of 82.0%. Comparative analyses with state-of-the-art methods confirm its superiority, and the framework offers a scalable artificial intelligence-driven solution for unmanned aerial vehicle applications, advancing object detection capabilities in critical scenarios.
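The Recall and F1-Score figures quoted in this abstract are tied together by the standard definition of F1 as the harmonic mean of precision and recall. The following is a minimal sketch of that relationship, not code from the paper. It assumes that each Recall and F1-Score pair was measured at the same operating point, which the abstract does not state, and the function names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and R."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("no valid precision exists for this F1/recall pair")
    return f1 * recall / denominator


if __name__ == "__main__":
    # (F1, Recall) pairs as quoted in the abstract, expressed as fractions.
    reported = [("first dataset", 0.91, 0.92), ("second dataset", 0.82, 0.88)]
    for name, f1, recall in reported:
        p = implied_precision(f1, recall)
        print(f"{name}: implied precision = {p:.3f}")
```

Under that assumption, the precision implied by the reported values is a back-calculation only; it is not a figure reported by the authors.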
Data exploration is increasingly relevant to the average person in our data-driven world, as data is now often open source and available to the general public and other non-expert users via open data portals and similar data sources. This has created a need for data exploration tools, methods, and techniques that engage non-expert users, and thus a proliferation of new research in the field of Human-Computer Interaction (HCI) on engaging non-expert audiences with data. In particular, data exploration that contains a data visualization component can be useful for making data understandable and engaging for non-expert audiences. Currently, the range of design practices most commonly used in HCI to engage non-expert audiences in data exploration that includes a visualization component has yet to be formalized or given a comprehensive overview. This paper is a systematic mapping study (SMS) that aims to fill that gap by analyzing design trends for engaging non-expert audiences in visualization-driven data exploration via interactive systems. It provides an overview of existing design practices and engagement methods, as well as a set of three recommendations for how future designers can best engage non-expert audiences in visualization-driven data exploration.
The elasticity under varying temperatures and pressures is particularly significant for understanding mechanical properties and structural phase transitions. Consequently, there is an increasing demand for tools capab...
With the rise of multi-modal large language models, accurately extracting and understanding textual information from video content, referred to as video-based optical character recognition (Video OCR), has become a cruc...
Split learning is a neural network training approach that can overcome the limitations of traditional deep neural networks in edge artificial intelligence environments. It offers the advantage of privacy protection because it transmits intermediate features calculated by the client-side model, so the client does not need to send the original input data to the server. However, concerns remain regarding data privacy leakage because an attacker can still attempt model inversion attacks based on the intermediate features. We identify several shortcomings of existing defense techniques against such attacks and present a new defense approach called TrapMI. The proposed method can induce an attacker to generate a class-specific target image that appears different from the original image when inverting the input image. We analyze its performance through quantitative and qualitative evaluations. Furthermore, the AutoGenerator is proposed to overcome the problem that the client cannot perform modulation that requires the target image, because the class of the input image is unknown during the inference phase. With this approach, de-identified images are automatically modulated in the inference phase. The proposed method was evaluated on two datasets, three classification models, and three split points. Its resistance was measured using a deeper and stronger inverse model than those in previous studies. Overall, the proposed method ensures data privacy protection at a significantly higher level while maintaining task performance similar to that of existing defense technologies.
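The client/server division described in this abstract can be illustrated with a toy forward pass. This is a minimal sketch of generic split learning only; it does not implement TrapMI or the AutoGenerator, whose details the abstract does not give. The layer sizes, the random weights, and all function names are assumptions made for illustration.

```python
import random


def linear(weights, bias, x):
    """Apply a dense layer; weights is a list of rows, one per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def make_layer(n_out, n_in, rng):
    """Create a randomly initialised dense layer (hypothetical weights)."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


rng = random.Random(0)
client_layer = make_layer(3, 4, rng)  # stays on the client device
server_layer = make_layer(2, 3, rng)  # stays on the server


def client_forward(raw_input):
    """Runs on the client; returns only the intermediate features."""
    return relu(linear(*client_layer, raw_input))


def server_forward(features):
    """Runs on the server; it receives the features, never raw_input."""
    return linear(*server_layer, features)


raw_input = [0.5, -1.2, 3.3, 0.7]      # private data, never transmitted
features = client_forward(raw_input)   # the only values sent to the server
logits = server_forward(features)      # the server completes the prediction
```

In this picture, `features` is the value that crosses the network, and it is also what a model inversion attacker would try to map back to `raw_input`.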
Background: Drug-target binding affinity (DTA) prediction can accelerate the drug screening process, and deep learning techniques have been used in all facets of drug research. Affinity prediction based on deep learni...
Large language models signify a pivotal advancement in general artificial intelligence, exhibiting capabilities that exceed human performance in diverse tasks. Nevertheless, these models often lack expertise in specia...
The permissioned blockchain is one of the core technologies for Web3.0. However, the leakage of transactional relationships on the blockchain has become a critical threat to the interests of users. To prevent malicious analysis of the sending and receiving addresses of a series of transactions, academia and industry have recently put much effort into transactional relationship protection (TRP) in blockchain. However, most current TRP methods are designed for particular fungible cryptocurrencies and are therefore limited in the asset types and scenarios they support. This paper proposes a TRP-enabled permissioned blockchain framework. First, the framework introduces a ledger structure comprising two distinct types of blocks. The basic block publicly contains the verifiable structure of the transactions, while the transaction block privately contains their content within a selected committee. Second, to prevent the committee from analysing the relationships in transaction blocks, the framework includes a confidential transaction replication mechanism that splits the related transactions and replicates them to different committees. Furthermore, we optimize the framework via quantitative analysis to minimize the required replication size per transaction, thus enabling the framework to achieve enhanced privacy and scalability. Theoretical analysis and experimental results on datasets demonstrate that the framework hides the relationships with a probability of more than 95% and maintains 10 times the throughput of the blockchain without our method.
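The idea of splitting related transactions across committees can be pictured with a simple placement rule. The abstract does not describe the actual replication mechanism or its replication-size optimization, so the round-robin policy below is a stand-in assumption, not the authors' method. It also places each transaction on a single committee rather than modelling replication, and every name in it is hypothetical.

```python
from collections import defaultdict


def assign_to_committees(related_tx_ids, num_committees):
    """Round-robin placement (an assumed policy, not the paper's).

    With at least two committees, consecutive transactions in the
    related series never land on the same committee.
    """
    if num_committees < 2:
        raise ValueError("at least two committees are needed to split a series")
    placement = defaultdict(list)
    for index, tx_id in enumerate(related_tx_ids):
        placement[index % num_committees].append(tx_id)
    return dict(placement)


def linked_pairs_visible(related_tx_ids, placement):
    """Count consecutive transaction pairs held by one and the same committee."""
    committee_of = {tx: c for c, txs in placement.items() for tx in txs}
    return sum(
        1
        for a, b in zip(related_tx_ids, related_tx_ids[1:])
        if committee_of[a] == committee_of[b]
    )


series = ["tx1", "tx2", "tx3", "tx4", "tx5"]  # hypothetical related transactions
placement = assign_to_committees(series, num_committees=3)
visible = linked_pairs_visible(series, placement)
```

Here `visible` counts how many directly linked pairs a single committee could observe on its own, which is the quantity a relationship-hiding placement tries to keep low.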
Video object segmentation aims to extract 2D object masks by segmenting video frames into multiple objects, which is crucial in various practical applications such as medical imaging. However, traditional video ...
The software testing process is an essential phase in the software life cycle. However, the vast size of code and the ambiguity present in defect reports often pose challenges in identifying defects within the source ...