This research investigates whether spiking neural networks (SNNs) can be combined with deep learning methods to detect occlusions in virtual reality (VR) and augmented reality (AR) images. We first review the fundamentals of SNNs and the benefits they offer, such as event-driven processing and low power consumption, both of which are critical for real-time AR and VR systems. We then describe our occlusion recognition system, which combines deep learning with SNNs. Using both synthetic and real-world AR and VR datasets, we conduct experiments to evaluate the method, and the results show a substantial improvement in occlusion recognition accuracy over previous approaches. We also assess the system's computational performance and resource requirements, demonstrating that it can be deployed on resource-constrained AR and VR devices. In conclusion, this research shows that spiking neural networks and deep learning methods can make occlusion detection in AR/VR images more effective. By addressing this major problem, our method improves AR and VR experiences and opens up new possibilities in areas such as education, training, simulation, and gaming.
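The abstract highlights event-driven processing as the key property of SNNs. As a concrete illustration (not taken from the paper), the sketch below implements a discrete-time leaky integrate-and-fire (LIF) neuron layer in PyTorch, the basic event-driven unit most SNN pipelines build on; all names and constants are illustrative.

```python
# Hypothetical sketch of the event-driven building block the abstract alludes
# to: a leaky integrate-and-fire (LIF) neuron layer. Constants are illustrative.
import torch

def lif_step(v, spikes_in, w, tau=0.9, v_th=1.0):
    """One discrete time step of a LIF layer.

    v         : membrane potentials, shape (batch, out_features)
    spikes_in : binary input spikes,  shape (batch, in_features)
    w         : synaptic weights,     shape (in_features, out_features)
    """
    v = tau * v + spikes_in @ w          # leak, then integrate weighted spikes
    spikes_out = (v >= v_th).float()     # fire where the threshold is crossed
    v = v * (1.0 - spikes_out)           # hard reset of fired neurons
    return v, spikes_out
```

Because computation happens only where spikes arrive, layers like this can run on sparse, asynchronous event streams, which is the source of the low power consumption the abstract emphasizes.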
For artificial intelligence applications in transmission electron microscopy (TEM), hardware and computational constraints often obstruct real-time data processing, inflating operational costs, consuming valuable instrument time, and heightening the risk of damage to beam-sensitive specimens, thereby complicating reliable data interpretation. To address these issues, we propose a two-stage pruning strategy that reduces deep-learning model size and computational overhead while preserving high performance and generalization across diverse datasets. Unlike conventional pruning techniques, which typically rely solely on weight magnitude and risk overlooking critical variability and directional properties in weight vectors, our approach first removes filters with low magnitude and insufficient variability, and then prunes filters with high linear similarity to eliminate redundancy. This one-shot pruning process, followed by fine-tuning, minimizes accuracy loss and lowers the barriers to deep-learning integration in TEM workflows. Our method expedites TEM analysis, enabling more efficient, real-time, and cost-effective materials characterization. Additionally, this work lays a foundation for investigating the approach's broader applicability to different architectures and tasks, particularly in resource-constrained environments where both model size and computational efficiency are critical.
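The two-stage criterion described above (magnitude plus variability, then linear similarity) lends itself to a compact sketch. The following PyTorch snippet is a minimal, assumed reading of that procedure for a single convolutional layer; the thresholds and the use of cosine similarity are our assumptions, not the paper's exact settings.

```python
# A minimal sketch of two-stage filter pruning for one conv layer.
# Thresholds and the similarity measure are assumptions, not the paper's.
import torch
import torch.nn.functional as F

def two_stage_prune(conv_weight, mag_th=0.1, var_th=0.01, sim_th=0.95):
    """conv_weight: (out_channels, in_channels, kH, kW)."""
    filters = conv_weight.flatten(1)               # one row per filter
    # Stage 1: drop filters with both low L1 magnitude and low variability.
    magnitude = filters.abs().mean(dim=1)
    variability = filters.std(dim=1)
    keep = (magnitude > mag_th) | (variability > var_th)
    idx = torch.nonzero(keep).squeeze(1).tolist()
    # Stage 2: among survivors, drop filters highly similar to a kept one.
    kept = []
    for i in idx:
        f = F.normalize(filters[i], dim=0)
        redundant = any(
            torch.dot(f, F.normalize(filters[j], dim=0)) > sim_th
            for j in kept
        )
        if not redundant:
            kept.append(i)
    return kept   # indices of filters to retain before one-shot fine-tuning
```

After pruning, the retained filters are copied into a smaller layer and the whole network is fine-tuned once, matching the one-shot process the abstract describes.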
With the advancement of technology, automatic plant leaf disease detection has received considerable attention from researchers working in the area of precision agriculture. A number of deep learning-based methods have been introduced in the literature for automated plant disease detection. However, the majority of datasets collected from real fields have blurred background information, data imbalance, limited generalization, and tiny lesion features, which may lead to over-fitting of the model. Moreover, the large parameter size of deep learning models is also a concern, especially for agricultural applications with limited resources. In this paper, a novel ClGan (Crop Leaf Gan) with an improved loss function has been developed with a reduced number of parameters compared to existing state-of-the-art methods. The generator and discriminator of ClGan incorporate an encoder-decoder network to avoid the vanishing gradient problem, training instability, and non-convergence failure, while preserving complex details during synthetic image generation with significant lesion differentiation. The proposed improved loss function introduces a dynamic correction factor that stabilizes learning while maintaining effective weight optimization. In addition, a novel plant leaf classification method, ClGanNet, has been introduced to classify plant diseases efficiently. The efficiency of the proposed ClGan was validated on a maize leaf dataset in terms of parameter count and FID score, and the results were compared against five state-of-the-art GAN models, namely DC-GAN, W-GAN, WGanGP, InfoGan, and LeafGan. Moreover, the performance of the proposed classifier, ClGanNet, was evaluated against seven state-of-the-art methods on eight parameters using the original, basic augmented, and ClGan-augmented datasets. Experimental results show that ClGanNet outperformed all the considered methods, with 99.97% training and 99.04% testing accuracy.
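The abstract does not define the dynamic correction factor, so the following is only a speculative illustration of the general idea: scale the generator's loss by a factor that shrinks when the discriminator dominates, which tends to stabilize adversarial training. Every name and formula here is hypothetical, not ClGan's actual loss.

```python
# Speculative sketch of a "dynamic correction factor" on a GAN generator loss.
# This is NOT the paper's formulation, only one plausible instance of the idea.
import torch
import torch.nn.functional as F

def corrected_g_loss(d_fake_logits, d_real_logits, eps=1e-6):
    # Vanilla non-saturating generator loss.
    g_loss = F.binary_cross_entropy_with_logits(
        d_fake_logits, torch.ones_like(d_fake_logits))
    # Hypothetical dynamic correction: measure how far the discriminator's
    # confidence on real samples exceeds that on fakes, and damp the
    # generator update when that gap is extreme.
    with torch.no_grad():
        gap = (torch.sigmoid(d_real_logits).mean()
               - torch.sigmoid(d_fake_logits).mean()).clamp(min=0.0)
        factor = 1.0 / (1.0 + gap + eps)
    return factor * g_loss
```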
Monocular depth estimation has emerged as a critical component across a variety of outdoor applications such as robotics, augmented reality, autonomous driving, and 3D reconstruction. Mainstream monocular depth estimation methods consistently face challenges in applications requiring real-time performance, as their considerable computational complexity results in poor runtime performance. Here, we propose an innovative processing module named MDE-Lite and, based on it, develop a lightweight yet effective depth estimation network named MBUDepthNet. We also build a training scheme with multiple loss functions. Experimental validation on the KITTI dataset demonstrates that our method not only rivals mainstream methods in accuracy but also exhibits superior computational efficiency: compared to a ResNet-18-based baseline, it achieves a 22% higher frame rate.
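The paper's specific loss functions are not listed in the abstract; as an illustration of a multi-loss training scheme for monocular depth estimation, the sketch below combines two widely used terms, a scale-invariant log error and an edge-aware smoothness penalty. Both the choice of terms and the weights are assumptions.

```python
# Illustrative multi-loss scheme for depth estimation (terms and weights are
# assumptions, not MBUDepthNet's actual training objective).
import torch

def si_log_loss(pred, gt, lam=0.85, eps=1e-6):
    # Scale-invariant log error (Eigen et al. style).
    d = torch.log(pred + eps) - torch.log(gt + eps)
    return (d ** 2).mean() - lam * d.mean() ** 2

def smoothness_loss(depth, image):
    # Penalize depth gradients, less so where the image itself has edges.
    dz_x = (depth[..., :, 1:] - depth[..., :, :-1]).abs()
    dz_y = (depth[..., 1:, :] - depth[..., :-1, :]).abs()
    wi_x = torch.exp(-(image[..., :, 1:] - image[..., :, :-1]).abs().mean(1, keepdim=True))
    wi_y = torch.exp(-(image[..., 1:, :] - image[..., :-1, :]).abs().mean(1, keepdim=True))
    return (dz_x * wi_x).mean() + (dz_y * wi_y).mean()

def total_loss(pred, gt, image, w_smooth=0.001):
    # pred, gt: (B, 1, H, W) depth maps; image: (B, 3, H, W) RGB input.
    return si_log_loss(pred, gt) + w_smooth * smoothness_loss(pred, image)
```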
This research presents a novel approach for plant disease identification utilizing convolutional neural networks (CNNs) and the PYNQ FPGA platform. The study leverages the parallel processing capabilities of FPGAs to accelerate CNN inference, aiming to enhance the efficiency of plant disease detection in agricultural settings. The implementation involves optimizing the CNN architecture for deployment on the PYNQ FPGA, considering factors such as image size and learning rate. Through experimentation, the research refines hyperparameters, achieving improved accuracy and F1 scores. Visualizations using heat maps highlight the CNN's reliance on color, shape, and texture for feature extraction in disease identification. The integration of FPGA technology demonstrates promising advances in real-time, high-performance plant disease classification, offering potential benefits for precision agriculture and crop management. This research contributes to the growing field of FPGA-accelerated deep-learning applications in agricultural technology, addressing challenges in plant health monitoring and fostering sustainable agricultural practices.
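The heat-map visualizations described above are consistent with a Grad-CAM-style analysis. The sketch below shows that generic technique in PyTorch; the model, target layer, and class index are placeholders rather than the paper's implementation.

```python
# Generic Grad-CAM sketch: weight a conv layer's activations by the
# spatially averaged gradients of the target class score.
import torch

def grad_cam(model, image, target_layer, class_idx):
    acts, grads = {}, {}
    h1 = target_layer.register_forward_hook(
        lambda m, i, o: acts.update(a=o))
    h2 = target_layer.register_full_backward_hook(
        lambda m, gi, go: grads.update(g=go[0]))
    logits = model(image.unsqueeze(0))        # image: (C, H, W)
    logits[0, class_idx].backward()           # gradients of the class score
    h1.remove(); h2.remove()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # GAP over gradients
    cam = torch.relu((weights * acts["a"]).sum(dim=1))   # weighted activations
    return cam / (cam.max() + 1e-6)                      # normalized heat map
```

Upsampled to the input resolution and overlaid on the leaf image, such a map indicates which regions (lesion color, shape, texture) drove the prediction.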
While deep reinforcement learning (DRL) models are effective at learning appropriate actions from high-dimensional data, they require large amounts of costly and time-consuming training data to be collected in real-world settings. Collecting data in simulation therefore offers a promising alternative, but transferring policy networks from simulation to reality can be challenging due to differences in perception between the virtual and real worlds. This paper proposes a two-level method to bridge the simulation-to-reality (sim-to-real) gap for depth images, specifically for autonomous environmental navigation that uses DRL. Simulated depth images are first translated at the perception level through a generative adversarial network (GAN) to make them resemble real data from a depth sensor. Simulated and GAN-generated depth images are then encoded into latent representations, and the encoder is trained so that the two representations are paired in the latent space. This encoder is trained simultaneously with the reinforcement learning network to extract domain-invariant, task-relevant features from depth images and to map the behavioral similarity of states into the latent space. Our experimental results demonstrate that our approach can effectively bridge the sim-to-real gap, enabling policies learned in simulation to maintain their control performance in the real world.
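The core of the latent-space step is a pairing objective: a simulated depth image and its GAN-translated counterpart depict the same scene, so they should encode to nearby latent vectors. The sketch below illustrates one simple form of such a loss; the encoder and the plain MSE distance are assumptions, and in the actual method this term is optimized jointly with the DRL objective.

```python
# Minimal sketch of a latent-pairing loss for sim-to-real depth images.
# The encoder and distance metric are assumptions, not the paper's exact setup.
import torch
import torch.nn.functional as F

def latent_pairing_loss(encoder, sim_depth, gan_depth):
    z_sim = encoder(sim_depth)   # latent of the raw simulated depth image
    z_gan = encoder(gan_depth)   # latent of its GAN-translated counterpart
    # Pull paired latents together so the downstream DRL policy sees
    # domain-invariant features; trained jointly with the RL loss.
    return F.mse_loss(z_sim, z_gan)
```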
Balancing accuracy and speed is crucial for semantic segmentation in autonomous driving. While various mechanisms have been explored to enhance segmentation accuracy in lightweight deep-learning networks, adding more mechanisms does not always lead to better performance and often significantly increases processing time. This paper investigates a more effective and efficient integration of three key mechanisms (context, attention, and boundary) to improve real-time semantic segmentation of road scene images. Based on an analysis of recent fully convolutional encoder-decoder networks, we propose a novel Scale-adaptive Attention and Boundary Aware (SABA) segmentation network. SABA enhances context through a new pyramid structure with multi-scale residual learning, refines attention via scale-adaptive spatial relationships, and improves boundary delineation using progressive refinement with a dedicated loss function and learnable weights. Evaluations on the Cityscapes benchmark show that SABA outperforms current real-time semantic segmentation networks, achieving a mean intersection over union (mIoU) of up to 76.7% and improving accuracy for 17 out of 19 object classes. Moreover, it achieves this accuracy at an inference speed of up to 83.4 frames per second, significantly exceeding real-time video frame rates. The code is available at https://***/liuchunyan66/SABA.
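The abstract mentions a dedicated loss with learnable weights but does not define it. One standard way to learn per-term weights across segmentation and boundary losses is homoscedastic uncertainty weighting, sketched below as an assumed (not confirmed) reading of that design.

```python
# Assumed sketch: learnable per-term loss weights via uncertainty weighting
# (Kendall & Gal style). Not confirmed to be SABA's actual formulation.
import torch
import torch.nn as nn

class WeightedSegLoss(nn.Module):
    def __init__(self, n_terms=2):
        super().__init__()
        # One learnable log-variance per loss term (e.g. segmentation, boundary).
        self.log_vars = nn.Parameter(torch.zeros(n_terms))

    def forward(self, losses):
        # losses: list of scalar tensors, one per term.
        total = 0.0
        for i, loss in enumerate(losses):
            precision = torch.exp(-self.log_vars[i])   # learned weight
            total = total + precision * loss + self.log_vars[i]  # regularizer
        return total
```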
PM2.5 air pollution poses a significant threat to public health and the ecological environment, and there is an urgent need for accurate PM2.5 prediction models to support decision-making and reduce risks. This review comprehensively explores progress in PM2.5 concentration prediction, covering bibliometric trends, time-series data characteristics, deep learning applications, and future development directions. It draws on 2,327 journal articles published from 2014 to 2024, retrieved from the Web of Science (WOS) database. Bibliometric analysis shows that research output is growing rapidly, with China and the United States playing leading roles, and that recent research increasingly focuses on data-driven methods such as deep learning. Key data sources include ground monitoring, meteorological observations, remote sensing, and socioeconomic activity data. Deep learning models (including CNN, RNN, LSTM, and Transformer architectures) perform well in capturing complex temporal dependencies; with its self-attention mechanism and parallel processing capabilities, the Transformer is particularly effective at long-sequence modeling. Despite these advances, challenges such as data integration, model interpretability, and computational cost remain. Emerging techniques such as meta-learning, graph neural networks, and multi-scale modeling offer promising solutions, while integrating prediction models into real-world applications such as smart-city systems can enhance practical impact. This review offers an informative guide for researchers and newcomers, covering cutting-edge methods, practical applications, and systematic learning paths, and aims to promote the development of robust and efficient prediction models that contribute to global air pollution management and public health protection.
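To make the review's point about self-attention concrete, the sketch below shows the kind of minimal Transformer-based forecaster it discusses: an encoder over a window of past PM2.5 (and covariate) readings that predicts the next concentration. All dimensions and the single-step prediction head are illustrative choices, not drawn from any specific surveyed model.

```python
# Illustrative Transformer encoder for PM2.5 time-series forecasting.
# Dimensions and the one-step head are arbitrary demonstration choices.
import torch
import torch.nn as nn

class PM25Transformer(nn.Module):
    def __init__(self, n_features=8, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(n_features, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=128, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)

    def forward(self, x):                # x: (batch, time, n_features)
        h = self.encoder(self.embed(x))  # self-attention over the full window
        return self.head(h[:, -1])       # forecast from the last time step
```

Because self-attention relates every time step to every other in parallel, such models handle long input windows more gracefully than strictly sequential RNN/LSTM recurrences, which is the advantage the review highlights.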
Deep neural networks (DNNs) have shown remarkable performance in solving a wide variety of real-world problems, ranging from image recognition to natural language processing and self-driving vehicles. In principle, the achievements of DNNs are mainly attributable to their deep architectures, which can learn meaningful representations at different levels of abstraction and thereby greatly enhance the performance of subsequent machine-learning algorithms. However, manually designing an optimal deep architecture for a particular problem requires rich knowledge of both the problem domain and DNNs, which not every end user interested in this area possesses.
The future direction of global automotive development is electrification, and the battery current collector (BCC) is an essential component of new-energy vehicle batteries. However, welding defects in the BCC are characterized by a disorganized distribution, extensive size variation, multiple types, and ambiguous features, posing challenges for detection. This article proposes a lightweight deep-learning algorithm called MGNet for detecting welding defects in current collectors. We introduce a lightweight MDM module based on multiscale channels, which uses deep dynamic convolutions as its basic structure to extract discriminative features while reducing computational complexity. We also propose a lightweight feature fusion network called GS_GFPN, which fully leverages the semantic information of the backbone network's feature maps while reducing parameter redundancy and maintaining detection accuracy. Experimental evaluations on both the BCC surface defect database and the publicly available Northeastern University (NEU) surface defect database demonstrate that MGNet outperforms existing methods with significant improvements in detection accuracy: the mean average precision (mAP) at an IoU threshold of 0.5 is 93.9% and 78.0% on the two databases, respectively, with frame rates of 212.8 and 238.1 FPS and a model weight of only 3.1 M. Moreover, the algorithm has been successfully deployed on the NVIDIA Jetson Nano embedded device, enabling real-time defect detection in practical industrial applications.
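The MDM module's deep dynamic convolutions are not defined in the abstract; the sketch below shows the generic dynamic-convolution idea, an input-conditioned, attention-weighted mix of K candidate kernels, as one plausible reading. The per-sample loop favors clarity over speed.

```python
# Generic dynamic convolution: mix K candidate kernels with input-dependent
# attention weights. One plausible reading of "deep dynamic convolutions",
# not MGNet's confirmed design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicConv(nn.Module):
    def __init__(self, in_ch, out_ch, k=3, n_kernels=4):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(n_kernels, out_ch, in_ch, k, k) * 0.02)
        self.attn = nn.Linear(in_ch, n_kernels)
        self.pad = k // 2

    def forward(self, x):                       # x: (B, in_ch, H, W)
        # Input-dependent mixing weights over the K candidate kernels,
        # computed from globally pooled features.
        a = torch.softmax(self.attn(x.mean(dim=(2, 3))), dim=1)  # (B, K)
        out = []
        for i in range(x.size(0)):              # per-sample kernel mix
            w = (a[i].view(-1, 1, 1, 1, 1) * self.weight).sum(0)
            out.append(F.conv2d(x[i:i + 1], w, padding=self.pad))
        return torch.cat(out, dim=0)
```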