Deep learning (DL)-based systems have emerged as powerful methods for the diagnosis and treatment of plant stress, offering high accuracy and efficiency in analyzing imagery data. This review paper aims to present a thorough overview of state-of-the-art DL technologies for plant stress detection. For this purpose, a systematic literature review was conducted to identify relevant articles highlighting the technologies and approaches currently employed in the development of DL-based plant stress detection systems, specifically advances in image-based data collection systems, image preprocessing techniques, and deep learning algorithms and their applications in plant stress classification, disease detection, and segmentation tasks. Additionally, this review emphasizes the challenges and future directions in collecting and preprocessing image data, model development, and deployment in real-world agricultural settings. Some of the key findings from this review paper are: Training data: (i) Most plant stress detection models have been trained on Red Green Blue (RGB) images; (ii) Data augmentation can increase both the quantity and variation of training data; (iii) Handling multimodal inputs (e.g., image, temperature, humidity) allows the model to leverage information from diverse sources, which can improve prediction accuracy. Model design and efficiency: (i) Self-supervised learning (SSL)- and few-shot learning (FSL)-based methods may be better than transfer learning (TL)-based models for classifying plant stress when labeled training images are scarce; (ii) Custom-designed DL architectures for a specific stress and plant type can outperform state-of-the-art DL architectures in terms of efficiency, overfitting, and accuracy; (iii) The multi-task learning DL structure reuses most of the network architecture while performing multiple tasks (e.g., estimating stress type and severity) simultaneously, which makes the learning much
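Finding (ii) above, that data augmentation increases both the quantity and variation of training data, can be illustrated with a minimal sketch. The snippet below is an illustrative toy using numpy; the `augment` helper is a hypothetical assumption, not code from any reviewed system:

```python
import numpy as np

def augment(img, rng):
    """Create a new training sample by randomly flipping and rotating an image."""
    if rng.random() < 0.5:
        img = np.fliplr(img)       # horizontal flip with probability 0.5
    k = int(rng.integers(0, 4))    # 0-3 quarter turns
    return np.rot90(img, k)

rng = np.random.default_rng(0)
img = np.arange(16).reshape(4, 4)  # stand-in for an RGB crop
aug = augment(img, rng)
print(aug.shape)  # (4, 4): augmentation keeps the spatial size
```

Each call yields a geometrically varied copy of the same labeled sample, which is why augmentation multiplies effective dataset size without new field imagery.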
When considering sparse motion capture marker data, one typically struggles to balance its overfitting via a high dimensional blendshape system versus underfitting caused by smoothness constraints. With the current trend towards using more and more data, our aim is not to fit the motion capture markers with a parameterized (blendshape) model or to smoothly interpolate a surface through the marker positions, but rather to find an instance in the high resolution dataset that contains local geometry to fit each marker. Just as is true for typical machine learning applications, this approach benefits from a plethora of data, and thus we also consider augmenting the dataset via specially designed physical simulations that target the high resolution dataset such that the simulation output lies on the same so-called manifold as the data targeted.
At present, the use of industrial robots combined with vision systems to dynamically grasp materials from a belt is becoming increasingly widespread. Unlike industrial robots that grab only after the belt stops, industrial robots that dynamically track and grab materials on a moving belt can greatly improve production efficiency. For vision, processing distorted material images captured during high-speed movement to obtain accurate coordinate points of the materials is a key task for improving the accuracy of belt-tracking applications with industrial robots. Developing image processing algorithms based on MATLAB can, on the one hand, utilize existing software and hardware interface functions to improve development efficiency; on the other hand, autonomous and controllable image processing algorithms can be developed based on application requirements to maximize system accuracy.
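The core vision step described above, locating a material on the moving belt and compensating for its motion before the robot picks it, can be sketched minimally. The snippet below uses Python/numpy rather than MATLAB purely for illustration; the `material_centroid` helper, the constant belt-speed model, and all numbers are hypothetical assumptions:

```python
import numpy as np

def material_centroid(mask, belt_speed, latency):
    """Estimate the pick point: the blob centroid shifted by the distance the
    belt travels (along the x axis) during the image-processing delay."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    return float(cx + belt_speed * latency), float(cy)

mask = np.zeros((10, 10), dtype=bool)
mask[4:6, 2:4] = True                        # a small segmented material blob
print(material_centroid(mask, belt_speed=100.0, latency=0.01))  # (3.5, 4.5)
```

A real system would add lens-distortion correction and a calibrated pixel-to-world transform before this step.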
ISBN:
(Print) 9798350351439; 9798350351422
In this study, we investigate the Deep Image Prior (DIP) for enhancing image smoothing, a crucial component in numerous computer vision and graphics applications. Although deep learning has demonstrated remarkable achievements in these domains, it often falls short in flexibility and controllability, in contrast to traditional methods, which are more adaptable but typically exhibit subpar performance. Notably, some end-to-end deep learning models offer control over edge preservation, yet their performance remains marginally suboptimal. To address this shortcoming, we introduce an innovative network architecture that diverges from the traditional U-Net model, featuring a Laplacian pyramid as the encoder and a deep decoder as the decoding component, integrated with a bilateral filter loss to improve DIP. This design aids the network in rapidly assimilating essential low-frequency information. Our approach excels at retaining texture details, significantly improving image smoothing and related tasks beyond the capabilities of standard DIP methods. Moreover, our technique outperforms the leading unsupervised method, pyramid texture filtering, in texture filtering tasks and other applications.
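The Laplacian-pyramid encoder idea, separating an image into a coarse low-frequency base plus per-scale residuals, can be sketched without any network. The following numpy toy is an assumption for illustration (simple 2x block averaging, not the paper's actual filters) and shows why the decomposition is exactly invertible:

```python
import numpy as np

def build_laplacian_pyramid(img, levels):
    """Decompose img into band-pass residuals plus a coarse low-frequency base."""
    pyramid, cur = [], img.astype(float)
    for _ in range(levels):
        # downsample by 2x2 block averaging, then upsample by replication
        down = cur.reshape(cur.shape[0] // 2, 2, cur.shape[1] // 2, 2).mean(axis=(1, 3))
        up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)
        pyramid.append(cur - up)   # high-frequency residual at this scale
        cur = down
    pyramid.append(cur)            # low-frequency base
    return pyramid

def reconstruct(pyramid):
    cur = pyramid[-1]
    for residual in reversed(pyramid[:-1]):
        cur = np.repeat(np.repeat(cur, 2, axis=0), 2, axis=1) + residual
    return cur

img = np.random.default_rng(1).random((8, 8))
pyr = build_laplacian_pyramid(img, 2)
print(np.allclose(reconstruct(pyr), img))  # True: decomposition is lossless
```

The base level carries exactly the low-frequency information the network is said to assimilate quickly.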
ISBN:
(Print) 9798350318920; 9798350318937
Vision Transformers (ViTs) have shown impressive performance in computer vision, but their high computational cost, quadratic in the number of tokens, limits their adoption in computation-constrained applications. However, this large number of tokens may not be necessary, as not all tokens are equally important. In this paper, we investigate token pruning to accelerate inference for object detection and instance segmentation, extending prior works from image classification. Through extensive experiments, we offer four insights for dense tasks: (i) tokens should not be completely pruned and discarded, but rather preserved in the feature maps for later use; (ii) reactivating previously pruned tokens can further enhance model performance; (iii) a dynamic pruning rate based on images is better than a fixed pruning rate; (iv) a lightweight, 2-layer MLP can effectively prune tokens, achieving accuracy comparable with complex gating networks with a simpler design. We assess the effects of these design decisions on the COCO dataset and introduce an approach that incorporates these findings, showing a reduction in performance decline from ~1.5 mAP to ~0.3 mAP in both boxes and masks, compared to existing token pruning methods. In relation to the dense counterpart that utilizes all tokens, our method realizes an increase in inference speed, achieving up to 34% faster performance for the entire network and 46% for the backbone. Code: https://***/uzh-rpg/svit/
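Insight (iv), scoring tokens with a lightweight 2-layer MLP and keeping only the top-scoring ones, can be sketched with plain numpy. The weights here are random and the `prune_tokens` helper is a hypothetical illustration, not the paper's implementation:

```python
import numpy as np

def prune_tokens(tokens, W1, W2, keep):
    """Score tokens with a 2-layer MLP and keep the `keep` highest-scoring ones;
    per insight (i), a detector would keep pruned tokens in the feature map."""
    h = np.maximum(tokens @ W1, 0.0)               # hidden layer with ReLU
    scores = (h @ W2).ravel()                      # one importance score per token
    keep_idx = np.sort(np.argsort(scores)[-keep:]) # indices of top-`keep` tokens
    return tokens[keep_idx], keep_idx

rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 8))              # 16 tokens, 8-dim features
W1, W2 = rng.standard_normal((8, 4)), rng.standard_normal((4, 1))
kept, idx = prune_tokens(tokens, W1, W2, keep=4)
print(kept.shape)  # (4, 8)
```

Since attention cost is quadratic in token count, pruning 16 tokens down to 4 would cut that term by roughly 16x in this toy setting.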
Images captured under poor illumination conditions often display poor contrast, low brightness, a narrow gray range, colour distortions, and considerable interference, which seriously affect the qualitative visual effect on human eyes and severely restrict the efficiency of several machine vision systems. In addition, underwater images often suffer from colour shift and contrast degradation because of the absorption and scattering of light travelling in water. These unpleasant effects limit visibility, reduce contrast, and even generate colour casts that limit the use of underwater images and videos in marine archaeology and biology. In medical imaging applications, medical images are important tools for detecting and diagnosing several medical conditions and ailments. However, the quality of medical images can often be degraded during image acquisition due to factors such as noise interference, artefacts, and poor illumination. This may lead to the misdiagnosis of medical conditions, which can further aggravate life-threatening situations. Image enhancement is one of the most important technologies in the field of image processing, and its purpose is to improve the quality of images for specific applications. In general, the basic principle of image enhancement is to improve the quality and visual interpretability of an image so that it is more suitable for the specific applications and observers. Over the last few decades, numerous image enhancement techniques have been proposed in the literature. This study covers a systematic survey of existing state-of-the-art image enhancement techniques, broadly classified by their algorithms. In addition, this paper summarises the datasets utilised in the literature for performing the experiments. Furthermore, attention has been drawn to several evaluation parameters for quantitative evaluation, and different state-of-the-art algorithms are compared for performance analysis on benchmark
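A classical example of the enhancement techniques surveyed here is linear contrast stretching, which widens the narrow gray range of a low-light image. The snippet below is a minimal numpy sketch under assumed percentile cut-offs, not any specific surveyed algorithm:

```python
import numpy as np

def stretch_contrast(img, low_pct=2, high_pct=98):
    """Linearly map the [low_pct, high_pct] percentile range to [0, 255]."""
    lo, hi = np.percentile(img, [low_pct, high_pct])
    out = (img.astype(float) - lo) / max(hi - lo, 1e-6)
    return np.clip(out * 255.0, 0, 255).astype(np.uint8)

# a synthetic "low-light" image whose gray values sit in the narrow band 40-79
dark = np.random.default_rng(0).integers(40, 80, size=(64, 64)).astype(np.uint8)
bright = stretch_contrast(dark)
print(int(bright.min()), int(bright.max()))  # 0 255: full gray range recovered
```

Clipping a small percentile at each end makes the stretch robust to a few outlier pixels, a common design choice in practice.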
Bank cheques are used mainly for financial transactions, due to which they are processed in enormous amounts on a daily basis around the globe. Often, cheque execution time and expenses can be saved if the whole process of recognition and verification of the cheque becomes automatic. Automatic bank cheque processing is an emerging research field in the areas of computer vision, image processing, pattern recognition, machine learning, and deep learning. The article emphasizes the stages of image acquisition, pre-processing, and extraction and recognition in the automatic bank cheque processing system. This paper describes the various steps involved in the system of automatic data extraction. It further classifies and examines existing challenges in different stages of automated processing of bank cheques. An attempt is made in this paper to present state-of-the-art techniques for the automatic processing of bank cheque images. The categories and sub-categories of various fields related to bank cheque images are illustrated, benchmark datasets are enumerated, and the performance of the most representative approaches is compared. Moreover, it also contains some information about the products available in the market for automatic cheque processing. This review provides a fundamental comparison and analysis of the remaining problems in the field. It is found that a multilayer feed-forward neural network gave an accuracy of 97.31% for payee's name recognition; HMM-MLP gave an accuracy of 95.5% for the date recognition system. In the courtesy and legal amount system, DNN gave an accuracy of 98.5% for digit recognition, MLP gave an accuracy of 93.2% for the courtesy amount, and MQDF gave an accuracy of 97.04% for the legal amount. Further, the SVM classifier gave an accuracy of 99.13% for signature recognition, and deep learning-based Convolutional Neural Networks (CNN) gave an accuracy of 99.14% for handwritten numeric character recognition. This survey paper
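A typical pre-processing step in such cheque pipelines is binarization, for example Otsu's method, which separates ink strokes from the paper background before extraction and recognition. The following numpy sketch is a generic illustration; the bimodal test image and the exhaustive threshold search are assumptions, not taken from any surveyed system:

```python
import numpy as np

def otsu_threshold(img):
    """Pick the gray level maximizing between-class variance (ink vs background)."""
    prob = np.bincount(img.ravel(), minlength=256) / img.size
    levels = np.arange(256)
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w0, w1 = prob[:t].sum(), prob[t:].sum()
        if w0 == 0 or w1 == 0:
            continue                             # all pixels fall on one side
        mu0 = (levels[:t] * prob[:t]).sum() / w0  # mean of the dark class
        mu1 = (levels[t:] * prob[t:]).sum() / w1  # mean of the bright class
        var = w0 * w1 * (mu0 - mu1) ** 2          # between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t

img = np.full((20, 20), 50, dtype=np.uint8)      # dark "ink" half
img[:, 10:] = 200                                # bright "paper" half
print(otsu_threshold(img))  # 51: first level separating the two modes
```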
We propose a complex-amplitude diffractive processor based on diffractive deep neural networks (D2NNs). By precisely controlling the propagation of an optical field, it can effectively remove the motion blur in numeral images and realize their restoration. Comparative analysis of phase-only, amplitude-only, and complex-amplitude diffractive processors reveals that the complex-amplitude network significantly enhances the performance of the processor and improves the peak signal-to-noise ratio (PSNR) of the images. Appropriate use of complex-amplitude networks helps reduce the number of network layers and alleviates alignment difficulties. Due to their fast processing speed and low power consumption, complex-amplitude diffractive processors hold potential for applications in various fields including road monitoring, sports photography, satellite imaging, and medical diagnostics. (c) 2024 Optica Publishing Group. All rights, including for text and data mining (TDM), Artificial Intelligence (AI) training, and similar technologies, are reserved.
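The restoration metric used above, PSNR, has a standard closed form: PSNR = 10 · log10(peak² / MSE). A minimal numpy sketch follows; the constant images are synthetic assumptions chosen only to make the arithmetic checkable:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between reference and restored images."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak**2 / mse)

ref = np.full((8, 8), 100.0)
noisy = ref + 10.0                 # uniform error of 10 gray levels -> MSE = 100
print(round(psnr(ref, noisy), 2))  # 28.13
```

Higher is better; a few dB of PSNR gain, as reported for the complex-amplitude network, corresponds to a multiplicative reduction in mean squared error.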
In the field of computer vision, the task of facial super-resolution (FSR) is crucial for applications such as surveillance and photo restoration. However, factors such as noise and artifacts in real-world scenarios s...
Image classification is one of the main parts of computer vision, which is important in applications like self-driving vehicle systems. Working with image/video data requires huge amounts of resources...