Recent years have seen significant progress in deep learning-based finger vein pattern extraction methods, but two issues remain to be addressed. First, a model trained on a single finger vein dataset generalizes poorly, and its performance is limited by the image quality of that dataset; second, deep models struggle to extract finger vein patterns in real time because of their large number of parameters. To address these issues, we propose a novel lightweight domain-adaptive segmentation framework (Lite-HDNet) that learns a generic representation of different domains to improve the extraction of finger vein patterns. We propose a multi-domain feature knowledge transfer strategy and a domain migration loss converter that enable the trunk network to learn robust representations of different finger vein datasets and to compensate for the heterogeneity between them. In the proposed framework, two lightweight segmentation networks are designed as the trunk branch and the auxiliary branch to achieve real-time extraction of finger vein patterns. Our approach has been extensively tested on four publicly available finger vein datasets, and the results show that Lite-HDNet not only improves segmentation performance on all datasets but also effectively reduces heterogeneity between different domains. In addition, we validated the real-time performance of Lite-HDNet on NVIDIA embedded terminals, showing that our approach outperforms previous lightweight segmentation networks.
In the field of traffic management and control systems, we are witnessing a symbiotic evolution in which intelligent infrastructure progressively collaborates with smart vehicles to benefit traffic monitoring and security by rapidly identifying hazardous behaviours. This rapid growth is due to the development of deep learning in recent years, as well as improvements in computer vision models. These technologies allow monitoring tasks to be carried out without installing numerous sensors or stopping the traffic, using the extensive network of surveillance cameras already present on roads worldwide. This study proposes a computer vision-based solution that allows real-time processing of video streams on edge computing devices, eliminating the need for Internet connectivity or dedicated sensors. The proposed system employs deep learning algorithms and vision techniques that perform vehicle detection, classification, tracking, speed estimation, and vehicle geolocation.
The objective of this study is to develop a systematic and novel workflow for the automated and objective characterization of carbonate reservoirs with the help of deep learning architectures. An image database of more than 6,000 carbonate thin-section images was generated using an optical microscope and image augmentation techniques. Five features, namely clay/silt/mineral, calcite, pores, fossils, and opaque minerals, were identified through manual petrography of the thin sections under the microscope. A total of four deep learning models were developed: U-Net, U-Net with a ResNet34 backbone, U-Net with a MobileNetv2 backbone, and LinkNet with a ResNet34 backbone. The ensemble model of U-Net + ResNet34 and U-Net + MobileNetv2 yielded the highest intersection over union (IoU) score of 75%, followed by the U-Net + ResNet34 model with an IoU score of 61%. The models struggled with class imbalance, which was very prominent in the image database, with classes such as fossils and opaques considered to be rare. The statistical analysis of the relative errors revealed that the major classes play a more important role in increasing the final IoU score, as opposed to the common understanding that the rare classes affect the model performance. The novel workflow developed in this paper can be extended to real carbonate reservoirs for time-efficient, objective, and accurate characterization.
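For readers unfamiliar with the IoU score reported above, the following is a minimal sketch of how per-class and mean IoU can be computed from integer label masks. The function names, the toy masks, and the choice to average classes with equal weight are illustrative assumptions; the abstract does not state how the authors aggregated IoU across classes.

```python
import numpy as np


def iou_per_class(pred, target, num_classes):
    """Return per-class IoU values; NaN marks a class absent from both masks."""
    scores = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            scores.append(float("nan"))
        else:
            inter = np.logical_and(pred_c, target_c).sum()
            scores.append(float(inter / union))
    return scores


def mean_iou(pred, target, num_classes):
    """Average the per-class IoU values, ignoring classes absent from both masks."""
    return float(np.nanmean(iou_per_class(pred, target, num_classes)))


# Toy 2x3 masks with three hypothetical classes (0, 1, 2).
pred = np.array([[0, 0, 1], [1, 1, 2]])
target = np.array([[0, 1, 1], [1, 1, 2]])

print(iou_per_class(pred, target, 3))  # [0.5, 0.75, 1.0]
print(mean_iou(pred, target, 3))       # 0.75
```

With equal-weight averaging, a rare class counts as much as a frequent one. An aggregate that pools pixels across classes would instead be dominated by the major classes.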
Smart education environments combine technologies such as big data, cloud computing, and artificial intelligence to optimize and personalize the teaching and learning process, thereby improving the efficiency and quality of education. This article proposes a dual-stream-coded image sentiment analysis method based on both facial expressions and background actions to monitor and analyze learners' behaviors in real time. By integrating human facial expressions and scene backgrounds, the method can effectively address the occlusion problem in uncontrolled environments. To enhance the accuracy and efficiency of emotion recognition, a multi-task convolutional network is employed for face extraction, while 3D convolutional neural networks optimize the extraction of facial features. Additionally, the adaptive learning screen adjustment system proposed in this article monitors learners' expressions and reactions in real time and dynamically adjusts the presentation of learning content to optimize the learning environment and enhance learning efficiency. In experiments on the Emotic dataset, the emotion recognition model in this article shows high accuracy, especially in the recognition of specific emotion categories. This research contributes to the field of smart education environments by providing an effective solution for real-time emotion recognition.
With the rise of artificial intelligence, deep learning techniques are increasingly being used in real-life applications, especially in image processing. People have started to use image processing techniques based o...
CAD plays an important role in current product form recognition. How to accurately identify the product form and improve the design efficiency has become an urgent demand for garment CAD design. This article aims to e...
Tiny machine learning (TML) is a new research area whose goal is to design machine and deep learning (DL) techniques able to operate in embedded systems and Internet-of-Things (IoT) units, hence satisfying the severe technological constraints on memory, computation, and energy characterizing these pervasive devices. Interestingly, the related literature has mainly focused on reducing the computational and memory demand of the inference phase of machine and deep learning models. At the same time, the training is typically assumed to be carried out in cloud or edge computing systems (due to the larger memory and computational requirements). This assumption results in TML solutions that might become obsolete when the process generating the data is affected by concept drift (e.g., due to periodicity or seasonality effects, faults or malfunctions affecting sensors or actuators, or changes in the users' behavior), a common situation in real-world application scenarios. For the first time in the literature, this article introduces a TML for concept drift (TML-CD) solution based on deep learning feature extractors and a k-nearest neighbors (k-NN) classifier integrating a hybrid adaptation module able to deal with concept drift affecting the data-generating process. This adaptation module continuously updates (in a passive way) the knowledge base of TML-CD and, at the same time, employs a change detection test (CDT) to inspect for changes (in an active way) to quickly adapt to concept drift by removing obsolete knowledge. Experimental results on both image and audio benchmarks show the effectiveness of the proposed solution, whilst the porting of TML-CD to three off-the-shelf micro-controller units (MCUs) shows the feasibility of the proposed approach in real-world pervasive systems.
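The hybrid passive/active adaptation described above can be illustrated with a small sketch. This is not the authors' TML-CD implementation: the class name `DriftAwareKNN`, the bounded first-in-first-out knowledge base, and the windowed error-rate threshold standing in for the change detection test are all assumptions made for illustration. Inputs are assumed to be feature vectors already produced by some extractor, with a label available for each sample.

```python
from collections import Counter, deque

import numpy as np


class DriftAwareKNN:
    """Toy k-NN with passive updates and an error-rate change detector (hypothetical)."""

    def __init__(self, k=3, max_size=40, window=20, threshold=0.5):
        self.k = k
        self.max_size = max_size            # memory budget of the knowledge base
        self.threshold = threshold          # error rate that signals a change
        self.X, self.y = [], []             # stored feature vectors and labels
        self.errors = deque(maxlen=window)  # recent 0/1 prediction errors

    def predict(self, x):
        if not self.X:
            return None
        dists = np.linalg.norm(np.asarray(self.X) - np.asarray(x, dtype=float), axis=1)
        # Stable sort keeps tie-breaking deterministic (oldest sample first).
        nearest = np.argsort(dists, kind="stable")[: self.k]
        votes = Counter(self.y[i] for i in nearest)
        return votes.most_common(1)[0][0]

    def update(self, x, label):
        """Process one labelled sample; return True if a change was detected."""
        pred = self.predict(x)
        if pred is not None:
            self.errors.append(int(pred != label))
        window_full = len(self.errors) == self.errors.maxlen
        changed = window_full and sum(self.errors) / len(self.errors) > self.threshold
        if changed:
            # Active adaptation: discard knowledge from the obsolete concept.
            self.X.clear()
            self.y.clear()
            self.errors.clear()
        # Passive adaptation: always absorb the newest sample.
        self.X.append(np.asarray(x, dtype=float))
        self.y.append(label)
        if len(self.X) > self.max_size:
            self.X.pop(0)
            self.y.pop(0)
        return changed


# Deterministic toy stream: labels flip halfway through (abrupt concept drift).
pattern = [-2.0, 2.0, -1.0, 1.0]
model = DriftAwareKNN()
detections = []
for step in range(80):
    value = pattern[step % 4]
    label = int(value > 0) if step < 40 else int(value < 0)
    if model.update([value], label):
        detections.append(step)

print(detections)
print(model.predict([-2.0]), model.predict([2.0]))
```

A production change detection test would typically be statistical rather than a fixed threshold. The sketch only shows how passive accumulation and active forgetting can coexist within a fixed memory budget.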
The advent of deep learning has revolutionized computer vision, enabling real-time analysis crucial for traffic management and vehicle identification. This research introduces a system combining vehicle make and model detection with Automatic Number Plate Recognition (ANPR), achieving a groundbreaking 97.5% accuracy rate. Unlike traditional methods, which focus on either make and model detection or ANPR independently, this study integrates both aspects into a single, cohesive system, providing a more holistic and efficient solution for vehicle identification and ensuring robust performance even in adverse weather conditions. The paper explores the use of deep learning techniques together with OpenCV and the Python programming language. Leveraging MobileNet-V2 and YOLOx (You Only Look Once) for vehicle identification, and YOLOv4-tiny, Paddle OCR (optical character recognition), and SVTR-tiny for ANPR, the system was rigorously tested at Firat University's entrance with a thousand images captured under various conditions such as fog, rain, and low light. The system's success rate in these tests highlights its robustness and practical applicability. Additionally, experiments evaluate the system's accuracy and effectiveness, using Gradient-weighted Class Activation Mapping (Grad-CAM) to gain insights into the neural networks' decision-making processes and identify areas for improvement, particularly in misclassifications. The implications of this research for computer vision are significant, paving the way for advanced applications in autonomous driving, traffic management, stolen vehicle detection, and security surveillance. Achieving real-time, high-accuracy vehicle identification, the integrated Vehicle Make and Model Recognition (VMMR) and ANPR system sets a new standard for future research in the field.
Image dehazing is an emerging research area in computer vision and image processing. Dense fog, atmospheric scattering, and haze in the environment degrade captured images and make it difficult to retrieve the actual information of the original image. Conventional approaches suffer from high computational complexity and distort the images with artifacts such as over-saturation and halos. Recent methods restore haze-free images by combining physical models with learning methods. In single-image dehazing, it remains very challenging to preserve fine image details while reducing the fog. With advances in deep learning, mostly Convolutional Neural Network (CNN)-aided approaches have been developed for single-image dehazing. However, residual haze and slow training convergence are the two main drawbacks of these conventional dehazing networks. To deal with these problems, a new approach is proposed for the restoration of haze-free images. The hazy images are gathered from standard datasets. First, an Adaptive Discrete Wavelet Transform (ADWT) is used to decompose the images, where the ADWT is implemented with Hybrid African Vultures Fire Fly Optimization (HAVFFO). Next, image dehazing is performed by an Optimized Residual-Based deep CNN (OR-deep CNN), whose hyperparameters are optimized by the same HAVFFO. Finally, the haze-free images are restored through an adaptive inverse DWT. In the performance analysis, the recommended model performs better in both quantitative and visual terms on online resources.
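The decompose, process, and reconstruct structure of the pipeline above can be sketched with a plain one-level 2-D Haar wavelet transform. This stands in for the paper's adaptive DWT, which is tuned by HAVFFO and is not reproduced here; the function names and the subband labels are assumptions, and subband naming conventions vary between libraries.

```python
import numpy as np


def haar_dwt2(img):
    """One-level 2-D Haar DWT of a float array with even height and width."""
    a = img[0::2, 0::2]  # top-left pixel of each 2x2 block
    b = img[0::2, 1::2]  # top-right
    c = img[1::2, 0::2]  # bottom-left
    d = img[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (low-pass in both directions)
    lh = (a - b + c - d) / 2.0  # detail: differences across columns
    hl = (a + b - c - d) / 2.0  # detail: differences across rows
    hh = (a - b - c + d) / 2.0  # detail: diagonal differences
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2 and rebuild the full-resolution image."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w), dtype=float)
    out[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out


# Toy 4x4 "image"; in a dehazing pipeline the subbands would be processed
# by the network before the inverse transform is applied.
img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
restored = haar_idwt2(ll, lh, hl, hh)
print(np.allclose(restored, img))
```

Because the transform is exactly invertible, any change in the reconstructed image comes from whatever processing is applied to the subbands in between.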
ISBN: (print) 9798350344868; 9798350344851
This paper proposes a novel logo image recognition approach incorporating a localization technique based on reinforcement learning. Logo recognition is an image classification task identifying a brand in an image. As the size and position of a logo vary widely from image to image, it is necessary to determine its position for accurate recognition. However, because there is no annotation for the position coordinates, it is impossible to train and infer the location of the logo in the image. Therefore, we propose a deep reinforcement learning localization method for logo recognition (RL-LOGO). It utilizes deep reinforcement learning to identify a logo region in images without annotations of the positions, thereby improving classification accuracy. We demonstrated a significant improvement in accuracy compared with existing methods in several published benchmarks. Specifically, we achieved an 18-point accuracy improvement over competitive methods on the complex dataset Logo-2K+. This demonstrates that the proposed method is a promising approach to logo recognition in real-world applications.