Several sign languages (SL) are used around the world, and Baby Sign Language (BSL) is communication between parents and babies through gestures. Communication by gestures is a non-verbal process that uses motion to convey facts, expressions, and feelings. SL is a communication mode in which information is conveyed through the movement of body parts such as the cheeks, eyebrows, and head. Although much SL research is available, BSL research remains a challenge. Hence, this paper presents an optimization-based automated deep BSL recognition system, which determines the gesture signalled by the child. First, image frames are extracted from the videos and data augmentation is performed. After pre-processing, features are extracted from the frames using the Enhanced Convolution Neural Network (ECNN). The optimal features are then selected by a new Life Choice Based Optimizer (LCBO). Finally, classification is carried out by the Deep Long Short-Term Memory (DLSTM) scheme. The implementation is in Python, and performance is evaluated using metrics such as accuracy, precision, kappa, F1-score, and recall. The proposed approach (ECNN-DLSTM) is compared with several deep and machine learning approaches and obtains an accuracy of 99% and a kappa of 96%.
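The evaluation metrics named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `classification_metrics`, the macro-averaging of precision, recall, and F1, and the toy gesture labels are assumptions made for illustration only. Kappa here is Cohen's kappa, computed as (observed agreement - chance agreement) / (1 - chance agreement).

```python
from collections import Counter


def classification_metrics(y_true, y_pred):
    """Return accuracy, macro precision/recall/F1, and Cohen's kappa.

    Hypothetical helper for illustration; macro-averaging is an assumption,
    since the abstract does not say how per-class scores are combined.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    labels = sorted(set(y_true) | set(y_pred))
    pair_counts = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)

    # Observed agreement: fraction of samples on the confusion-matrix diagonal.
    accuracy = sum(pair_counts[(c, c)] for c in labels) / n

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pair_counts[(c, c)]
        p = tp / pred_counts[c] if pred_counts[c] else 0.0
        r = tp / true_counts[c] if true_counts[c] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)

    # Chance agreement from the marginal label frequencies.
    expected = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    # Degenerate case (a single label everywhere): kappa is undefined;
    # returning 1.0 is a convention chosen for this sketch.
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 1.0

    k = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / k,
        "recall": sum(recalls) / k,
        "f1": sum(f1s) / k,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Toy gesture labels, invented for the example.
    truth = ["milk", "milk", "more", "more"]
    preds = ["milk", "more", "more", "more"]
    print(classification_metrics(truth, preds))
```

Because kappa discounts agreement expected by chance, it is normally lower than accuracy on the same predictions, which is consistent with the abstract reporting 99% accuracy alongside a kappa of 96%.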
Vision systems play a pivotal role in the digitalization of manufacturing processes. They offer various benefits, such as quality control, process monitoring, and digitizing analog data. Developing vision systems can ...
Authors: Ngo, Ha Quang Thinh; Bui, The Tri
268 Ly Thuong Kiet Street District 10 Ho Chi Minh City Viet Nam
Linh Trung Ward Thu Duc City Ho Chi Minh City Viet Nam
Applying image processing to electromechanical systems is a problem of interest to scientists, in order to serve humans in many fields. To do that, there needs to be a connection between image processing and mechanica...
ISBN: (print) 9781665495158
Pigment epithelial detachment (PED) is a retinal disorder that occurs when the retinal pigment epithelium (RPE) cell layers at the back of the eye separate or tear. Bending of the retinal layers, together with the presence of fluid, proteins, tissue, or blood vessels, is a defining feature of PED, which occurs most frequently in the macula. PED can disturb vision, often producing dark shadows, blurred vision, or partial vision loss. Optical coherence tomography (OCT) is a trend-setting, high-resolution, non-invasive imaging modality that reveals the structure of the retina. OCT non-invasively yields cross-sectional image volumes of tissue. The major objective of this paper is to survey the state of the art and to classify techniques for retinal layer segmentation, PED fluid segmentation, and disease classification in retinal OCT images. The medical sector faces a growing number of critical patients, and eye-disease cases are rising toward double the current number. Artificial intelligence (AI) techniques help the health sector with accurate automatic disease detection. Image classification and pattern recognition with AI techniques are transforming the industry. Many studies employ image processing to aid the early diagnosis of this disease. Image processing techniques have advanced with the introduction of AI and machine learning. This review discusses the best available structure classification and image segmentation methods from existing research. It summarizes recent algorithms suited to applying machine learning to predict retinal diseases in OCT images. The algorithms discussed from existing research papers help readers identify the most accurate algorithm for classifying infected and normal eyes, with precision and less processing time for la
Image segmentation plays an important role in computer vision, and agriculture is one of its applications. The crop images captured in the vicinity are complex and dense. Hence, multilevel thresholding of...
Light Detection and Ranging (LiDAR) is becoming a critical requirement for future computer vision applications, such as AR/VR (iPhone LiDAR) and ADAS (automotive LiDAR). A depth point-cloud input has different characteristics from a conventional RGB image input, so the CNN depth-inference implementation differs from a standard super-resolution CNN (SR-CNN). In this brief, we present a heterogeneous AI-accelerator SoC that is specific to depth image completion. Three key innovations are introduced to improve the SoC's performance. First, to accommodate the unique data structure of a depth input, a fully-filled dataflow management engine is proposed to pre-process the RGB+Depth input, significantly improving processing element utilization (PEU). Second, to improve the efficiency of the CNN accelerator's instruction configuration, a hardware-tiling co-processor is proposed that performs the accelerator's tiling strategy and assigns each sub-job to the PE array directly, thereby reducing task-assignment time. Third, because post-processing in the neural network requires a large number of vector operations, a RISC-V core is incorporated to execute vector computations more efficiently. The SoC is implemented in a 40 nm CMOS process, achieving 2 TOPS/W energy efficiency with 34 fps throughput at VGA-resolution output for real-time LiDAR systems.
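The idea of tiling a job and handing each sub-job to a processing element (PE) can be sketched in software. The abstract does not describe the chip's actual tiling strategy, so everything below is an assumption for illustration: the function names `tile_feature_map` and `assign_round_robin`, the 32x32 tile size, the 16-PE count, and the round-robin assignment policy are all hypothetical and do not describe the hardware co-processor.

```python
def tile_feature_map(height, width, tile_h, tile_w):
    """Split a height x width feature map into (row, col, h, w) tiles.

    Edge tiles are clipped so the tiles cover the map exactly once.
    """
    if min(height, width, tile_h, tile_w) <= 0:
        raise ValueError("all dimensions must be positive")
    tiles = []
    for row in range(0, height, tile_h):
        for col in range(0, width, tile_w):
            tiles.append(
                (row, col, min(tile_h, height - row), min(tile_w, width - col))
            )
    return tiles


def assign_round_robin(tiles, num_pes):
    """Distribute tiles over PEs in round-robin order (a hypothetical policy)."""
    if num_pes <= 0:
        raise ValueError("num_pes must be positive")
    queues = [[] for _ in range(num_pes)]
    for index, tile in enumerate(tiles):
        queues[index % num_pes].append(tile)
    return queues


if __name__ == "__main__":
    # VGA output is 640 x 480 pixels; tile size and PE count are made up.
    tiles = tile_feature_map(height=480, width=640, tile_h=32, tile_w=32)
    queues = assign_round_robin(tiles, num_pes=16)
    print(len(tiles), [len(q) for q in queues])
```

Precomputing the tile list once and handing each entry straight to a PE queue is the software analogue of the abstract's point that direct sub-job assignment reduces task-assignment time.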
Plant diseases cause large crop losses and have negative economic effects, which makes them a serious danger to the world's food security. Early and accurate disease diagnosis is essential for efficient dise...
In machine/computer vision, cameras play a major role in image acquisition. Surveillance scenarios typically rely on Closed-Circuit Television (CCTV) cameras. This study evaluates industrial cameras within a surveillance application, contrasting their performance with that of CCTV cameras. We present a comparative analysis of CCTV and industrial cameras for vehicle attribute recognition, specifically the recognition of vehicle color and model using deep learning techniques. To train and evaluate the models, we created datasets from images captured by both a CCTV camera and an industrial camera. Our findings indicate that the industrial camera outperforms the CCTV camera. However, advanced processing algorithms have the potential to narrow the performance gap between the two cameras. Our research represents one of the initial comparative analyses between these camera types, offering valuable guidance in selecting the most suitable camera for specific applications.
With the recent advancements in deep learning techniques, application areas for unstructured data analytics are emerging in multiple domains. One of the popular applications is analyzing unstructured image data. Mu...
ISBN: (print) 9798350368062
Deep learning has witnessed pervasive deployment on edge devices over the past decade, especially for computer vision applications. However, its vulnerability to adversarial attacks, where visually imperceptible patterns cause machine learning models to malfunction, has raised significant security concerns. The connection between the image sensor and the application processor is typically neither encrypted nor signed for data integrity verification, leaving the data link exposed to tampering threats. Previous works have demonstrated how these threats can be exploited. However, these methods typically inject attack patterns into the RAW image data without considering the effect of the image signal processing (ISP) pipeline, which can undesirably weaken the adversarial effects. So far, such attacks have not succeeded in the more powerful targeted misclassification attacks, where a selected target is misclassified into the attacker's intended output. In this work, we propose a novel RAW image domain black-box attack that incorporates a differentiable ISP to train a knowledge-distilled substitute classifier, which generates adaptive adversarial perturbations that survive the ISP. We show that such an attack is feasible by attacking edge implementations of ResNet18 and MobileNetV2 with adversarial examples generated from their knowledge-distilled models, applying the differentiable ISP to RAW-formatted GTSRB test images captured by a Raspberry Pi camera. Our results demonstrate that its attack success rate surpasses previous direct mapping techniques by 10.37% and 13.07% for ResNet18 and MobileNetV2, respectively, in untargeted misclassification attack tasks, with greater stealthiness when the adversarial examples are displayed on an LCD monitor for comparison. More importantly, it achieves targeted misclassification at attack success rates of 95.09% and 51.98%, respectively, which is currently impossible with existing camera-link attack methods. The results are r