Authors:
Wanjari, Ketan; Verma, Prateek
Department of Computer Science and Engineering, Faculty of Engineering and Technology, Maharashtra, Wardha 442001, India
Department of Artificial Intelligence and Data Science, Faculty of Engineering and Technology, Maharashtra, Wardha 442001, India
Modern image recognition has improved dramatically thanks to the combined use of machine learning and deep learning algorithms. This study investigates CNNs and SVMs for recognition enhancement while reviewing image...
This paper is based on the application of lossless compression algorithms in image compression, aiming to solve problems such as insufficient storage space, low transmission efficiency, and heavy data processing bur...
Medical imaging is crucial for heart diagnosis, but outdated algorithms and hardware result in delayed processing and low accuracy. Using NVIDIA Clara, a GPU-accelerated platform, the study proposes real-time cardiac ...
ISBN:
(Print) 9798350383638; 9798350383645
The field of artificial intelligence (AI) holds a variety of algorithms designed with the goal of achieving high accuracy at low computational cost and latency. One popular algorithm is the vision transformer (ViT), which excels at various computer vision tasks for its ability to capture long-range dependencies effectively. This paper analyzes a computing paradigm, namely, spatial transformer networks (STN), in terms of accuracy and hardware complexity for image classification tasks. The paper reveals that for 2D applications, such as image recognition and classification, STN is a great backbone for AI algorithms for its efficiency and fast inference time. This framework offers a promising solution for efficient and accurate AI for resource-constrained Internet of Things (IoT) and edge devices. The comparative analysis of STN implementations on the central processing unit (CPU), Raspberry Pi (RPi), and Resistive Random Access Memory (RRAM) architectures reveals nuanced performance variations, providing valuable insights into their respective computational efficiency and energy utilization.
ISBN:
(Print) 9798350350708; 9798350350715
Many institutions have recently embraced biometric security solutions, utilizing biological measurements to safeguard against fraudulent activities, theft, and various security threats. Face recognition technology holds a pivotal role within biometric security systems, serving purposes such as authentication, monitoring, individual identification, and identity verification. This article examines facial recognition systems grounded in deep learning. This focus arises from the intricate nature of the process and the numerous hurdles and variables that impact algorithm performance. The objective is to illuminate the foremost challenges that real-world systems encounter, which are often overlooked in previous research. Additionally, under these challenges, the article conducts a comparative analysis of the performance of prominent facial recognition algorithms, namely VGGFace, FaceNet, and ArcFace. This approach will allow readers to make informed choices when selecting the most suitable algorithms for specific applications.
ISBN:
(Print) 9798350383638; 9798350383645
This paper addresses the growing interest in deploying deep learning models directly in-sensor. We present "Q-Segment", a quantized real-time segmentation algorithm, and conduct a comprehensive evaluation on a low-power edge vision platform with an in-sensor processor, the Sony IMX500. One of the main goals of the model is to achieve end-to-end image segmentation for vessel-based medical diagnosis. Deployed on the IMX500 platform, Q-Segment achieves an ultra-low in-sensor inference time of only 0.23 ms and a power consumption of only 72 mW. We compare the proposed network with state-of-the-art models, both float and quantized, demonstrating that the proposed solution outperforms existing networks on various platforms in computing efficiency, e.g., by a factor of 75x compared to ERFNet. The network employs an encoder-decoder structure with skip connections, and results in a binary accuracy of 97.25% and an Area Under the Receiver Operating Characteristic Curve (AUC) of 96.97% on the CHASE dataset. We also present a comparison of the IMX500 processing core with the Sony Spresense, a low-power multi-core ARM Cortex-M microcontroller, and a single-core ARM Cortex-M4, showing that it can achieve in-sensor processing with end-to-end low latency (17 ms) and power consumption (254 mW). This research contributes valuable insights into edge-based image segmentation, laying the foundation for efficient algorithms tailored to low-power environments.
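The latency and power figures quoted in this abstract can be combined into a rough energy estimate. The following is a minimal sketch, assuming the reported power is drawn constantly for the full reported latency (the abstract does not state this); the helper name `energy_joules` is hypothetical and not taken from the paper.

```python
def energy_joules(power_mw: float, latency_ms: float) -> float:
    """Energy in joules, ASSUMING constant power over the whole latency."""
    return (power_mw * 1e-3) * (latency_ms * 1e-3)


# Figures as quoted in the abstract.
in_sensor = energy_joules(72, 0.23)    # in-sensor inference only
end_to_end = energy_joules(254, 17)    # end-to-end processing

print(f"in-sensor:  {in_sensor * 1e6:.2f} uJ per inference")
print(f"end-to-end: {end_to_end * 1e3:.2f} mJ per frame")
```

Under that assumption, the estimates come to roughly 16.6 µJ per in-sensor inference and about 4.3 mJ per end-to-end frame. Real energy use depends on duty cycle and idle power, which the abstract does not report.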
ISBN:
(Print) 9798350349405; 9798350349399
Recent advancements in artificial intelligence algorithms for medical imaging show significant potential in automating the detection of lung infections from chest radiograph scans. However, current approaches often focus solely on either 2-D or 3-D scans, failing to leverage the combined advantages of both modalities. Moreover, conventional slice-based methods place a manual burden on radiologists for slice selection. To overcome these challenges, we propose the Recurrent 3-D Multi-level Vision Transformer (R3DM-ViT) model, capable of handling multimodal data to enhance diagnostic accuracy. Our quantitative evaluations demonstrate that R3DM-ViT surpasses existing methods, achieving an accuracy of 96.67%, F1-score of 96.88%, mean average precision of 96.75%, and mean average recall of 97.02%. This research marks a significant stride forward in the automated detection of lung infections through multimodal imaging.
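As a quick consistency check on the metrics quoted in this abstract, the F1-score can be compared with the harmonic mean of the reported precision and recall. This is a minimal sketch, assuming the reported "mean average precision" and "mean average recall" act as the precision and recall in the standard F1 definition; the abstract does not say how its F1-score was computed, and the helper name `f1_from` is hypothetical.

```python
def f1_from(precision: float, recall: float) -> float:
    """Standard F1: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Percentages as quoted in the abstract (ASSUMED to be the F1 inputs).
f1 = f1_from(96.75, 97.02)
print(round(f1, 2))  # 96.88
```

The result matches the reported 96.88% under this assumption. An F1-score averaged per class would not in general equal this value, so the agreement is suggestive rather than conclusive.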
Hyperspectral image unmixing estimates a collection of constituent materials (called endmembers) and their corresponding proportions (called abundances), which is a critical preprocessing step in many remote sensing a...
Image processing is a technique that involves the manipulation and enhancement of digital images. It encompasses various aspects such as image acquisition, image preprocessing, image enhancement, image segmentation, f...
This work compares two different approaches to image processing algorithm implementation on Zynq Zybo and ZedBoard Field Programmable Gate Array (FPGA) boards. There are three main phases for the study, namely, Hardwa...