ISBN (digital): 9781665453448
ISBN (print): 9781665453448
Edge appliances built with machine learning applications have been gradually adopted in a wide variety of application fields, such as intelligent transportation, the banking industry, and medical diagnosis. Privacy-preserving computation approaches can be used on smart appliances to secure the privacy of sensitive data, including application data and the parameters of machine learning models. Nevertheless, data privacy is achieved at the cost of execution time; that is, the execution speed of a secure machine learning application is several orders of magnitude slower than that of the same application in plaintext. The performance gap is even larger on edge appliances. In this work, to improve the execution efficiency of secure applications, we target CrypTen, an open-source software framework that is widely used for building secure machine learning applications with the Secure Multi-Party Computation (SMPC) based privacy-preserving computation approach. We analyze the performance characteristics of secure machine learning applications built with CrypTen, and the analysis reveals that communication overhead hinders the execution of the secure applications. To tackle the issue, a communication library, OpenMPI, is added to the CrypTen framework as a new communication backend, boosting application performance by up to 50%. We further develop a hybrid communication scheme by combining the OpenMPI backend with the original communication backend of the CrypTen framework. The experimental results show that, compared to the original CrypTen framework, the enhanced CrypTen framework provides better performance for small-size data (up to 50% speedup for LeNet5 on the MNIST dataset) and maintains similar performance for large-size data (AlexNet on CIFAR-10).
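To illustrate why communication can dominate the cost of SMPC-based inference, the following is a minimal sketch of additive secret sharing, assuming a 64-bit ring, non-negative integer inputs, and a trusted dealer that hands out Beaver triples. It is not taken from the paper or from CrypTen's code; the function names and the `stats` counter are hypothetical and exist only for this example. The point it shows is that addition is local, while every multiplication requires the parties to exchange masked values.

```python
import secrets

MODULUS = 2 ** 64  # ring size; an assumption made for this sketch


def share(value, n_parties=2):
    """Split value into n additive shares that sum to value mod MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def reconstruct(shares):
    """Recover the secret by summing all shares."""
    return sum(shares) % MODULUS


def add_shared(x_shares, y_shares):
    """Addition is local: each party adds its own shares, no messages sent."""
    return [(x + y) % MODULUS for x, y in zip(x_shares, y_shares)]


def mul_shared(x_shares, y_shares, stats):
    """Multiply two shared values with a Beaver triple (a, b, c = a * b).

    The triple is generated here by a simulated trusted dealer.
    """
    n = len(x_shares)
    a = secrets.randbelow(MODULUS)
    b = secrets.randbelow(MODULUS)
    a_sh = share(a, n)
    b_sh = share(b, n)
    c_sh = share(a * b % MODULUS, n)

    # Each party masks its input shares locally.
    eps_sh = [(x - ai) % MODULUS for x, ai in zip(x_shares, a_sh)]
    delta_sh = [(y - bi) % MODULUS for y, bi in zip(y_shares, b_sh)]

    # Opening eps and delta is the communication step: every party sends
    # its two masked shares to every other party (one round).
    stats["rounds"] += 1
    stats["messages"] += 2 * n * (n - 1)
    eps = reconstruct(eps_sh)
    delta = reconstruct(delta_sh)

    # x * y = c + eps * b + delta * a + eps * delta, computed share-wise.
    z = [(ci + eps * bi + delta * ai) % MODULUS
         for ci, ai, bi in zip(c_sh, a_sh, b_sh)]
    z[0] = (z[0] + eps * delta) % MODULUS  # public term added by one party
    return z


if __name__ == "__main__":
    stats = {"rounds": 0, "messages": 0}
    x, y = share(6), share(7)
    print(reconstruct(add_shared(x, y)))         # sum of the two secrets
    print(reconstruct(mul_shared(x, y, stats)))  # product of the two secrets
    print(stats)                                 # communication incurred
```

In a neural network, each layer performs many such multiplications, so the number of communication rounds and the per-message latency of the backend directly affect end-to-end time, which is consistent with the bottleneck the abstract reports.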
The paper addresses several important issues regarding the performance of non-orthogonal multiple access systems in terms of sum rate capability with numerous users divided among several clusters sharing the same chan...
Streams for video conferencing and interactive streaming applications. By exploiting the spatial and temporal redundancies of stereoscopic video sequences, the proposed technique leverages GPUs to technique and ...
Execution of resource-intensive tasks, such as artificial intelligence (AI), big-data algorithms, video processing, etc., is a common requirement in distributed embedded systems today. The typical solution is to execut...
The SSE-YOLO model, optimized for object detection, has exhibited exceptional effectiveness in fire detection tasks, particularly within environments where computational resources are limited, such as embedded devices...
In the world of real-time systems (RTS), security has often been overlooked in the design process. However, with the emergence of the Internet of Things and Cyber-Physical Systems, RTS are now frequently used in inter...
ISBN (print): 9783031821523; 9783031821530
Identifying and locating objects in images and videos, including elements like traffic signs, vehicles, buildings, and people, constitutes a fundamental and demanding task in computer vision, known as object detection. Due to the high computing complexity of this technique and the large amount of data carried by the video signal, it is nearly impossible for ordinary general-purpose processors (GPPs) or CPUs to run these techniques in real time, especially for embedded systems applications. Therefore, special hardware that can acquire, control, or execute in parallel is required. These specialized hardware systems include Digital Signal Processors (DSPs), Field Programmable Gate Arrays (FPGAs), Visual Processing Units (VPUs), Tensor Processing Units (TPUs), Neural Processing Units (NPUs), and Graphics Processing Units (GPUs). This work presents the benefits of accelerating traditional object detection methods on a high-end embedded system, the Jetson Nano Developer Kit. This single-board computer is equipped with the Tegra K1 System on Chip (SoC), which is composed of a quad-core ARM A15 and 192 cores of Kepler-embedded GPU. Computing acceleration was ensured via the use of the CUDA OpenCV library for both the Histogram of Oriented Gradients (HOG) and the Haar Cascade Classifier. For VGA resolution, results reveal that the GPU implementation on this embedded system is 1.4x faster than the CPU for the HOG method and 2x faster for the Haar Cascade Classifier method.
There has been a total transformation in industrial environments as a result of the implementation of intelligent Internet of Things (IoT) technologies, which have enabled better resource allocation, predictive mainte...
ISBN (digital): 9781665468220
ISBN (print): 9781665468220
The rate of progress in the field of Artificial Intelligence (AI) and Machine Learning (ML) has significantly increased over the past ten years and continues to accelerate. In that time, AI has made the leap from research case studies to real production-ready applications. The significance of this growth should not be underestimated, as it has changed the very nature of computing. Conventional platforms struggle to achieve greater performance and efficiency, which causes a surging demand for innovative AI accelerators, specialized platforms, and purpose-built compute. At the same time, solutions are required for assessing ML platform performance in a reproducible and unbiased manner, so that different products can be compared fairly. This is especially true for Human System Interaction (HSI) systems, which require specific data handling for low-latency responses in emergency situations or to improve user experience, as well as for preserving data privacy and security by processing data locally. Taking this into account, this work presents a comprehensive guideline on preferred benchmarking criteria for the evaluation of ML platforms, which include both lower-level analysis of ML models and system-level evaluation of the entire pipeline. In addition, we propose a Systematic Taxonomy of Embedded Platforms (STEP) that can be used by the community and customers to select specific ML hardware consistent with their needs and to better design ML-based HSI solutions.
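As a rough illustration of what reproducible latency measurement can look like in practice, the sketch below times a callable with warm-up runs and reports order statistics instead of a single number. It is not the benchmarking procedure proposed in the paper; the `benchmark` helper, its parameters, and the `dummy_inference` workload are hypothetical stand-ins chosen for this example.

```python
import statistics
import time


def benchmark(fn, warmup=3, repeats=20):
    """Time fn() and return summary statistics in seconds.

    Warm-up runs are discarded so that one-off costs (caches, lazy
    initialization) do not distort the reported latency.
    """
    for _ in range(warmup):
        fn()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)

    samples.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the list.
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "runs": repeats,
        "median_s": statistics.median(samples),
        "p95_s": samples[p95_index],
        "min_s": samples[0],
        "max_s": samples[-1],
    }


def dummy_inference():
    """Hypothetical stand-in for a model's forward pass."""
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    print(benchmark(dummy_inference))
```

Reporting the median together with a tail percentile matters for the low-latency HSI scenarios the abstract mentions, where occasional slow responses can be more important than the average case.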
The emerging field of ubiquitous computing leads to new application and processing speed requirements in various everyday aspects. When it comes to image processing tasks, which require many computations, the limited...