ISBN: (print) 9783031483479
The proceedings contain 15 papers. The special focus in this conference is on Advances in Mobile Computing and Multimedia Intelligence. The topics include: Face to Face with Efficiency: Real-Time Face Recognition Pipelines on Embedded Devices; Multi-Camera Live Video Streaming over Wireless Network; Effects of Deep Generative AutoEncoder Based Image Compression on Face Attribute Recognition: A Comprehensive Study; Implementation of a Video Game Controlled by Pressing the Upper Arm Using PPG Sensor; Immerscape: Supporting the Creation of Immersive Soundscapes by Users in Cultural Heritage Contexts; Analysis of Data Obtained from the Mobile Botnet; On the Impact of FFP2 Face Masks on Speaker Verification for Mobile Device Authentication; Blockchain-Enhanced IoHT: A Patient-Centric Internet of Healthcare Things Platform with Smart Contract-Driven Data Management; Federated Learning for Collaborative Cybersecurity of Distributed Healthcare; Does Use of Blink Interface Affect Number of Blinks When Reading Paper Books?; A Method for Stimuli Control of Carbonated Beverages by Estimating and Reducing Carbonation Level; A Knee Injury Prevention System by Continuous Knee Angle Recognition Using Stretch Sensors; Ubiquitous Mobile Application for Conducting Occupational Therapy in Children with ADHD.
ISBN: (print) 9798350387964; 9798350387957
Advancements in autonomous driving technologies leverage real-time computing and embedded systems to enable vehicles to make quick decisions based on dynamic road conditions. These increasingly complex systems face rising computational demands. This study introduces a preliminary approach to parallelizing autonomous driving applications using a high-performance many-core processor. By distributing tasks across multiple cores, the approach enhances concurrent execution, reduces conflicts, and minimizes resource contention, improving the efficiency and performance of autonomous driving systems.
ISBN: (print) 9798350387964; 9798350387957
In response to the increasing complexity of real-time embedded software, driven by the need for intelligent computation in constrained environments, multicore architectures have emerged as a promising solution. The challenge lies in choosing an effective mapping of the real-time tasks to the various computational resources of these embedded boards while ensuring that real-time constraints are satisfied. To address this problem, our approach rests on two pillars. The first is a domain-specific language designed to capture hardware and software characteristics, constraints, and criteria in a clear and unambiguous manner. The second is a solving method based on a Satisfiability Modulo Theories (SMT) solver augmented with a lazy theory to handle real-time aspects. This method allows us to synthesize mappings that respect temporal constraints and optimize specific criteria, such as the power consumption of the embedded board.
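The mapping problem described above can be illustrated with a minimal sketch. It is not the authors' method: it swaps the SMT solver for exhaustive search, assumes partitioned EDF with implicit deadlines (so a core is schedulable exactly when its utilization is at most 1), and uses a hypothetical task set, core list, and linear power model invented for illustration.

```python
from itertools import product

# Hypothetical task set: (name, worst-case execution time, period).
# Deadlines are assumed equal to periods.
TASKS = [("a", 5, 10), ("b", 4, 10), ("c", 3, 10)]
# Hypothetical cores: (name, power cost per unit of utilization).
CORES = [("big", 3.0), ("little", 1.0)]


def best_mapping(tasks, cores):
    """Return (cost, mapping) with minimal power cost, or None if infeasible."""
    best = None
    # Enumerate every assignment of tasks to core indices.
    for assignment in product(range(len(cores)), repeat=len(tasks)):
        util = [0.0] * len(cores)
        for (_, wcet, period), core in zip(tasks, assignment):
            util[core] += wcet / period
        # Under the EDF assumption, a core is schedulable if and only if
        # its total utilization does not exceed 1.
        if any(u > 1.0 for u in util):
            continue
        cost = sum(u * power for u, (_, power) in zip(util, cores))
        if best is None or cost < best[0]:
            mapping = {t[0]: cores[c][0] for t, c in zip(tasks, assignment)}
            best = (cost, mapping)
    return best


if __name__ == "__main__":
    print(best_mapping(TASKS, CORES))
```

Exhaustive search grows exponentially with the number of tasks, which is one reason a solver-based approach such as the one in the abstract is attractive for realistic task sets.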
ISBN: (print) 9798350387964; 9798350387957
To enhance the performance of autonomous driving, recent studies have been incorporating various tasks that require increasingly more computation. As computational demands increase, it is often difficult to achieve timely execution with the limited performance of onboard computing units alone. To address this issue, Vehicle Edge Computing (VEC), which offloads computational workloads to the edge and returns the results to the vehicle, is gaining significant attention. To achieve efficient offloaded analytics via VEC, it is crucial to consider both the computing and network conditions of the V2X systems, as well as vehicle energy consumption and timely execution. However, current studies have not sufficiently addressed the comprehensive modeling of computational and network loads in these V2X systems. To address this gap, we propose a Cooperative Network-Computation Load Balancing Simulator for VEC.
ISBN: (print) 9798350387964; 9798350387957
Cache locking is a commonly used mechanism to improve both performance and predictability for embedded programs. Dynamic cache locking methods proposed in the literature, where the locked content is modified during execution, require inserting locking and unlocking instructions in the program's code. In this paper, we introduce a novel hardware mechanism that leverages the LRU age bits to perform duration-based locking. Our proposed mechanism dynamically locks and unlocks cache lines for different durations at run-time, without the need to modify the program's code. We further devise a heuristic that analyzes a program's loop structure and selects the set of addresses to be locked in an L1 instruction cache alongside their locking durations. Evaluation results show that our duration-based locking mechanism achieves comparable results to the dynamic approach while substantially reducing the initialization overhead and avoiding program code modifications.
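To make the idea of duration-based locking concrete, here is a small software model. It is a sketch under stated assumptions, not the hardware mechanism from the paper: it simulates a single cache set, measures lock durations in number of accesses to that set, re-arms a lock each time the locked address is accessed, and bypasses the cache when every line is locked. The class `DurationLockedSet` and the helper `count_hits` are hypothetical names.

```python
class DurationLockedSet:
    """One cache set with LRU replacement and per-line lock durations."""

    def __init__(self, ways, lock_durations):
        self.ways = ways
        self.lock_durations = lock_durations  # address -> accesses to stay locked
        self.lines = []       # cached addresses, least recently used first
        self.remaining = {}   # address -> remaining locked accesses

    def access(self, addr):
        """Access addr; return True on a hit, False on a miss."""
        # Every access ages all active locks by one step.
        for locked in list(self.remaining):
            self.remaining[locked] -= 1
            if self.remaining[locked] <= 0:
                del self.remaining[locked]
        hit = addr in self.lines
        if hit:
            self.lines.remove(addr)
        elif len(self.lines) >= self.ways:
            # Evict the least recently used line that is not locked.
            victim = next((a for a in self.lines if a not in self.remaining), None)
            if victim is None:
                return False  # every line is locked: bypass the cache
            self.lines.remove(victim)
        self.lines.append(addr)  # move to the most recently used position
        if addr in self.lock_durations:
            self.remaining[addr] = self.lock_durations[addr]
        return hit


def count_hits(trace, ways, lock_durations):
    """Replay a trace of addresses and count cache hits."""
    cache = DurationLockedSet(ways, lock_durations)
    return sum(cache.access(addr) for addr in trace)


if __name__ == "__main__":
    trace = "ABCABCABC"  # a loop over three addresses in a 2-way set
    print(count_hits(trace, 2, {}))        # plain LRU
    print(count_hits(trace, 2, {"A": 3}))  # A locked for 3 accesses
```

In this toy trace, plain LRU thrashes on the three-address loop, while keeping one address locked across the loop body lets it hit on each later iteration, without any lock or unlock instruction appearing in the trace itself.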
ISBN: (print) 9798350387964; 9798350387957
This paper presents a real-time embedded thermal imaging system architecture for compact, energy-efficient, high-quality imaging utilizing a heterogeneous system-on-chip (SoC) and uncooled infrared focal plane arrays (IRFPAs). Unlike previous systems that used separate devices for complex image processing, our system provides integrated image processing support for robust sensor-to-surveillance. The image processing is organized into two algorithm stacks: a non-uniformity correction stack to mitigate the distinctive noise vulnerabilities of uncooled IRFPAs, and an image enhancement stack including contrast enhancement and temporal noise filters. We optimized these algorithms for domain-specific factors, including asymmetric multiprocessing (AMP), cache organization, single instruction multiple data (SIMD) instructions, and very long instruction word (VLIW) architectures. The implementation on the TI TDA3x SoC demonstrates that our system can process 640x480, 60 frames per second (FPS) videos at a peak core load of 57.5% while consuming less than 2.2 W for the entire system, indicating that processing 1280x1024, 30 FPS videos from cutting-edge uncooled IRFPAs is possible. Additionally, our system improves power efficiency by 9.42% and 9.96% at 30 and 60 FPS, respectively, compared to the state of the art when executing similar image processing algorithms.
ISBN: (print) 9798350387964; 9798350387957
By relying on ambient energy, battery-less devices significantly increase the autonomy of IoT devices, enabling maintenance-free operation in remote locations. However, due to the scarcity of ambient energy, these devices rely on capacitors to buffer energy, and alternate between power-off phases, where the device is harvesting energy, and computation bursts. In most existing techniques, the device resumes execution only when the capacitor is full. However, we argue that doing so is sub-optimal. Instead, we advocate that waking up the device sooner may yield better performance, since the microcontroller consumes less power when operating at lower voltage. To this end, we introduce EARLYBIRD, a technique that automatically computes a fine-tuned wake-up voltage for each resume point. EARLYBIRD leverages static analysis to determine how much energy is needed before resuming from a given program location, and provides a runtime library to enforce the early wake-up strategy. We evaluated how EARLYBIRD improves existing checkpointing techniques, and results show an increase in the number of benchmarks executed per minute of up to 5.65x.
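The relationship between required energy and wake-up voltage can be sketched with the standard capacitor energy formula E = C V^2 / 2. The following is a simplified illustration, not EARLYBIRD's actual analysis: it assumes an ideal capacitor, no harvesting during execution, and a fixed brown-out voltage; the function name and all numeric values are hypothetical.

```python
import math


def wake_up_voltage(energy_j, capacitance_f, v_min, v_max):
    """Lowest capacitor voltage at which a code segment can safely resume.

    The usable energy between the wake-up voltage V and the brown-out
    voltage v_min is C * (V**2 - v_min**2) / 2. Solving for V gives the
    expression below. The result is capped at v_max, the voltage of a
    full capacitor, for segments that need a full charge or more.
    """
    v = math.sqrt(v_min ** 2 + 2.0 * energy_j / capacitance_f)
    return min(v, v_max)


if __name__ == "__main__":
    # Hypothetical values: 100 uF capacitor, 1.8 V brown-out, 3.3 V when full,
    # and a code segment estimated to need 150.5 uJ.
    print(wake_up_voltage(150.5e-6, 100e-6, 1.8, 3.3))
```

A segment that needs little energy gets a wake-up voltage well below the full-capacitor voltage, which is the situation in which resuming early, as the abstract argues, can pay off.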
ISBN: (print) 9798350387964; 9798350387957
Out-of-distribution (OOD) detectors can act as safety monitors in embedded cyber-physical systems by identifying samples outside a machine learning model's training distribution to prevent potentially unsafe actions. However, OOD detectors are often implemented using deep neural networks, which makes it difficult to meet real-time deadlines on embedded systems with memory and power constraints. We consider the class of variational autoencoder (VAE) based OOD detectors where OOD detection is performed in latent space, and apply quantization, pruning, and knowledge distillation. These techniques have been explored for other deep models, but no work has considered their combined effect on latent space OOD detection. While these techniques increase the VAE's test loss, this does not correspond to a proportional decrease in OOD detection performance and we leverage this to develop lean OOD detectors capable of real-time inference on embedded CPUs and GPUs. We propose a design methodology that combines all three compression techniques and yields a significant decrease in memory and execution time while maintaining AUROC for a given OOD detector. We demonstrate this methodology with two existing OOD detectors on a Jetson Nano and reduce GPU and CPU inference time by 20% and 28% respectively while keeping AUROC within 5% of the baseline.
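Since the abstract reports detector quality as AUROC, a short sketch of how that metric can be computed from raw detector scores may help. This is a generic pairwise formulation, not code from the paper; it assumes that higher scores mean "more likely OOD", and the function name and sample scores are hypothetical.

```python
def auroc(in_dist_scores, ood_scores):
    """Area under the ROC curve for an OOD detector.

    Equals the probability that a randomly chosen OOD sample receives a
    higher score than a randomly chosen in-distribution sample, with
    ties counted as one half.
    """
    wins = 0.0
    for ood in ood_scores:
        for ind in in_dist_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(ood_scores) * len(in_dist_scores))


if __name__ == "__main__":
    # Hypothetical scores before and after compressing a detector.
    baseline = auroc([0.1, 0.2, 0.3], [0.7, 0.8, 0.9])
    compressed = auroc([0.1, 0.2, 0.6], [0.5, 0.8, 0.9])
    print(baseline, compressed)
```

Comparing the metric before and after quantization, pruning, or distillation is one way to check how much detection quality a compression step costs.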
ISBN: (print) 9798350384406
Approximate computing is a promising paradigm for improving the power efficiency of electronic systems for error-tolerant applications such as multimedia processing, image multiplication, and machine learning. In approximate computing, the accuracy of the computation is intentionally sacrificed to achieve lower power consumption and/or area. This study proposes a hybrid CMOS-memristor circuit design for approximate computing. The proposed design combines the advantages of CMOS technology, such as high scalability and flexibility, with the advantages of memristor technology, such as low power consumption and high density. The study first presents the design of fundamental logic gates using the hybrid CMOS-memristor approach. It then presents the design of six different 4-2 compressors with varying accuracy-performance tradeoffs. Finally, it presents the design of an 8x8 multiplier using the implemented compressors. The results show that using the hybrid CMOS-memristor circuit design achieves significant improvements in power efficiency over traditional CMOS designs. For example, the 8x8 multiplier design achieves up to 88% lower power consumption and uses up to 50% fewer transistors than a traditional CMOS design. The study also evaluated the proposed design for two neural network applications (LeNet5 and ResNet18) and three image processing applications. The results show that the proposed design can achieve significant improvements in power efficiency for these applications as well.
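The accuracy side of the accuracy-performance tradeoff in a 4-2 compressor can be shown at the logic level. The sketch below contrasts an exact compressor built from two full adders with one hypothetical approximate variant that drops the carry-in and carry-out signals. The approximate equations are invented for illustration and are not one of the six designs in the study; the circuit-level CMOS-memristor aspects are not modeled.

```python
from itertools import product


def full_adder(a, b, c):
    """Return (sum, carry) of three input bits."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def exact_compressor(x1, x2, x3, x4, cin):
    """Exact 4-2 compressor: inputs add up to total + 2 * (carry + cout)."""
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout


def approx_compressor(x1, x2, x3, x4):
    """Hypothetical approximate compressor without cin and cout."""
    return (x1 ^ x2) | (x3 ^ x4), (x1 & x2) | (x3 & x4)


def approx_error_stats():
    """Return (number of wrong input patterns, mean error distance)."""
    errors, distance = 0, 0
    for bits in product((0, 1), repeat=4):
        total, carry = approx_compressor(*bits)
        diff = abs(sum(bits) - (total + 2 * carry))
        errors += diff != 0
        distance += diff
    return errors, distance / 16


if __name__ == "__main__":
    print(approx_error_stats())
```

Enumerating all input patterns like this gives the error rate and mean error distance of a candidate design, the kind of figures typically weighed against power and transistor count when choosing a compressor for a multiplier.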
ISBN: (print) 9798350349719
The proceedings contain 73 papers. The topics discussed include: leveraging ensemble learning for dry beans classification; portable edge computing system analysis for real-time detection and classification of wheat yellow rust infection types using embedded AI; NanoCNN: a parameters efficient network for traffic sign recognition; a heuristic multimedia verticals aggregated search approach and user behavioral analysis; intelligent diagnosis of gasoline engine faults using acoustic features; modeling and control of boost converter: switched, averaged, and state-space comparison; harnessing long short term memory networks for stock market forecasts; enhancing misogyny detection in bilingual texts using FastText and explainable AI; and detection of ophthalmic disorders through deep feature fusion.