Rehabilitation is a crucial process for patients suffering from motor disorders. The current practice is to perform rehabilitation exercises under the supervision of a clinical expert. New approaches are needed to allow patients to perform prescribed exercises at home and alleviate commuting requirements, expert shortages, and healthcare costs. Human joint estimation is a substantial component of these programs since it offers valuable visualization and feedback based on body movements. Camera-based systems have been popular for capturing joint motion. However, they are costly, raise serious privacy concerns, and require strict lighting and placement settings. We propose a millimeter-wave (mmWave)-based assistive rehabilitation system (MARS) for motor disorders to address these challenges. MARS provides a low-cost solution with competitive object localization and detection accuracy. It first maps the 5D time-series point cloud from the mmWave radar to a lower dimension. Then, it uses a convolutional neural network (CNN) to estimate the location of human joints. MARS can reconstruct 19 human joints and their skeleton from the point cloud generated by the mmWave radar. We evaluate MARS using ten specific rehabilitation movements performed by four human subjects involving all body parts and obtain an average mean absolute error of 5.87 cm for all joint positions. To the best of our knowledge, this is the first rehabilitation movement dataset using mmWave point clouds. MARS is evaluated on the Nvidia Jetson Xavier-NX board. Model inference takes only 64 μs and consumes 442 μJ of energy. These results demonstrate the practicality of MARS on low-power edge devices.
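The mean absolute error reported above can be illustrated with a minimal sketch. It assumes the error is averaged over the three coordinates of every joint; the abstract does not state the exact averaging, and the function name and the two-joint sample frame are hypothetical.

```python
from typing import Sequence, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) position in centimetres


def mean_absolute_error(predicted: Sequence[Joint], reference: Sequence[Joint]) -> float:
    """Average absolute per-coordinate error over all joints (assumed definition)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty joint lists of equal length")
    total = 0.0
    for p, r in zip(predicted, reference):
        # Sum the absolute error of the x, y and z coordinates of one joint.
        total += sum(abs(pc - rc) for pc, rc in zip(p, r))
    return total / (3 * len(predicted))


# Hypothetical frame with two joints instead of the 19 that MARS reconstructs.
reference = [(0.0, 0.0, 100.0), (10.0, 0.0, 150.0)]
predicted = [(3.0, 0.0, 94.0), (10.0, 6.0, 147.0)]
mae_cm = mean_absolute_error(predicted, reference)
```

The same function would apply unchanged to a full frame of 19 joints, since it only depends on the two lists having equal length.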
Flow-based microfluidic biochips can be used to perform bioassays by manipulating a large number of on-chip valves. These biochips are increasingly used today for biomolecular recognition, single-cell screening, and point-of-care disease diagnostics. Design-automation solutions for flow-based microfluidics enable the mapping and optimization of biomolecular protocols and software-based valve control. However, a key problem that has not received adequate attention is chip-to-world interfacing, which requires the use of off-chip control equipment to provide control signals for the on-chip valves. This problem is exacerbated by the increase in the number of valves as chips get more complex. To address the interfacing problem, we present an efficient pin-count minimization (synthesis) technique, referred to as Synterface, which uses on-chip microfluidic logic gates and optimization based on concepts from linear algebra. We present results to show that Synterface significantly reduces pin-count and simplifies the external interface for flow-based microfluidics.
Avionic software is subject to stringent real-time, determinism, and safety constraints. Software designers face several challenges, one of them being the interferences that appear in common situations, such as resource sharing. These interferences introduce non-determinism and delays in execution time. Cache memories are among the main interference-prone resources. In single-core processors, caches comprise multiple private levels. This breaks the isolation principle imposed by avionic standards, such as ARINC-653. This standard defines partitioned architectures where one partition should never directly interfere with another one. In cache-based architectures, one partition can modify the cache content of another partition. In this paper, we propose a method based on cache locking to reduce the non-determinism and the contention on lower-level memories while improving timing performance.
This paper describes a hardware and software codesign approach for an object detection and tracking algorithm which is based on a modified ViBe background subtraction method and scale adaptable particle filtering algorithm...
In this paper, we present our cache configuration prediction methodology offloaded to an FPGA for improved performance and hardware overhead reduction, while maintaining cache configuration predictions within 5% of the optimal energy cache configuration for application phases for the instruction and data caches.
Schedulers assign starting times to events in a system such that a set of constraints is met and system productivity is maximized. We characterize the scheduler behaviour for the case where decisions are made by comparing affine expressions of design parameters such as task workload, processing speed, robot travelling speed, or a controller's rise and settling time. Deterministic schedulers can be extended with symbolic execution to keep track of the affine conditions on the parameters for which the scheduling decisions are made. We introduce a divide-and-conquer algorithm that uses this information to determine parameter regions for which the same sequence of decisions is taken given a particular scenario. The results give designers insight into the impact of parameter changes on the performance of their system. The exploration can also be executed with the KLEE symbolic execution engine of the LLVM tool chain to extract the same results. We show that the divide-and-conquer approach provides the results much faster than the generic symbolic execution engine of KLEE. The results allow visualization of the sensitivity to all parameter combinations and therefore provide more insight into the sensitivity to parameters.
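The idea of recording the affine conditions behind each scheduling decision can be sketched as follows. This is a minimal illustration under stated assumptions, not the divide-and-conquer algorithm of the paper: the `Affine` class, the `decide` and `in_same_region` helpers, and the parameter names `w1` and `w2` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Params = Dict[str, float]


@dataclass(frozen=True)
class Affine:
    """Affine expression: const + sum(coefficient * parameter)."""
    coeffs: Tuple[Tuple[str, float], ...]
    const: float = 0.0

    def value(self, params: Params) -> float:
        return self.const + sum(c * params[name] for name, c in self.coeffs)


Condition = Tuple[Affine, Affine, bool]


def decide(a: Affine, b: Affine, params: Params, conditions: List[Condition]) -> bool:
    """Return True if option a is chosen (a <= b) and record the comparison made."""
    took_a = a.value(params) <= b.value(params)
    conditions.append((a, b, took_a))
    return took_a


def in_same_region(conditions: List[Condition], params: Params) -> bool:
    """Check whether another parameter point leads to the same sequence of decisions."""
    return all((a.value(params) <= b.value(params)) == took_a
               for a, b, took_a in conditions)


# Hypothetical finish times of two jobs as affine functions of their workloads.
finish_a = Affine((("w1", 2.0),), const=1.0)  # 2*w1 + 1
finish_b = Affine((("w2", 1.0),), const=4.0)  # w2 + 4

conditions: List[Condition] = []
first = decide(finish_a, finish_b, {"w1": 1.0, "w2": 2.0}, conditions)
same = in_same_region(conditions, {"w1": 2.0, "w2": 2.0})
different = in_same_region(conditions, {"w1": 3.0, "w2": 1.0})
```

Each recorded comparison is a half-space in parameter space, so the conjunction of all recorded conditions describes a region in which the scheduler would repeat the same decisions.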
Human activity recognition (HAR) has recently received significant attention due to its wide range of applications in health and activity monitoring. The nature of these applications requires mobile or wearable devices with limited battery capacity. User surveys show that the need for charging is one of the leading reasons for abandoning these devices. Hence, practical solutions must offer ultra-low power capabilities that enable operation on harvested energy. To address this need, we present the first fully integrated custom hardware accelerator (HAR engine) that consumes 22.4 μJ per operation using a commercial 65 nm technology. We present a complete solution that integrates all steps of HAR, i.e., reading the raw sensor data, generating features, and activity classification using a deep neural network (DNN). It achieves 95% accuracy in recognizing 8 common human activities while providing three orders of magnitude higher energy efficiency compared to existing solutions.
ISBN: 9781450369244 (print)
The proceedings contain 13 papers. The topics discussed include: deriving equations from sensor data using dimensional function synthesis; a dual-mode strategy for performance-maximization and resource-efficient CPS design; coherent extension, composition, and merging operators in contract models for system design; efficient decentralized LTL monitoring framework using tableau technique; will my program break on this faulty processor? - formal analysis of hardware fault activations in concurrent embedded software; and timing-anomaly free dynamic scheduling of conditional DAG tasks on multi-core systems.
A rapid and accurate architectural simulator is a cornerstone for an efficient design-space exploration of computing systems. In this paper, we introduce EAST-DNN, a feed-forward deep neural network, to accelerate architectural simulations. EAST-DNN achieves a speedup of more than 10^6× with an average prediction error of 4.3% over the baseline simulator. It also achieves an average of 2× better accuracy with at least 2.3× speedup compared to the state of the art.
We present a new method for deriving functions that model the relationship between multiple signals in a physical system. The method, which we call dimensional function synthesis, applies to data streams where the dimensions of the signals are known. The method comprises two phases: a compile-time synthesis phase and a subsequent calibration using sensor data. We implement dimensional function synthesis and use the implementation to demonstrate efficiently summarizing multi-modal sensor data for two physical systems using 90 laboratory experiments and 10 000 synthetic idealized measurements. We evaluate the performance of the compile-time phase of dimensional function synthesis as well as the calibration phase overhead, inference latency, and accuracy of the models our method generates. The results show that our technique can generate models in less than 300 ms on average across all the physical systems we evaluated. When calibrated with sensor data, our models outperform traditional regression and neural network models in inference accuracy in all the cases we evaluated. In addition, our models perform better in training latency (over 8660x improvement) and required arithmetic operations in inference (over 34x improvement). These significant gains are largely the result of exploiting information on the physics of signals that has hitherto been ignored.
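The two-phase structure described above can be illustrated with a small sketch of a calibration step. It assumes that knowledge of the signal dimensions fixes the functional form up to a constant; the abstract does not spell out this mechanism. The pendulum model T = C * sqrt(l / g), the function name, and the synthetic measurements are hypothetical and are not taken from the paper's experiments.

```python
import math
from typing import Sequence


def calibrate_constant(x: Sequence[float], y: Sequence[float]) -> float:
    """Least-squares fit of C in y = C * x (closed form: sum(x*y) / sum(x*x))."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


G = 9.81  # gravitational acceleration in m/s^2

# Assumed form from dimensional reasoning: period = C * sqrt(length / G).
lengths_m = [0.25, 0.5, 1.0]
features = [math.sqrt(length / G) for length in lengths_m]

# Synthetic idealized "sensor" readings generated with C = 2*pi.
periods_s = [2.0 * math.pi * f for f in features]

c_fit = calibrate_constant(features, periods_s)
```

Because only a single constant is fitted, the calibration needs one pass over the data, which is consistent in spirit with the low training latency the abstract reports.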