Text entry for emerging wireless Augmented Reality (AR) and Virtual Reality (VR) devices can be realized in many ways. One of the most promising methods, called air-writing, is based on an Inertial Measurement Unit (IMU) sensor and resembles natural handwriting. An existing Deep Neural Network (DNN) based air-writing algorithm called FAirWrite achieves state-of-the-art accuracy. However, this algorithm is optimized only for accuracy, without considering implementation constraints. The state-of-the-art implementation executes the algorithm in the cloud, which entails large communication latency, privacy issues with respect to data transfer, and the need for a reliable internet connection. A solution that tackles all three challenges is to execute the algorithm locally at the edge, in the closest proximity to the sensor. However, inference at the edge is challenging due to limited memory and computing resources, which can impact accuracy and increase latency. Additionally, battery-powered edge devices must adhere to strict power and energy consumption limits. Together, these constraints restrict the model size and computational complexity that can be deployed near the sensor. In this work, we explore the optimizations required to enable the state-of-the-art FAirWrite algorithm in a real-world deployment scenario, i.e., execution on an edge device. We perform a multi-layer design-space exploration spanning multiple levels of the design hierarchy, from algorithmic-level optimizations down to the hardware level, and considering various deployment run-times and edge platforms, including an embedded microcontroller, embedded Central Processing Unit (CPU), embedded Graphics Processing Unit (GPU), Neural Processing Unit (NPU), and Field-Programmable Gate Array (FPGA). The complexity-reduction optimizations result in a 37× smaller eFAirWrite model without any accuracy degradation. We propose and implement a custom hardware architecture of the