A web-based education management system is established to develop the education system by enhancing the quality of education and the teaching model. However, the existing resource allocation model and teaching in web-based educat...
ISBN (print): 9781538674628
The real implementation of a recurrent neural network (RNN) in a low-complexity IoT device is evaluated in order to predict the time series of power consumption in tertiary buildings. The RNN-type long short-term memory (LSTM) algorithm is adapted for a 32-bit microcontroller unit (MCU), and the backpropagation (BP) algorithm is implemented in-house. We therefore demonstrate that Intelligent IoT (IIoT) devices, such as the Espressif ESP32 MCU, not only run neural networks (NNs) but also learn on their own. The resulting IIoT architecture has been shown to operate efficiently and is compared to a traditional computer-based learning platform. The selected results confirm that stand-alone IoT devices are a truly efficient solution that adds flexibility to the architecture, reduces storage and computation costs, and is more energy-friendly. In conclusion, it is practically more efficient to exploit low-power, low-processing-time IIoT devices for our prediction use case rather than relying on server-based distributed systems.
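The paper's in-house MCU code is not reproduced in the abstract; as a rough illustration of the kind of kernel that fits a 32-bit MCU, here is a minimal single-cell LSTM forward pass in plain NumPy. The layer sizes, initialization, and the `forecast` helper are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """Minimal single-layer LSTM cell, pure NumPy.

    A sketch of what an LSTM adapted for a small MCU might look like;
    shapes and weight init here are illustrative assumptions."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix for the input, forget, cell, and output gates.
        self.W = rng.normal(0.0, 0.1, (4 * n_hidden, n_in + n_hidden))
        self.b = np.zeros(4 * n_hidden)
        self.n_hidden = n_hidden

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)   # cell-state update
        h = o * np.tanh(c)           # hidden state
        return h, c

    def forecast(self, series):
        """Run the cell over a 1-D power-consumption series;
        return the final hidden state (input to a readout layer)."""
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        for x in series:
            h, c = self.step(np.array([x]), h, c)
        return h
```

On a real MCU the same arithmetic would be written in C with fixed buffers; the point is that one cell step is only a few matrix-vector products, which is why both inference and BP fit on-device.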
Multi-fingered hands could be used to achieve many dexterous manipulation tasks, similarly to humans, and tactile sensing could enhance manipulation stability for a variety of objects. However, tactile sensors on multi-fingered hands come in a variety of sizes and shapes. Convolutional neural networks (CNNs) can be useful for processing tactile information, but the information from multi-fingered hands needs arbitrary pre-processing, as CNNs require a rectangularly shaped input, which may lead to unstable results. Therefore, how to process such complex-shaped tactile information and utilize it for achieving manipulation skills is still an open issue. This letter presents a control method based on a graph convolutional network (GCN), which extracts geodesical features from tactile data with complicated sensor alignments. Moreover, object property labels are provided to the GCN to adjust in-hand manipulation motions. Distributed tri-axial tactile sensors are mounted on the fingertips, finger phalanges, and palm of an Allegro hand, resulting in 1152 tactile measurements. Training data is collected with a data glove to transfer human dexterous manipulation directly to the robot hand. The GCN achieved high success rates for in-hand manipulation. We also confirmed that fragile objects were deformed less when correct object labels were provided to the GCN. By visualizing the activation of the GCN with PCA, we verified that the network acquired geodesical features. Our method achieved stable manipulation even when an experimenter pulled a grasped object, and for untrained objects.
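The key idea is that a GCN consumes the sensor layout as a graph rather than forcing the taxels into a rectangle. A minimal sketch of one graph-convolution layer (using the standard Kipf-Welling normalization, which the letter may or may not use; the toy adjacency and feature sizes below are assumptions):

```python
import numpy as np

def gcn_layer(X, A, W):
    """One graph-convolution layer: H = ReLU(D^-1/2 (A + I) D^-1/2 X W).

    X: (nodes, feat) tactile readings, A: (nodes, nodes) 0/1 adjacency
    built from the physical sensor layout, W: (feat, out) weights.
    Shapes are illustrative; this is not the letter's actual architecture."""
    A_hat = A + np.eye(A.shape[0])              # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))      # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W)

# Toy example: 4 taxels in a chain, 3 features (tri-axial force) each.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = np.random.default_rng(0).normal(size=(4, 3))
W = np.random.default_rng(1).normal(size=(3, 8))
H = gcn_layer(X, A, W)
```

Because the adjacency encodes which taxels are physical neighbors, stacked layers aggregate along the sensor surface, which is one way such a network can pick up geodesical features.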
Neural mechanisms underlying semantic processing have been extensively studied using functional magnetic resonance imaging; nevertheless, their individual differences are yet to be unveiled. To further our understanding of the functional and anatomical brain organization underlying semantic processing at the level of individual humans, we used out-of-scanner language behavioral data, T1, resting-state, and story comprehension task-evoked functional image data from the Human Connectome Project to investigate individual variability in the task-evoked semantic processing network, and attempted to predict individuals' language skills based on task and intrinsic functional connectivity of highly variable regions, employing a machine-learning framework. Our findings first confirmed that individual variability in both functional and anatomical markers was heterogeneously distributed throughout the semantic processing network, and that the variability increased towards higher levels of the processing hierarchy. Furthermore, intrinsic functional connectivities among these highly variable regions were found to contribute to predicting individual reading decoding abilities. The contributing nodes of the overall network were distributed in the left superior and inferior frontal and temporo-parietal cortices. Our results suggest that individual differences in neurobiological markers are heterogeneously distributed in the semantic processing network, and that neurobiological markers of highly variable areas are not only linked to individual variability in language skills but can also predict language skills at the individual level.
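The prediction pipeline in such studies typically vectorizes each subject's connectivity matrix and fits a regularized regression with leave-one-out cross-validation. The abstract does not specify the model, so the sketch below uses plain ridge regression on synthetic data purely to show the shape of the pipeline; all sizes and the target variable are made up.

```python
import numpy as np

def fc_features(fc):
    """Vectorize the upper triangle of a functional-connectivity matrix."""
    iu = np.triu_indices(fc.shape[0], k=1)
    return fc[iu]

def ridge_fit_predict(X_train, y_train, X_test, lam=1.0):
    """Plain ridge regression -- a simple stand-in for the paper's
    (unspecified) machine-learning framework."""
    XtX = X_train.T @ X_train + lam * np.eye(X_train.shape[1])
    w = np.linalg.solve(XtX, X_train.T @ y_train)
    return X_test @ w

# Toy leave-one-out demo: 30 synthetic "subjects", 6 regions each.
rng = np.random.default_rng(0)
fcs = rng.normal(size=(30, 6, 6))
X = np.array([fc_features(fc) for fc in fcs])    # 15 edge features per subject
y = 2.0 * X[:, 2] + 0.01 * rng.normal(size=30)   # score driven by one edge
preds = np.array([
    ridge_fit_predict(np.delete(X, i, 0), np.delete(y, i), X[i:i + 1])[0]
    for i in range(30)
])
```

Held-out prediction accuracy (e.g. the correlation between `preds` and `y`) is then the evidence that connectivity of the variable regions carries individual-level information.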
In the mobile driving scenario, insufficient data has become a major challenge for the application of scene text recognition models. An alternative to reduce the cost of data annotation is the active learning method, ...
ISBN (digital): 9781624107115
ISBN (print): 9781624107115
This paper presents a transformer neural network (TNN) model for estimating the relative pose of an autonomous unmanned aerial vehicle (UAV) with respect to a ship from monocular RGB (Red, Green, and Blue) camera images. To address the challenge of collecting rich training data on the ocean, synthetic images were generated by rendering a 3-D model of the ship with randomly distributed camera poses under various textured environments and backgrounds. These images were used to train our TNN model with the Detection Transformer (DETR) architecture, which identifies predefined keypoints of the ship in the images. Then, the relative pose with respect to the ship is determined by the perspective-n-point algorithm. The proposed model was evaluated successfully with in-situ flight experiments performed at a US Naval Academy research vessel. Compared to traditional methods, the proposed method achieved better accuracy and consistency, even in challenging scenarios with varying light conditions. In particular, the estimated 6D pose predictions exhibit remarkable accuracy, especially near the landing area, when compared with an independent position measurement from a real-time kinematic GPS.
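The second stage of the pipeline, perspective-n-point, maps the detected 2-D keypoints plus the known 3-D ship model to a relative pose. As a simplified, linear slice of that problem, the sketch below recovers only the translation while assuming the rotation is already known (the full PnP problem solves for both); the keypoints, intrinsics, and pose values are synthetic placeholders.

```python
import numpy as np

def solve_translation(pts3d, pts2d, R, fx, fy, cx, cy):
    """Recover camera translation t from 2-D detections of known 3-D
    keypoints, given rotation R. From u = fx*Xc/Zc + cx with
    Xc = R@X + t, each point yields two equations linear in t:
      -fx*tx + (u-cx)*tz = fx*(R@X)[0] - (u-cx)*(R@X)[2]
      -fy*ty + (v-cy)*tz = fy*(R@X)[1] - (v-cy)*(R@X)[2]"""
    A, b = [], []
    for X, (u, v) in zip(pts3d, pts2d):
        rX = R @ X
        A.append([-fx, 0.0, u - cx]); b.append(fx * rX[0] - (u - cx) * rX[2])
        A.append([0.0, -fy, v - cy]); b.append(fy * rX[1] - (v - cy) * rX[2])
    t, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return t

# Synthetic check: project known 3-D "ship keypoints", then recover t.
rng = np.random.default_rng(1)
pts3d = rng.uniform(-1, 1, size=(6, 3))
R = np.eye(3)
t_true = np.array([0.1, -0.2, 5.0])
fx = fy = 800.0; cx = cy = 320.0
cam = pts3d @ R.T + t_true
pts2d = np.stack([fx * cam[:, 0] / cam[:, 2] + cx,
                  fy * cam[:, 1] / cam[:, 2] + cy], axis=1)
t_est = solve_translation(pts3d, pts2d, R, fx, fy, cx, cy)
```

Production systems would instead call a full PnP solver (e.g. an iterative or EPnP-style method) that estimates R and t jointly from the DETR keypoints.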
ISBN (print): 9783031236174; 9783031236181
Deep neural networks (DNNs) are currently making their way into a broad range of applications. While until recently they were mainly executed on high-performance computers, they are now also increasingly found on hardware platforms of edge applications. In order to meet constantly changing demands, the deployment of embedded Field Programmable Gate Arrays (FPGAs) is particularly suitable. Despite the tremendous advantage of high flexibility, embedded FPGAs are usually resource-constrained, as they require more area than comparable Application-Specific Integrated Circuits (ASICs). Consequently, co-execution of a DNN on multiple platforms with dedicated partitioning is beneficial. Typical systems consist of FPGAs and Graphics Processing Units (GPUs). Combining the advantages of these platforms while keeping the communication overhead low is a promising way to meet the increasing requirements. In this paper, we present an automated approach to efficiently partition DNN inference between an embedded FPGA and a GPU-based central compute platform. Our toolchain focuses on the limited hardware resources available on the embedded FPGA and the link bandwidth required to send intermediate results to the GPU. Thereby, it automatically searches for an optimal partitioning point which maximizes hardware utilization while ensuring a low bus load. For a low-complexity DNN, we are able to identify optimal partitioning points for three different prototyping platforms. On a Xilinx ZCU104, we achieve a 50% reduction of the required link bandwidth between the FPGA and GPU compared to maximizing the number of layers executed on the embedded FPGA, while hardware utilization on the FPGA is only reduced by 7.88% and 6.38%, respectively, depending on the use of DSPs and BRAMs on the FPGA.
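The core of such a toolchain is a search over cut points of the sequential layer graph: layers before the cut run on the FPGA, the rest on the GPU, and the cut is feasible only if cumulative FPGA resources and the intermediate tensor's bus traffic both fit. A minimal sketch of that search (the resource numbers and the greedy "deepest feasible cut" policy are illustrative assumptions, not the paper's actual cost model):

```python
def find_partition_point(layer_luts, layer_dsp, out_bytes,
                         max_lut, max_dsp, max_bus_bytes):
    """Scan candidate cut points of a sequential DNN: layers [0..k) run on
    the embedded FPGA, the rest on the GPU. Return the deepest cut whose
    cumulative FPGA resources fit and whose intermediate tensor stays
    within the bus budget."""
    best, luts, dsp = 0, 0, 0
    for k in range(1, len(layer_luts) + 1):
        luts += layer_luts[k - 1]
        dsp += layer_dsp[k - 1]
        if luts > max_lut or dsp > max_dsp:
            break                          # FPGA resources exhausted
        if out_bytes[k - 1] <= max_bus_bytes:
            best = k                       # deepest feasible cut so far
    return best  # number of layers mapped to the FPGA

# Made-up 4-layer example: cutting after layer 2 fits both constraints.
cut = find_partition_point(
    layer_luts=[10_000, 20_000, 30_000, 40_000],
    layer_dsp=[8, 8, 16, 16],
    out_bytes=[100, 50, 200, 10],
    max_lut=70_000, max_dsp=100, max_bus_bytes=60)
```

This is also why cutting at a layer with a small output tensor can beat "as many layers on the FPGA as possible": the bandwidth saving can outweigh a small utilization loss.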
Mangrove forests in Indonesia are distributed over many coastal regions, exhibiting significant biodiversity and diverse structural forms. Mangroves serve as a mitigating factor against wave action and intense wind en...
ISBN (print): 9781728176499
This paper researches and analyses an effective e-commerce coordination big-data processing strategy based on an infinite-depth neural network topology. The paper first proposes a neural network training model, Neural Network-Storm (NN-S), based on the Storm streaming distributed architecture, which decomposes the neural network training task into multiple computing units using a data-parallel method; the parameters are updated synchronously after the training of a single batch of data is completed. In the Storm architecture, a ZooKeeper network is used for multi-server distributed deployment. The training results show that the NN-S model can significantly improve the training speed of neural networks. At the same time, the NN-S architecture can quickly recover from node failures and network resource scheduling anomalies, showing strong robustness. In this paper, we investigate streaming-based distributed neural network training and design a Storm-based distributed neural network training model and optimized training algorithms, which are of reference value for distributed neural network training.
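The synchronous per-batch update described above is ordinary data-parallel SGD: each worker computes a gradient on its shard, and parameters are updated once after the gradients are combined. A minimal single-process simulation (a linear least-squares model stands in for the real network, and the workers are simulated sequentially rather than run as Storm bolts):

```python
import numpy as np

def parallel_sgd_step(w, X, y, n_workers, lr=0.1):
    """One synchronous data-parallel step, in the spirit of NN-S:
    the batch is split across workers, each computes a local gradient
    for a linear model, and the parameters are updated exactly once
    after all local gradients are averaged."""
    grads = []
    for Xs, ys in zip(np.array_split(X, n_workers), np.array_split(y, n_workers)):
        err = Xs @ w - ys
        grads.append(Xs.T @ err / len(ys))   # local gradient on this shard
    return w - lr * np.mean(grads, axis=0)   # one synchronous update

# With equal shards, the averaged gradient equals the full-batch gradient.
rng = np.random.default_rng(2)
X = rng.normal(size=(8, 3)); y = rng.normal(size=8)
w0 = np.zeros(3)
w_par = parallel_sgd_step(w0, X, y, n_workers=2)
w_seq = w0 - 0.1 * X.T @ (X @ w0 - y) / 8
```

The equivalence of `w_par` and `w_seq` is what makes synchronous data parallelism attractive: the math is identical to single-node training, only the wall-clock time per batch shrinks.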
ISBN (print): 9798400701016
Neural processing units (NPUs) have become indispensable parts of mobile SoCs. Furthermore, integrating multiple NPU cores into a single chip has become a promising solution for the ever-increasing computing power demands of mobile devices. This paper addresses techniques to maximize the utilization of NPU cores and reduce the latency of on-device inference. Mobile NPUs typically have a small amount of local memory (or scratch-pad memory, SPM) that provides space only for the input/output tensors and weights of one layer operation in deep neural networks (DNNs). Even in multicore NPUs, such local memories are distributed across the cores. In such systems, executing network layer operations in parallel is the primary vehicle for achieving performance. By partitioning a layer of a DNN into multiple sub-layers, we can execute them in parallel on multicore NPUs. Within a core, we can also employ pipelined execution to reduce the execution time of a sub-layer. In this execution model, synchronizing parallel execution and loading/storing intermediate tensors in global memory are the main bottlenecks. To alleviate these problems, we propose novel optimization techniques which carefully consider partitioning direction, execution order, synchronization, and global memory access. Using six popular convolutional neural networks (CNNs), we evaluate our optimization techniques on a flagship mobile SoC with three cores. Compared to the highest-performing partitioning approach, our techniques improve performance by 23%, achieving a speedup of 2.1x over single-core systems.
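Partitioning a layer into sub-layers works because many layer operations decompose cleanly along one axis. The sketch below splits a 1x1 convolution along the output-channel direction across simulated cores and shows the result matches single-core execution; the paper's scheduling, pipelining, and synchronization machinery are not modeled, and the tensor sizes are arbitrary.

```python
import numpy as np

def conv1x1(x, w):
    """1x1 convolution: x is (C_in, H, W), w is (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def multicore_conv1x1(x, w, n_cores):
    """Partition the layer along the output-channel direction into
    n_cores sub-layers, run each 'core' on its weight slice, and
    concatenate the partial outputs -- the parallel sub-layer execution
    model described above, with cores simulated sequentially."""
    parts = [conv1x1(x, w_slice)
             for w_slice in np.array_split(w, n_cores, axis=0)]
    return np.concatenate(parts, axis=0)

rng = np.random.default_rng(3)
x = rng.normal(size=(4, 5, 5))   # one layer's input tensor
w = rng.normal(size=(6, 4))      # 6 output channels, 4 input channels
y_multi = multicore_conv1x1(x, w, n_cores=3)
y_single = conv1x1(x, w)
```

Output-channel splitting duplicates the input tensor in every core's SPM but keeps each output slice independent, which is why the choice of partitioning direction interacts with local-memory capacity and synchronization cost.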