Digital Twin (DT) has been proposed as a key enabling technology to produce a digital clone of the physical world and facilitate the convergence of the physical and virtual worlds for building a real-world Metaverse. ...
Objective: Accurate decoding of electroencephalogram (EEG) signals has become more significant for the brain-computer interface (BCI). Specifically, motor imagery and motor execution (MI/ME) tasks enable the control o...
Continuous-flow microfluidic biochips (CFMBs) have become a hot research topic in recent years due to their ability to perform biochemical assays automatically and efficiently. PathDriver+ was the first to take the requirements of actual fluid transportation into account in the design process of CFMBs: it implements actual fluid transport and removal and plans a separate flow path for each transport task, aspects that previous work had neglected. However, PathDriver+ does not take full advantage of the flexibility of CFMB routing, because it optimizes flow channel length only during global routing in the mesh model and not during detailed routing. In addition, PathDriver+ considers only the X architecture, whereas existing work shows that any-angle routing can use routing resources more efficiently and shorten the flow channel length. To address these issues, we propose a flow path-driven arbitrary-angle routing algorithm, which improves the utilization of routing resources and reduces the flow channel length while respecting the actual fluid transportation requirements. The proposed algorithm constructs a search graph based on constrained Delaunay triangulation to improve the search efficiency of routing solutions while ensuring routing quality. A Dijkstra-based flow path routing method is then applied to the constructed search graph to quickly generate a routing result with a short channel length. In addition, during routing, a channel reuse strategy and an intersection optimization strategy are proposed for the flow path reuse and intersection number optimization problems, respectively, to further improve the quality of the routing results. The experimental results show that, compared with the latest work PathDriver+, the length of channels, the number of ports used, and the number of channel intersections are significantly reduced by 33.21%, 11.04%, and 44.79%, respectively, and the channel reuse rate is i...
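The Dijkstra step described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the search graph here is a small hand-written set of points and edges standing in for a constrained Delaunay triangulation, and the names `shortest_path`, `points`, and `edges` are hypothetical. Edge weights are Euclidean lengths, so diagonal (any-angle) edges are allowed to beat axis-aligned detours.

```python
import heapq
import math


def shortest_path(points, edges, source, target):
    """Dijkstra over an undirected graph weighted by Euclidean edge length."""
    adjacency = {name: [] for name in points}
    for a, b in edges:
        length = math.dist(points[a], points[b])
        adjacency[a].append((b, length))
        adjacency[b].append((a, length))

    best = {source: 0.0}   # shortest known distance to each reached node
    previous = {}          # predecessor of each node on its best path
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            break
        if dist > best.get(node, math.inf):
            continue  # stale queue entry
        for neighbour, length in adjacency[node]:
            candidate = dist + length
            if candidate < best.get(neighbour, math.inf):
                best[neighbour] = candidate
                previous[neighbour] = node
                heapq.heappush(queue, (candidate, neighbour))

    if target not in best:
        return math.inf, []
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return best[target], path[::-1]


# Hypothetical layout: the diagonal edge in-b plays the role of an any-angle channel.
points = {"in": (0, 0), "a": (3, 0), "b": (3, 4), "c": (0, 4), "out": (6, 4)}
edges = [("in", "a"), ("a", "b"), ("in", "b"), ("in", "c"), ("c", "b"), ("b", "out")]
length, path = shortest_path(points, edges, "in", "out")
print(length, path)
```

In this toy layout the diagonal route has length 8, while either axis-aligned detour has length 10, which is the kind of saving any-angle routing aims for. Channel reuse and intersection optimization are not modelled here.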
Network monitoring and measurement is an important part of realizing the network digital twin. However, it introduces the problem of high cost when obtaining the status data of physical networks. Therefore, to efficie...
ISBN (digital): 9798331506476
ISBN (print): 9798331506483
Graph convolutional networks (GCNs) are popular for a variety of graph learning tasks. ReRAM-based processing-in-memory (PIM) accelerators are promising for expediting GCN training owing to their in-situ computing capability. However, existing accelerators can be severely underutilized even with pipelining, because they overlook the skewed execution times of the various GCN stages and ignore the skewed degrees of graph vertices. In this work, we propose GOPIM, a GCN-oriented pipeline optimization for PIM accelerators that expedites GCN training. First, GOPIM uses an ML-based scheme that allocates crossbar resources to the stages that need them most, streamlining the overall pipeline. Second, GOPIM uses a selective vertex updating technique that distributes vertices evenly across crossbars through interleaved mapping. Together, these techniques reduce the overall execution time without losing much accuracy. We also provide a practical architecture design for GOPIM. Our experimental results show that GOPIM achieves up to 191× speedup and 16.1× energy saving compared to the state-of-the-art work.
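The abstract does not spell out how interleaved mapping works, so the following Python sketch is only one plausible reading, not GOPIM's actual scheme. It assumes that vertices are ranked by degree and dealt to crossbars round-robin, and compares that with placing consecutive ranks on the same crossbar. The function names and the toy degree list are hypothetical.

```python
def rank_by_degree(degrees):
    """Return vertex ids sorted from highest to lowest degree."""
    return sorted(range(len(degrees)), key=lambda v: degrees[v], reverse=True)


def interleaved_mapping(degrees, num_crossbars):
    """Deal degree-ranked vertices to crossbars round-robin (assumed scheme)."""
    groups = [[] for _ in range(num_crossbars)]
    for rank, vertex in enumerate(rank_by_degree(degrees)):
        groups[rank % num_crossbars].append(vertex)
    return groups


def contiguous_mapping(degrees, num_crossbars):
    """Baseline: put consecutive degree ranks on the same crossbar."""
    order = rank_by_degree(degrees)
    size = -(-len(order) // num_crossbars)  # ceiling division
    return [order[i * size:(i + 1) * size] for i in range(num_crossbars)]


def load(groups, degrees):
    """Total degree per crossbar, used here as a rough proxy for its work."""
    return [sum(degrees[v] for v in group) for group in groups]


degrees = [9, 1, 8, 2, 7, 3, 6, 4]  # toy skewed degree list
interleaved_load = load(interleaved_mapping(degrees, 2), degrees)
contiguous_load = load(contiguous_mapping(degrees, 2), degrees)
print(interleaved_load, contiguous_load)
```

With this toy input the interleaved loads are 22 and 18, against 30 and 10 for the contiguous baseline, which shows why spreading high-degree vertices across crossbars can improve utilization.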
Evidence serves as the basis for determining facts in the judicial trial process, and exploring the correlation between evidence has become an essential task. However, there is uncertainty and unreliability of evidenc...
With the advancement of electronic design automation, continuous-flow microfluidic biochips have become one of the most promising platforms for biochemical experiments. This chip manipulates fluid samples in millilite...
Federated learning (FL) has been widely used in medical image processing to protect data privacy, but it has issues with data heterogeneity. Personalized federated learning methods have emerged to tackle these issues but ofte...
ISBN (print): 9798331506476
Neural Vector Search (NVS) has exhibited superior search quality over traditional key-based strategies for information retrieval tasks. An effective NVS architecture requires high recall, low latency, and high throughput to enhance user experience and cost-efficiency. However, implementing NVS on existing neural network accelerators and vector search accelerators is sub-optimal because the embedding stage and the vector search stage are separated at both the algorithm and architecture levels. We find that Product Quantization (PQ) offers an opportunity to break this separation. However, existing PQ algorithms and accelerators still focus on either the embedding stage or the vector search stage, rather than both simultaneously. Simply combining existing solutions preserves the separation and suffers from insufficient parallelization, frequent data access conflicts, and the absence of scheduling, thus failing to reach optimal recall, latency, and throughput. To this end, we propose a unified and efficient NVS accelerator, dubbed NeuVSA, based on an algorithm and architecture co-design philosophy. Specifically, at the algorithm level, we propose a learned PQ-based unified NVS algorithm that consolidates the two separate stages into the same computing and memory access paradigm. It integrates an end-to-end joint training strategy to learn the optimal codebook and index for enhanced recall and reduced PQ complexity, thus achieving smoother acceleration. At the architecture level, we customize a homogeneous NVS accelerator based on the unified NVS algorithm. Each sub-accelerator is optimized to exploit all the parallelism exposed by unified NVS, incorporating a structured index assignment strategy and an elastic on-chip buffer to alleviate buffer conflicts and reduce latency. All sub-accelerators are coordinated by a hardware-aware scheduling strategy to boost throughput. Experimental results show that the joint training strategy improves recall...
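For readers unfamiliar with Product Quantization, the following Python sketch shows the basic idea in its plain, textbook form. It is not NeuVSA's learned, jointly trained algorithm: the codebooks are fixed and hand-written, and all names and data are hypothetical. Each vector is split into sub-vectors, each sub-vector is replaced by the index of its nearest codeword, and a query is compared against the stored codes through per-subspace lookup tables.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vector, num_subspaces):
    """Cut a vector into equal-width sub-vectors."""
    width = len(vector) // num_subspaces
    return [vector[i * width:(i + 1) * width] for i in range(num_subspaces)]


def encode(vector, codebooks):
    """Replace each sub-vector by the index of its nearest codeword."""
    codes = []
    for sub, codebook in zip(split(vector, len(codebooks)), codebooks):
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(sub, codebook[k]))
        codes.append(nearest)
    return codes


def distance_tables(query, codebooks):
    """Precompute query-to-codeword distances for every subspace."""
    return [[squared_distance(sub, word) for word in codebook]
            for sub, codebook in zip(split(query, len(codebooks)), codebooks)]


def approximate_distance(codes, tables):
    """Sum table lookups instead of touching the original vector."""
    return sum(table[code] for table, code in zip(tables, codes))


# Hypothetical fixed codebooks: two subspaces with two 2-D codewords each.
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 1.0), (1.0, 0.0)],
]
database = [(0.1, 0.0, 0.9, 0.1), (0.9, 1.0, 0.0, 1.1), (0.0, 0.1, 0.1, 0.9)]
codes = [encode(vector, codebooks) for vector in database]

query = (1.0, 0.9, 0.1, 1.0)
tables = distance_tables(query, codebooks)
ranking = sorted(range(len(database)),
                 key=lambda i: approximate_distance(codes[i], tables))
print(codes, ranking)
```

The search stage reduces to table lookups and additions over short codes. A learned variant would train the codebooks rather than fix them by hand.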
Power systems serve as the fundamental infrastructure for the socioeconomic development of modern societies. Researching power systems can stimulate the growth of the power industry and contribute to the sustainable d...