Non-volatile (NV) devices are actively considered for compact, high-performance memory architectures, especially in-memory computing (IMC) designs where processing and memory elements are co-located to address the memory-wall issue for data-intensive applications. Content addressable memories (CAMs) are a form of IMC that compare input query data against stored data in parallel and output the comparison result as a match or mismatch. Numerous CAMs based on NV devices have been proposed and demonstrate superior area, energy, and performance metrics over conventional CMOS-based designs. Unlike prior works that exploit NV devices in the digital domain, in this paper we propose an analog CAM design that utilizes the analog characteristics of the ferroelectric field-effect transistor (FeFET) to achieve denser storage and search operations in the analog domain. We illustrate the proposed analog CAM through a device-circuit co-design approach and validate the 3-bit storage and search capability of the proposed design. The scalability of the proposed design is also examined. Evaluation results suggest that our analog CAM can achieve 22.4× higher memory density and 8.6× higher energy efficiency compared with the conventional CMOS-based design.
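To make the search semantics concrete, here is a minimal behavioral sketch (not a device model) of an analog CAM lookup: each cell stores one of several analog levels (e.g., a FeFET threshold-voltage state encoding 3 bits as 8 levels), and a row matches only if every cell's stored level falls within a tolerance band around the query. The level count, tolerance, and array shape are illustrative assumptions, not values from the paper.

```python
import numpy as np

LEVELS = 8        # 3-bit storage: 8 distinguishable analog levels (assumed)
TOLERANCE = 0.5   # hypothetical match window, in level units

def search(stored_rows: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Return one boolean match line per row (True = match).

    All cells in a row must match for that row's match line to stay high,
    mirroring the wired-AND behavior of a CAM match line.
    """
    return np.all(np.abs(stored_rows - query) <= TOLERANCE, axis=1)

rng = np.random.default_rng(0)
rows = rng.integers(0, LEVELS, size=(4, 6)).astype(float)  # 4 rows x 6 cells
print(search(rows, rows[2]))  # row 2 matches its own contents
```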
ISBN:
(Print) 9781728152103
In the last decade, formal concept analysis (FCA) has received significant attention for knowledge-processing tasks in various research fields. In ontology construction for the government domain, a major issue is that current methods struggle to discover and extract the implicit domain concepts and the relationships between them, and the construction process itself is complicated to carry out. To overcome these issues, this paper proposes an FCA-based method. In this process, an ontology prototype is first constructed manually to improve the accuracy of the topic concepts and relationships, and FCA is then used as a tool to mine concepts and relations, improving the comprehensiveness of the constructed government-domain ontology. Finally, we demonstrate the feasibility of the method and compare the analysis it derives against other methods in an empirical study.
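For readers unfamiliar with FCA, the core machinery is a pair of derivation operators over a binary object-attribute context: a formal concept is a pair (extent, intent) where each determines the other. The sketch below shows those operators on a toy government-domain context; the object and attribute names are purely illustrative.

```python
context = {                        # hypothetical toy context
    "tax_office":   {"government", "finance"},
    "city_council": {"government", "legislation"},
    "central_bank": {"finance"},
}
ALL_ATTRS = set().union(*context.values())

def common_attrs(objects):
    """Attributes shared by every object in the set (the ' operator)."""
    sets = [context[o] for o in objects]
    return set.intersection(*sets) if sets else set(ALL_ATTRS)

def common_objects(attrs):
    """Objects possessing every attribute in the set (the dual ' operator)."""
    return {o for o, a in context.items() if attrs <= a}

# Closing {"government"} yields a formal concept: (extent, intent).
extent = common_objects({"government"})
intent = common_attrs(extent)
print(extent, intent)  # {'tax_office', 'city_council'} {'government'}
```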
This paper concerns the real-time simulation of forests for a flight simulator, exploiting the capabilities of recent graphics cards. As we will show, these architectures, coupled with recent ergonomic environments like CUDA, allow programmers to implement highly parallelizable algorithms to be executed on the GPU without being specialists in parallel programming. The first results are quite encouraging and argue in favor of these technologies. The algorithm devised can display forests with different tree densities, at near real-time performance, for scenes with about one million trees.
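The key property being exploited is that per-tree work (culling, level-of-detail selection) is independent, so one GPU thread per tree maps naturally onto a CUDA kernel. The sketch below illustrates that data-parallel shape with NumPy standing in for the kernel; the distance thresholds and LOD tiers are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
trees = rng.uniform(0, 10_000, size=(1_000_000, 2))  # ~1M tree positions (x, z)
camera = np.array([5_000.0, 5_000.0])

# Each tree's decision is independent of every other tree's:
dist = np.linalg.norm(trees - camera, axis=1)
visible = dist < 3_000.0                                          # range culling
lod = np.where(dist < 800.0, 0, np.where(dist < 2_000.0, 1, 2))   # 3 LOD tiers

print(visible.sum(), "trees drawn; per-LOD counts:", np.bincount(lod[visible]))
```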
Nowadays, with the rapid development of Internet technology and application modes, the scale of Internet data is growing geometrically, and a large amount of valuable knowledge is embedded in it. How to organize and represent this knowledge and carry out in-depth computation and analysis over it has aroused widespread concern. Knowledge reasoning based on knowledge graphs is one of the hotspots of knowledge graph research and plays an important role in vertical search, intelligent question answering, and other applications. Knowledge reasoning over knowledge graphs aims at inferring new knowledge or identifying erroneous knowledge from existing knowledge. Unlike traditional knowledge reasoning, knowledge reasoning methods for knowledge graphs are more diverse, owing to the concise and flexible knowledge representation a knowledge graph provides. Starting from the basic concept of knowledge reasoning, this article surveys knowledge-graph-oriented reasoning methods proposed in recent years. In particular, knowledge reasoning is divided into single-step and multi-step reasoning, since the methods differ; each class can be further divided into rule-based reasoning, distributed-representation-based reasoning, neural-network-based reasoning, and mixed reasoning.
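As a concrete instance of the rule-based, single-step category the survey describes, the sketch below applies one Horn rule (transitivity of a relation) to a set of triples to derive new knowledge; iterating it to a fixpoint would give multi-step reasoning. The entities and relation name are illustrative.

```python
# Toy knowledge graph as a set of (subject, relation, object) triples.
triples = {
    ("Pudong", "located_in", "Shanghai"),
    ("Shanghai", "located_in", "China"),
}

def apply_transitivity(kb, relation):
    """One single-step rule application: (a,r,b) & (b,r,c) => (a,r,c)."""
    derived = {
        (a, relation, c)
        for (a, r1, b) in kb if r1 == relation
        for (b2, r2, c) in kb if r2 == relation and b2 == b
    }
    return derived - kb  # keep only genuinely new triples

print(apply_transitivity(triples, "located_in"))
# {('Pudong', 'located_in', 'China')}
```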
ISBN:
(Print) 9783540241287
A DVE system provides a computer-generated virtual world in which individuals located at different places can interact with each other. In this paper, we present the design of a grid-enabled, service-oriented framework for facilitating the building of DVE systems on the Grid. A service component named "gamelet" is proposed. Each gamelet is characterized by its load awareness, high mobility, and embedded synchronization. Based on gamelets, we show how to re-design the existing monopolistic model of a DVE system into an open, service-oriented system that fits into the current Grid/OGSA framework. We also present an adaptive gamelet load-balancing (AGL) algorithm that helps the DVE system achieve better performance. We evaluate the performance through a multiplayer online game prototype implemented on the Globus Toolkit. Results show that our approach achieves faster response time and higher throughput.
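The sketch below illustrates the general idea behind load-aware gamelet migration: nodes report per-gamelet load, and when imbalance crosses a threshold, a gamelet is moved from the hottest node to the coldest (gamelets are highly mobile by design). This is a hedged illustration of the concept only; the actual AGL policy from the paper is not reproduced, and all names and the threshold are assumptions.

```python
def rebalance(nodes: dict[str, dict[str, float]], threshold: float = 1.5):
    """nodes maps node -> {gamelet: load}. Migrate one gamelet if imbalanced.

    Returns (gamelet, src, dst) on migration, or None if balanced enough.
    """
    load = {n: sum(g.values()) for n, g in nodes.items()}
    src = max(load, key=load.get)          # most-loaded node
    dst = min(load, key=load.get)          # least-loaded node
    if load[src] > threshold * load[dst]:
        gamelet = max(nodes[src], key=nodes[src].get)  # heaviest gamelet
        nodes[dst][gamelet] = nodes[src].pop(gamelet)  # "migrate" it
        return gamelet, src, dst
    return None

nodes = {"n1": {"zoneA": 0.9, "zoneB": 0.6}, "n2": {"zoneC": 0.2}}
print(rebalance(nodes))  # ('zoneA', 'n1', 'n2')
```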
ISBN:
(Print) 9783540241287
A VOD server's caching and scheduling performance determine its service efficiency. This paper describes a new cache model and content replacement strategy, based on the Zipf-like law and the characteristics of media stream service, which reduces the disk I/O ratio by 6.22%. A performance-analysis model for disk load scheduling was constructed based on stochastic process and queuing theory, and a new disk load strategy suitable for VOD systems was formulated; this strategy reduces the disk block time by 3.71% on average. The paper also describes a content scheduling scheme designed by constructing, analyzing, and simplifying the SPN model deduced from MSMQ theory. This scheme can guarantee quality of service (QoS) and distribute program content automatically. An experiment was conducted, and the results showed that VOD servers embedding the new cache and schedule strategies reduce the average response time to user requests by 7% to 19%.
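The premise of Zipf-aware replacement is that the i-th most popular program is requested with probability proportional to 1/i^alpha, so eviction should favor the least popular block rather than plain recency. The sketch below simulates that: a frequency-based evictor under a Zipf-like request stream. The alpha value, capacity, and class name are illustrative assumptions, not the paper's model.

```python
import random

class ZipfCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.hits = {}  # block -> observed request count

    def access(self, block) -> bool:
        hit = block in self.hits
        self.hits[block] = self.hits.get(block, 0) + 1
        if not hit and len(self.hits) > self.capacity:
            coldest = min(self.hits, key=self.hits.get)  # evict least popular
            del self.hits[coldest]
        return hit

cache, hits, N = ZipfCache(100), 0, 10_000
ranks = list(range(1, 1001))
weights = [1 / r**0.8 for r in ranks]  # Zipf-like popularity, alpha = 0.8
for block in random.choices(ranks, weights, k=N):
    hits += cache.access(block)
print(f"hit ratio: {hits / N:.2%}")  # a 10%-capacity cache catches most requests
```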
ISBN:
(Print) 9783540241287
Application of pattern-based approaches to parallel programming is an active area of research today. The main objective of pattern-based approaches to parallel programming is to facilitate the reuse of frequently occurring structures for parallelism, whereby a user supplies mostly the application-specific code components and the programming environment generates most of the code for parallelization. Parallel Architectural Skeleton (PAS) is such a pattern-based parallel programming model and environment. The PAS model provides a generic way of describing the architectural/structural aspects of patterns in message-passing parallel computing. Application development using PAS is hierarchical, similar to conventional parallel programming using MPI, but with the added benefits of reusability and high-level patterns. Like most other pattern-based parallel programming models, the benefits of PAS were offset by some of its drawbacks, such as the difficulty of (1) extending PAS and (2) composing skeletons. SuperPAS is an extension of PAS that addresses these issues. SuperPAS provides a skeleton description language for the generic PAS. Using SuperPAS, a skeleton developer can extend PAS by adding new skeletons to the repository (extensibility). SuperPAS also makes the PAS system more flexible by defining composition of skeletons. In this paper, we describe SuperPAS and elaborate on its use through examples.
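To show the division of labor a skeleton imposes, here is a minimal sketch of one common pattern (a task farm): the skeleton owns all parallel structure, and the user plugs in only the application-specific worker function. This mirrors the philosophy of PAS, not its actual C++/MPI interface; all names are illustrative.

```python
from multiprocessing import Pool

def farm_skeleton(worker, tasks, n_workers=4):
    """Generic 'farm' pattern: distribute independent tasks to workers.

    All parallelism lives here; the user never touches it.
    """
    with Pool(n_workers) as pool:
        return pool.map(worker, tasks)

def user_component(x):
    """The only application-specific code the user supplies."""
    return x * x

if __name__ == "__main__":
    print(farm_skeleton(user_component, range(8)))  # [0, 1, 4, ..., 49]
```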
ISBN:
(Print) 9783540241287
A scalable parallel algorithm is presented, aimed especially at large-scale three-dimensional simulations with seriously non-uniform particle distributions. In particular, based on cell-block data structures, the algorithm uses a Hilbert space-filling curve to convert the three-dimensional domain decomposition for load distribution across processors into a one-dimensional load-balancing problem, to which a measurement-based multilevel averaging weights (MAW) method can be applied successfully. In contrast to inverse space-filling partitioning (ISP), MAW redistributes blocks by monitoring the change of total load on each processor. Numerical experiments show that MAW is superior to ISP in producing balanced load for large-scale multi-medium MD simulations in high-temperature, high-pressure physics. Excellent scalability was demonstrated, with a speedup larger than 200 on 240 processors of one MPP. The largest run, with 1.1 × 10⁹ particles on 500 processors, took 80 seconds per time step.
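The core reduction is: linearize 3D cell blocks along a space-filling curve, then cut the resulting 1D sequence so each processor receives roughly equal measured load. The sketch below uses Morton order as a simpler stand-in for the Hilbert curve (Hilbert preserves locality better but is longer to code) and a plain prefix-sum cut in place of MAW's multilevel averaging; both substitutions are assumptions for illustration.

```python
def morton3(x: int, y: int, z: int) -> int:
    """Interleave coordinate bits into a 1D Morton key (10 bits/axis)."""
    key = 0
    for i in range(10):
        key |= ((x >> i & 1) << (3 * i)) \
             | ((y >> i & 1) << (3 * i + 1)) \
             | ((z >> i & 1) << (3 * i + 2))
    return key

def partition(blocks, loads, nprocs):
    """blocks: [(x,y,z)]; loads: measured per-block load. Returns owner per block."""
    order = sorted(range(len(blocks)), key=lambda i: morton3(*blocks[i]))
    target, acc, proc = sum(loads) / nprocs, 0.0, 0
    owner = [0] * len(blocks)
    for i in order:                       # walk blocks in curve order
        if acc >= target and proc < nprocs - 1:
            proc, acc = proc + 1, 0.0     # cut: next processor starts here
        owner[i] = proc
        acc += loads[i]
    return owner

blocks = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (3, 3, 3)]
print(partition(blocks, [1.0, 5.0, 1.0, 1.0], nprocs=2))  # [0, 0, 1, 1]
```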
ISBN:
(Print) 9781450395564
Data has never been as significant as it is today. It can be acquired virtually at will on any subject. Yet this poses new challenges for data management, especially in terms of storage (data is not consumed during processing, i.e., the data volume keeps growing), flexibility (new applications emerge), and operability (analysts are not IT experts). The goal has to be demand-driven data provisioning, i.e., the right data must be available in the right form at the right time. Therefore, we introduce a tailorable data preparation zone for Data Lakes called BARENTS. It enables users to model in an ontology how to derive information from data and to assign that information to use cases. The data is automatically processed based on this model, and the refined data is made available to the appropriate use cases. Here, we focus on a resource-efficient data management strategy. BARENTS can be embedded seamlessly into established Big Data infrastructures, e.g., Data Lakes.
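The sketch below conveys the demand-driven idea in miniature: a declarative model (standing in for the ontology) states how a piece of information is derived from raw data and which use case consumes it, and a derivation runs only when that use case asks. This is a hedged illustration of the concept, not BARENTS itself; every name here is hypothetical.

```python
model = {
    "daily_avg_temp": {
        "source": "sensor_readings",
        "derive": lambda rows: sum(rows) / len(rows),
        "use_case": "climate_dashboard",
    },
}

lake = {"sensor_readings": [21.0, 23.5, 22.1]}  # toy stand-in for a Data Lake

def provision(use_case: str):
    """Run only the derivations assigned to the requesting use case."""
    return {
        name: spec["derive"](lake[spec["source"]])
        for name, spec in model.items()
        if spec["use_case"] == use_case
    }

print(provision("climate_dashboard"))  # {'daily_avg_temp': 22.2}
```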
With the application of blockchain light nodes in embedded devices, how to alleviate the computing pressure that complex operations such as transaction SPV verification place on the CPUs of embedded devices, and how to improve device performance in these respects, have gradually become research topics in industry and academia. This paper proposes a series of methods to improve the performance of blockchain SPV verification from the perspectives of system architecture and the hash computing unit: (1) According to the computational characteristics of SPV verification, this paper customizes macro-instructions and micro-instructions for the coprocessor to meet flexibility requirements; a built-in dedicated cache holds transaction data fetched from external memory and intermediate data generated by the internal Hash Computing Unit, which not only prepares transaction data for hash computation but also avoids frequent accesses to the bus and external memory. (2) Techniques such as two-round unrolled computation, a timing-balanced pipeline architecture, and optimized adders are adopted to improve the performance of SHA-256 computation. (3) When double hashing is required for transactions, the Hash Computing Unit can directly perform the second hash computation on the result of the first, reducing the frequency of accesses to external memory and thereby improving performance. Through these methods, the performance of the hardware coprocessor for SPV verification of transactions is more than double that of traditional solutions.
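For reference, here is a minimal software version of the operation the coprocessor accelerates: Bitcoin-style SPV verification recomputes the Merkle root from a transaction hash and its branch, applying double SHA-256 at each level, and compares it to the root in the block header. The back-to-back hashing in dsha256 is exactly the chaining that item (3) keeps inside the Hash Computing Unit; the demo tree is a toy construction for illustration.

```python
import hashlib

def dsha256(data: bytes) -> bytes:
    """Double SHA-256: the second hash runs directly on the first's output."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def verify_spv(tx_hash: bytes, branch: list[tuple[bytes, bool]], root: bytes) -> bool:
    """branch: (sibling_hash, sibling_is_on_left) pairs, ordered leaf to root."""
    node = tx_hash
    for sibling, on_left in branch:
        node = dsha256(sibling + node if on_left else node + sibling)
    return node == root

# Tiny 2-leaf Merkle tree: root = dsha256(leaf0 + leaf1)
leaf0, leaf1 = dsha256(b"tx0"), dsha256(b"tx1")
root = dsha256(leaf0 + leaf1)
print(verify_spv(leaf0, [(leaf1, False)], root))  # True
```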