ISBN: 0769524052 (print)
To improve the performance of applications on OpenMP/JIAJIA, we present a new abstraction, the Array Relation Vector (ARV), which describes the relation between the data elements of two consistent shared arrays accessed in one computation phase. Based on ARV, we use array grouping to eliminate the pseudo data distribution of small shared data and to improve page locality. Experimental results show that ARV-based array grouping can greatly improve the performance of applications with non-continuous data access and strict access affinity on an OpenMP/JIAJIA cluster. For applications with small shared arrays, array grouping noticeably improves performance when the number of processors is small.
The publish/subscribe (pub/sub) paradigm is a popular communication model for data dissemination in large-scale distributed ***, scalability comes with a contradiction between the delivery latency and the memory *** one hand, constructing a separate overlay per topic guarantees real-time dissemination, while the node degree rapidly increases with the number of *** the other hand, maintaining a bounded number of connections per node guarantees small memory cost, while each message has to traverse a large number of uninterested nodes before reaching the *** this paper, we propose Feverfew, a coverage-based hybrid overlay that disseminates messages to all subscribers without involving uninterested nodes, and increases the average number of node connections only slowly with an increase in the number of subscribers and *** major novelty of Feverfew lies in its heuristic coverage mechanism, implemented by combining a gossip-based sampling protocol with a probabilistic searching *** on the practical workload, our experimental results show that Feverfew significantly outperforms existing coverage-based and DHT-based overlays in various dynamic network environments.
As the fourth passive circuit component, a memristor is a nonlinear resistor that can "remember" the amount of charge that has passed through it. This ability to "remember" charge, together with non-volatility, makes memristors promising candidates in many fields. At present, only a few groups are able to fabricate memristors, and most researchers study them through theoretical analysis and simulation. In this paper, we first analyse the theoretical basis and characteristics of memristors, then use the Simulation Program with Integrated Circuit Emphasis (SPICE) to simulate the theoretical model of memristors, varying the parameters in the model to see the influence of each parameter on the characteristics. Our work supplies researchers engaged in memristor-based circuits with advice on how to choose proper parameters.
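The kind of parameter study described above can be illustrated with a small numerical sketch. The abstract does not say which memristor model is used, so this example assumes the widely cited linear ion-drift model, in which the memristance is M(x) = R_on·x + R_off·(1 − x) and the normalized state x obeys dx/dt = (μ_v·R_on/D²)·i(t). It is written in Python rather than SPICE, and the function name, the default parameter values, and the sinusoidal drive are all illustrative assumptions, not values from the paper.

```python
import math

def simulate_memristor(r_on=100.0, r_off=16e3, d=10e-9, mu_v=1e-14,
                       x0=0.1, v0=1.0, freq=1.0, t_end=1.0, steps=2000):
    """Forward-Euler simulation of an assumed linear ion-drift memristor.

    r_on, r_off : resistance (ohms) of the fully doped / undoped device
    d           : device thickness (m); mu_v: dopant mobility (m^2/(V*s))
    x0          : initial normalized state w/D in [0, 1]
    v0, freq    : amplitude (V) and frequency (Hz) of the sine-wave drive
    Returns a list of (t, v, i, x, m) samples.
    """
    k = mu_v * r_on / d ** 2          # state-equation coefficient
    dt = t_end / steps
    x = x0
    trace = []
    for n in range(steps):
        t = n * dt
        v = v0 * math.sin(2.0 * math.pi * freq * t)
        m = r_on * x + r_off * (1.0 - x)   # memristance at current state
        i = v / m
        trace.append((t, v, i, x, m))
        # Integrate the state and clamp it to the physical range [0, 1].
        x = min(1.0, max(0.0, x + k * i * dt))
    return trace

if __name__ == "__main__":
    for r_off in (8e3, 16e3, 32e3):    # sweep one parameter
        trace = simulate_memristor(r_off=r_off)
        peak_current = max(abs(s[2]) for s in trace)
        print(f"r_off={r_off:.0f} ohm, peak |i|={peak_current:.3e} A")
```

Plotting the current against the voltage from `trace` gives the pinched hysteresis loop typical of memristors. Sweeping `r_on`, `r_off`, `mu_v`, or `d` one at a time shows how each parameter changes the shape of that loop.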
ISBN: 9783540884781 (print)
This paper presents a novel algorithm to detect null pointer dereference errors. The algorithm uses both must-alias and may-alias information in a compact way to improve the precision of the detection. Using may-alias information obtained by a fast flow- and context-insensitive analysis algorithm, we compute the must-alias information generated by assignment statements; the must-alias information is in turn used to improve the precision of the may-alias information. With must-alias information we can perform strong updates on more expressions, which reduces the false positives of null pointer dereference detection. We have implemented our algorithm in the SUIF2 compiler infrastructure, and the experimental results are as expected.
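The difference between a strong and a weak update can be shown with a toy sketch. This is not the paper's algorithm or its SUIF2 implementation: the abstract state (a map from variable name to a set of possible nullness values) and the names `store` and `may_be_null` are hypothetical, chosen only to show why must-alias information removes false positives.

```python
NULL, NONNULL = "null", "nonnull"

def store(state, targets, value, must):
    """Model the assignment `*p = value`, where p points to `targets`.

    state   : dict mapping variable name -> set of possible nullness values
    targets : set of variables that p may (or must) alias
    must    : True if the alias information is must-alias
    """
    new_state = dict(state)
    if must and len(targets) == 1:
        # Strong update: the single target is definitely overwritten,
        # so its old facts can be discarded.
        (target,) = targets
        new_state[target] = {value}
    else:
        # Weak update: any target might be overwritten, so the old
        # facts have to be kept alongside the new one.
        for target in targets:
            new_state[target] = state[target] | {value}
    return new_state

def may_be_null(state, var):
    """A dereference of `var` is reported if NULL is still possible."""
    return NULL in state[var]

if __name__ == "__main__":
    state = {"a": {NULL}, "b": {NONNULL}}
    weak = store(state, {"a", "b"}, NONNULL, must=False)
    strong = store(state, {"a"}, NONNULL, must=True)
    print("weak update, a may be null:", may_be_null(weak, "a"))
    print("strong update, a may be null:", may_be_null(strong, "a"))
```

With only may-alias information, the old `null` fact for `a` survives the store and a later dereference of `a` is flagged. With must-alias information the fact is overwritten, so the warning goes away.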
Based on 3D TCAD simulations, single-event transient (SET) effects and charge collection mechanisms in fully depleted silicon-on-insulator (FDSOI) transistors are investigated. This work presents a comparison between 28-nm technology and 0.2-μm technology to analyze the impact of strike location on SET sensitivity in FDSOI devices. Simulation results show that the most SET-sensitive region in FDSOI transistors is the drain region near the gate. An in-depth analysis shows that the bipolar amplification effect in FDSOI devices depends on the strike location. In addition, when the drain contact is moved toward the drain direction, the most sensitive region drifts toward the drain and collects more charge. This provides theoretical guidance for SET hardening.
FinFET technologies are becoming the mainstream process as technology scales down. Based on a 28-nm bulk p-FinFET device, we have investigated the fin width and height dependence of bipolar amplification for heavy-ion-irradiated FinFETs by 3D TCAD numerical simulation. Simulation results show that, because bipolar conduction takes place through the well rather than through a channel (fin) conduction path, transistors with narrower fins exhibit a diminished bipolar amplification effect, while the fin height has a negligible effect on bipolar amplification and charge collection. The results also indicate that the single event transient (SET) pulse width can be reduced by at least about 35% by optimizing the ratio of fin width to height, which can provide guidance for radiation-hardened applications in bulk FinFET technology.
Deep reinforcement learning (RL) has become one of the most popular topics in artificial intelligence *** has been widely used in various fields, such as end-to-end control, robotic control, recommendation systems, and natural language dialogue *** this survey, we systematically categorize the deep RL algorithms and applications, and provide a detailed review of existing deep RL algorithms by dividing them into model-based methods, model-free methods, and advanced RL *** thoroughly analyze the advances including exploration, inverse RL, and transfer ***, we outline the current representative applications, and analyze four open problems for future research.
Single event upset (SEU) is one of the most important origins of soft errors in aerospace *** technology scales down persistently, charge sharing has an increasingly significant effect on SEU in flip-flops. Charge sharing can often bring about multi-node charge collection in the storage nodes and non-storage nodes of a flip-flop. In this paper, multi-node charge collection in the flip-flop data input and the flip-flop clock signal is investigated by 3D TCAD mixed-mode simulations, and the simulation results indicate that a single event double transient (SEDT) in the flip-flop data input and clock signal can also cause an SEU in the flip-flop. This novel mechanism is called the SEDT-induced SEU, and it is also verified by a heavy-ion experiment in a 65 nm twin-well process. The simulation results also indicate that this mechanism is closely related to the well structure, and that the triple-well structure is more effective than the twin-well structure at increasing the SEU threshold of this mechanism.
With the rapid development of computing and networking technologies, researchers have proposed building harmonious, trusted and transparent Internet-based virtual computing environments (iVCE). The overlay-based organization of dynamic Internet resources is an important approach for iVCE to realize efficient resource sharing. DHT-based overlays are scalable, low-latency and highly available; however, the current DHT overlay (SKY) in iVCE cannot satisfy the "trust" requirements of Internet applications. To address this problem, in this paper we modify SKY and propose TrustedSKY, an embedded DHT overlay technique in iVCE that lets applications select trusted nodes to form a "trusted subgroup" in the base overlay and realize secure and trusted DHT routing.
The proliferation of massive datasets has led to significant interest in distributed algorithms for solving large-scale machine learning ***, the communication overhead is a major bottleneck that hampers the scalability of distributed machine learning *** this paper, we design two communication-efficient algorithms for distributed learning *** first one is named EF-SIGNGD, in which we use the 1-bit (sign-based) gradient quantization method to save the communication ***, the error feedback technique, i.e., incorporating the error made by the compression operator into the next step, is employed for the convergence *** second algorithm is called LE-SIGNGD, in which we introduce a well-designed lazy gradient aggregation rule to EF-SIGNGD that can detect the gradients with small changes and reuse the outdated ***-SIGNGD saves communication costs both in transmitted bits and communication ***, we show that LE-SIGNGD is convergent under some mild *** effectiveness of the two proposed algorithms is demonstrated through experiments on both real and synthetic data.
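The error-feedback idea mentioned above, carrying the compression error into the next step, can be sketched for a single worker as follows. This is a generic illustration of sign compression with error feedback, not the paper's EF-SIGNGD or LE-SIGNGD: the scaling of the signs by the mean absolute value, the function names, and the quadratic test objective are all assumptions made for the example.

```python
def compress(p):
    """1-bit quantization: keep each sign, share one magnitude (mean |p_i|)."""
    scale = sum(abs(v) for v in p) / len(p)
    return [scale if v >= 0 else -scale for v in p]

def ef_sign_step(x, grad, error, lr):
    """One sign-compressed descent step with error feedback.

    The uncompressed update p = lr * grad + error is quantized to `delta`;
    whatever the quantizer lost (p - delta) is carried to the next step.
    """
    p = [lr * g + e for g, e in zip(grad, error)]
    delta = compress(p)
    new_x = [xi - di for xi, di in zip(x, delta)]
    new_error = [pi - di for pi, di in zip(p, delta)]
    return new_x, new_error

if __name__ == "__main__":
    coeffs = [1.0, 4.0]                  # f(x) = 0.5 * sum(c_i * x_i^2)
    x, error = [1.0, -2.0], [0.0, 0.0]
    for step in range(200):
        grad = [c * xi for c, xi in zip(coeffs, x)]
        x, error = ef_sign_step(x, grad, error, lr=0.05)
    loss = 0.5 * sum(c * xi * xi for c, xi in zip(coeffs, x))
    print("final x:", x, "final loss:", loss)
```

Only the signs in `delta` plus one scalar would need to be transmitted. Because the residual is added back on the next step, the quantity `x - error` follows the same recursion as uncompressed gradient descent, which is the intuition behind using error feedback to recover convergence.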