With the rapid development of computing and networking technologies, researchers have proposed building harmonious, trusted and transparent Internet-based virtual computing environments (iVCE). Organizing dynamic Internet resources into overlays is an important way for iVCE to realize efficient resource sharing. DHT-based overlays are scalable, low-latency and highly available; however, the current DHT overlay in iVCE (SKY) cannot satisfy the "trust" requirements of Internet applications. To address this problem, this paper modifies SKY and proposes TrustedSKY, an embedded DHT overlay technique in iVCE that lets applications select trusted nodes to form a "trusted subgroup" within the base overlay and thereby achieve secure and trusted DHT routing.
Reconfigurable computing tries to balance the high efficiency of custom computing against the flexibility of general-purpose computing. This paper presents the implementation techniques in LEAP, a coarse-grained reconfigurable array. It proposes a speculative execution mechanism for dynamic loop scheduling, with the goal of one iteration per cycle, together with implementation techniques that support decoupled synchronization between the token generator and the collector. The paper also introduces techniques for exploiting both intra-iteration and inter-iteration data dependences, with the help of two instructions for special data reuse in loop-carried dependences. The experimental results show that the number of memory accesses is on average 3% of that of a RISC processor simulator with no memory optimization. In a practical image matching application, the LEAP architecture achieves a speedup of about 34 times in execution cycles compared with general-purpose processors.
Many proposed P2P networks are based on traditional interconnection topologies. Given a static topology, the maintenance mechanism for node join/departure is critical to designing an efficient P2P network. Kautz graphs have many good properties such as constant degree, low congestion and optimal diameter. Because topology maintenance is complex, however, no effective P2P network based on Kautz graphs with base > 2 has been proposed to date. To address this problem, this paper presents the "distributed Kautz (D-Kautz) graphs", which adapt Kautz graphs to the characteristics of P2P networks. Using the D-Kautz graphs, we further propose SKY, the first effective P2P network based on Kautz graphs with arbitrary base. The effectiveness of SKY is demonstrated through analysis and simulations.
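The constant-degree and diameter properties mentioned above can be checked on a small static Kautz graph. The following is a minimal sketch of the textbook construction only, not of D-Kautz or SKY; the function names and the convention that nodes are length-k strings over d+1 symbols are assumptions made for illustration.

```python
from collections import deque
from itertools import product


def kautz_nodes(d, k):
    """All length-k strings over d+1 symbols with no two equal adjacent symbols."""
    return [s for s in product(range(d + 1), repeat=k)
            if all(a != b for a, b in zip(s, s[1:]))]


def kautz_successors(node, d):
    """Shift left by one symbol and append any symbol that differs from the last."""
    return [node[1:] + (x,) for x in range(d + 1) if x != node[-1]]


def kautz_diameter(d, k):
    """Longest shortest path over all ordered node pairs, found by BFS from each node."""
    worst = 0
    for src in kautz_nodes(d, k):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in kautz_successors(u, d):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        worst = max(worst, max(dist.values()))
    return worst


if __name__ == "__main__":
    d, k = 2, 3  # base 2, strings of length 3 (illustrative values)
    nodes = kautz_nodes(d, k)
    print(len(nodes), "nodes, out-degree", len(kautz_successors(nodes[0], d)),
          ", diameter", kautz_diameter(d, k))
```

Under this convention the graph has (d+1)·d^(k-1) nodes, every node has out-degree d regardless of graph size, and the diameter grows only with the string length k.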
The contribution of parasitic bipolar amplification to SETs is experimentally verified using two P-hit target chains in the normal layout and in the special layout. For PMOSs in the normal layout, the single-event charge collection is composed of diffusion, drift, and the parasitic bipolar effect, while for PMOSs in the special layout, the parasitic bipolar junction transistor cannot turn on. Heavy ion experimental results show that PMOSs without parasitic bipolar amplification have a 21.4% decrease in the average SET pulse width and roughly a 40.2% reduction in the SET cross-section.
The key to large-scale parallel solutions of deterministic particle transport problems is single-node computation performance. Hence, single-node computation is often parallelized on multi-core or many-core computer architectures. However, the number of on-chip cores grows quickly as semiconductor feature sizes scale down. In this paper, we present a scalability investigation of one-energy-group, time-independent, deterministic discrete ordinates neutron transport in 3D Cartesian geometry (Sweep3D) on Intel's Many Integrated Core (MIC) architecture, which currently provides up to 62 cores with four hardware threads per core and will offer up to 72 cores in the future. The parallel programming model OpenMP and vector intrinsic functions are used to exploit thread parallelism and vector parallelism, respectively, for the discrete ordinates method. The results on a 57-core MIC coprocessor show that the implementation of Sweep3D on MIC scales well in performance. In addition, an assessment of the implementation using the Roofline model and a performance comparison between MIC and the Tesla K20C graphics processing unit (GPU) are also reported.
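For readers unfamiliar with the Roofline model mentioned above, a minimal sketch follows. It bounds attainable performance by the lower of peak compute throughput and memory bandwidth times arithmetic intensity. The function names and all numbers are hypothetical placeholders, not measurements from the paper or specifications of MIC or the K20C.

```python
def roofline_gflops(peak_gflops, bandwidth_gbs, intensity_flops_per_byte):
    """Attainable GFLOP/s: the lower of the compute roof and the memory roof."""
    return min(peak_gflops, bandwidth_gbs * intensity_flops_per_byte)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (FLOP/byte) at which a kernel stops being memory-bound."""
    return peak_gflops / bandwidth_gbs


if __name__ == "__main__":
    # Hypothetical machine: 1000 GFLOP/s peak, 200 GB/s memory bandwidth.
    peak, bandwidth = 1000.0, 200.0
    for intensity in (0.5, 5.0, 10.0):
        print(intensity, roofline_gflops(peak, bandwidth, intensity))
    print("ridge point:", ridge_point(peak, bandwidth))
```

A kernel whose arithmetic intensity lies left of the ridge point is limited by memory traffic, so adding cores or vector width alone will not raise its performance.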
We consider the maximal vector problem on uncertain data, which has been recently posed by the study on processing skyline queries over a probabilistic data stream in the database context. Let $D_n$ be a set of $n$ points in a $d$-dimensional space and $q$ ($0 < q \le 1$) be a probability threshold; each point in $D_n$ has a probability to occur. Our problem is concerned with how to estimate the expected size of the probabilistic skyline, which consists of all the points that are not dominated by any other point in $D_n$ with a probability not less than $q$. We prove that the upper bound of the expected size is $O(\min\{n, (-\ln q)(\ln n)^{d-1}\})$ under the assumptions that the value distribution on each dimension is independent and the values of the points along each dimension are distinct. The main idea of our proof is to find a recurrence about the expected size and solve it. Our results reveal the relationship between the probability threshold $q$ and the expected size of the probabilistic skyline, and show that the upper bound is poly-logarithmic when $q$ is not extremely small.
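The definition of the probabilistic skyline above can be made concrete with a brute-force sketch. It assumes, beyond what the abstract states, that points occur independently, that larger coordinate values are better, and that a point's skyline probability is its own occurrence probability times the probability that none of its dominators occurs. All names are hypothetical.

```python
def dominates(a, b):
    """a dominates b: a is at least as large in every dimension and larger in one.
    (Assumption: larger values are better.)"""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def skyline_probabilities(points):
    """points: list of (coords, occurrence_probability).
    Returns, for each point, the probability that it occurs and that none of
    its dominators occurs (assumption: independent occurrences)."""
    result = []
    for i, (ci, pi) in enumerate(points):
        prob = pi
        for j, (cj, pj) in enumerate(points):
            if j != i and dominates(cj, ci):
                prob *= 1.0 - pj
        result.append(prob)
    return result


def probabilistic_skyline(points, q):
    """Coordinates of all points whose skyline probability is at least q."""
    probs = skyline_probabilities(points)
    return [coords for (coords, _), p in zip(points, probs) if p >= q]


if __name__ == "__main__":
    data = [((3, 3), 0.5), ((2, 2), 0.9), ((1, 4), 0.8)]  # illustrative data
    print(skyline_probabilities(data))
    print(probabilistic_skyline(data, q=0.5))
```

This quadratic scan is only meant to illustrate the definition; the paper's result concerns the expected size of the output, not how to compute it.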
Hierarchical automata have been widely used to model dynamic aspects of reactive software, such as in UML Statecharts. At the same time, model checking is an automatic technique to ensure the correctness of software...
A data-driven method is proposed to realistically animate garments on human poses in a reduced space. First, a gradient-based method is extended to generate motion sequences, and garments are simulated on these sequences to produce training data. Based on these examples, the proposed method can quickly produce realistic garments on new poses. The framework is divided into an offline phase and an online phase. During the offline phase, based on linear blend skinning (LBS), rigid bones and flex bones are estimated for human bodies and garments, respectively. Then, rigid bone weight maps on garment vertices are learned from the examples. In the online phase, new human poses are taken as input to estimate rigid bone transformations. Then, both rigid bones and flex bones are used to drive the garments to fit the new poses. Finally, a novel formulation is proposed to deal efficiently with garment-body penetration. Experiments show that the method is fast and accurate: intersection artifacts are removed quickly and the final garment results are realistic.
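Linear blend skinning, on which the offline phase above is based, moves each vertex by a weighted average of per-bone rigid transformations. The sketch below shows only that standard blending step in 2D; it is not the paper's bone estimation or weight learning, and all names and values are hypothetical.

```python
import math


def rigid_transform(angle_rad, tx, ty):
    """2D rotation by angle_rad followed by translation (tx, ty), as a 2x3 matrix."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return ((c, -s, tx), (s, c, ty))


def apply_transform(t, v):
    """Apply a 2x3 rigid transform to a 2D point."""
    x, y = v
    return (t[0][0] * x + t[0][1] * y + t[0][2],
            t[1][0] * x + t[1][1] * y + t[1][2])


def blend_skin(vertex, transforms, weights):
    """LBS: weighted sum of the vertex transformed by each bone; weights sum to 1."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("skinning weights must sum to 1")
    x = y = 0.0
    for t, w in zip(transforms, weights):
        px, py = apply_transform(t, vertex)
        x += w * px
        y += w * py
    return (x, y)


if __name__ == "__main__":
    bones = [rigid_transform(0.0, 0.0, 0.0),   # bone 0 stays at rest
             rigid_transform(0.0, 2.0, 0.0)]   # bone 1 shifts 2 units along x
    print(blend_skin((1.0, 1.0), bones, [0.5, 0.5]))
```

A vertex weighted equally between a resting bone and a translated bone ends up halfway between the two transformed positions, which is the behaviour the learned weight maps control per garment vertex.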
In this paper, the influence of the floating body effect (FBE) on the single event transient (SET) generation mechanism in fully depleted (FD) silicon-on-insulator (SOI) technology is investigated using three-dimensional technology computer-aided design (3D-TCAD) numerical simulation. The results indicate that the main SET generation mechanism is not carrier drift/diffusion but the FBE, for both positive and negative channel metal oxide semiconductor (PMOS and NMOS) devices. Two stacking layout designs that mitigate the FBE are investigated as well, and the results indicate that the in-line stacking (IS) layout can mitigate the FBE completely and incurs a smaller area penalty than the conventional stacking layout.
Building distributed applications is difficult, mostly because of concurrency management. Existing approaches are primarily based on events and threads. Researchers and developers have debated for decades which is superior. Although the conclusion is far from obvious, this long debate clearly shows that neither is perfect. One of the problems is that both are complex and error-prone. Both events and threads require programmers to manage concurrency explicitly, and we believe this is the source of the difficulty. In this paper, we propose a novel approach, superscalar communication, in which concurrency is managed automatically by the runtime system. The runtime dynamically analyzes programs to discover potential concurrency opportunities, and it dynamically schedules the communication and computation tasks, resulting in automatic concurrent execution. This approach is inspired by superscalar technology in modern microprocessors, which dynamically exploits instruction-level parallelism. However, hardware superscalar algorithms do not fit software in many respects, so we had to design a new scheme from scratch. Superscalar communication is a runtime extension that requires no modification to the language, compiler or byte code, so it is backward compatible. Superscalar communication is likely to open a new research area in systems software, characterized by dynamic optimization for networking programs.