Search results - Inner Mongolia University Library

A GPU register file using static data compression

arXiv, 2020

Authors: Angerd, Alexandra; Sintorn, Erik; Stenström, Per. Affiliation: Department of Computer Science and Engineering, Chalmers University of Technology, Göteborg, Sweden

GPUs rely on large register files to unlock thread-level parallelism for high throughput. Unfortunately, large register files are power hungry, making it important to seek new approaches to improve their utilization. This paper introduces a new register file organization for efficient register packing of narrow integer and floating-point operands, designed to leverage advances in static analysis. We show that the hardware/software co-designed register file organization yields a performance improvement of up to 79%, and 18.6% on average, at a modest output-quality degradation. Copyright © 2020, The Authors. All rights reserved.

Keywords: Static analysis
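
As a rough illustration of what "register packing of narrow operands" means in general, the sketch below stores two signed 16-bit integers in a single 32-bit word and recovers them again. It is a generic bit-packing example, not the register file organization proposed in the paper: the function names `pack16` and `unpack16` are hypothetical, and the operand widths are simply assumed to be known in advance, as a static analysis might establish.

```python
# Hypothetical illustration only: two narrow (16-bit) operands share one
# 32-bit register-sized word. This is not the paper's hardware design.
MASK16 = 0xFFFF


def pack16(lo: int, hi: int) -> int:
    """Pack two signed 16-bit integers into one unsigned 32-bit word."""
    for value in (lo, hi):
        if not -32768 <= value <= 32767:
            raise ValueError(f"{value} does not fit in 16 bits")
    # Masking keeps the low 16 bits (two's complement for negatives).
    return ((hi & MASK16) << 16) | (lo & MASK16)


def unpack16(word: int) -> tuple[int, int]:
    """Recover the (lo, hi) signed 16-bit integers from a packed word."""
    def sign_extend(x: int) -> int:
        # Bit 15 set means the 16-bit value is negative.
        return x - 0x10000 if x & 0x8000 else x

    return sign_extend(word & MASK16), sign_extend((word >> 16) & MASK16)


word = pack16(-5, 1234)
assert unpack16(word) == (-5, 1234)
```

Values that need more than 16 bits are rejected rather than truncated, so in this sketch packing is only applied where the narrow width is guaranteed.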

DELTA: Distributed Locality-Aware Cache Partitioning for Tile-based Chip Multiprocessors

International Symposium on Parallel and Distributed Processing (IPDPS)

Authors: Nadja Holtryd; Madhavan Manivannan; Per Stenström; Miquel Pericàs. Affiliation: Department of Computer Science and Engineering, Chalmers University of Technology, Göteborg, Sweden

ISBN: (digital) 9781728168760

ISBN: (print) 9781728168777

Cache partitioning in tile-based CMP architectures is a challenging problem because of i) the need to determine capacity allocations with low computational overhead and ii) the need to place allocations close to where they are used, in order to reduce access latency. Although previous solutions have addressed the problem of reducing the computational overhead and incorporating locality-awareness, they suffer from the overheads of centrally determining *** this paper, we propose DELTA, a novel distributed and locality-aware cache partitioning solution which works by exchanging asynchronous challenges among cores. The distributed nature of the algorithm, coupled with its low computational complexity, allows for frequent reconfigurations at negligible cost and for the scheme to be implemented directly in hardware. The allocation algorithm is supported by an enforcement mechanism which enables locality-aware placement of data. We evaluate DELTA on 16- and 64-core tiled CMPs with multi-programmed workloads. Our evaluation shows that DELTA improves performance by 9% and 16%, respectively, on average, compared to an unpartitioned shared last-level cache.

Keywords: Resource management; Partitioning algorithms; Pain; Proposals; Throughput; Heuristic algorithms; Software algorithms

On estimating the entropy of shallow circuit outputs

arXiv, 2020

Authors: Gheorghiu, Alexandru; Hoban, Matty J. Affiliations: Department of Computer Science and Engineering, Chalmers University of Technology, Sweden; Department of Computer Science, University of Oxford, United Kingdom

Estimating the entropy of probability distributions and quantum states is a fundamental task in information processing. Here, we examine the hardness of this task for the case of probability distributions or quantum states produced by shallow circuits. Specifically, we show that entropy estimation for distributions or states produced by either log-depth circuits or constant-depth circuits with gates of bounded fan-in and unbounded fan-out is at least as hard as the Learning with Errors (LWE) problem, and thus believed to be intractable even for efficient quantum computation. This illustrates that quantum circuits do not need to be complex to render the computation of entropy a difficult task. We also give complexity-theoretic evidence that this problem for log-depth circuits is not as hard as its counterpart with general polynomial-size circuits, seemingly occupying an intermediate hardness regime. Finally, we discuss potential future applications of our work for quantum gravity research by relating our results to the complexity of the bulk-to-boundary dictionary of AdS/CFT. © 2020, CC BY.

Keywords: Quantum computers

A self-stabilizing control plane for the edge and fog ecosystems

arXiv, 2020

Authors: Georgiou, Zacharias; Georgiou, Chryssis; Pallis, George; Schiller, Elad M.; Trihinas, Demetris. Affiliations: Department of Computer Science, University of Cyprus, Cyprus; Computer Science and Engineering, Chalmers University of Technology, Sweden; Department of Computer Science, University of Nicosia, Cyprus

Fog Computing is now emerging as the dominating paradigm bridging the compute and connectivity gap between sensing devices (a.k.a. "things") and latency-sensitive services. However, as fog deployments scale by accumulating numerous devices interconnected over highly dynamic and volatile network fabrics, the need for self-configuration and self-healing in the presence of failures is more evident now than ever. Using the prevailing methodology of self-stabilization, we propose a fault-tolerant framework for distributed control planes that enables fog services to cope with and recover from a very broad fault model. Specifically, our model considers network uncertainties, packet drops, node fail-stop failures and violations of the assumptions according to which the system was designed to operate, such as an arbitrary corruption of the system state. Our self-stabilizing algorithms guarantee automatic recovery within a constant number of communication rounds without the need for external (human) intervention. To showcase the framework’s effectiveness, the correctness proof of the proposed self-stabilizing algorithmic process is accompanied by a comprehensive evaluation featuring an open and reproducible testbed utilizing real-world data from the intelligent transportation domain. Results show that our framework ensures that a fog ecosystem recovers from faults in constant time and that analytics are computed correctly, while the overhead on the system’s control plane scales linearly with the IoT load. Copyright © 2020, The Authors. All rights reserved.

Keywords: Fog

Convolutional spiking neural networks for spatio-temporal feature extraction

arXiv, 2020

Authors: Samadzadeh, Ali; Tabatabaei Far, Fatemeh Sadat; Javadi, Ali; Nickabadi, Ahmad; Chehreghani, Morteza Haghir. Affiliations: Department of Computer Engineering and Information Technology, Amirkabir University of Technology, Tehran, Iran; Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden

Spiking neural networks (SNNs) can be used in low-power and embedded systems (such as emerging neuromorphic chips) due to their event-based nature. They also have the advantage of low computation cost in contrast to conventional artificial neural networks (ANNs), while preserving the properties of ANNs. However, temporal coding in layers of convolutional spiking neural networks and other types of SNNs has yet to be studied. In this paper, we provide insight into spatio-temporal feature extraction of convolutional SNNs in experiments designed to exploit this property. Our proposed shallow convolutional SNN outperforms state-of-the-art spatio-temporal feature extraction methods such as C3D, ConvLstm, and similar networks. Furthermore, we present a new deep spiking architecture to tackle real-world problems (in particular classification tasks), and the model achieved superior performance compared to other SNN methods on CIFAR10-DVS. It is also worth noting that the training process is implemented based on spatio-temporal backpropagation, so ANN-to-SNN conversion methods are of no use. Copyright © 2020, The Authors. All rights reserved.

Keywords: Convolution

Engineering AI Systems: A Research Agenda

arXiv, 2020

Authors: Bosch, Jan; Crnkovic, Ivica; Olsson, Helena Holmström. Affiliations: Chalmers University of Technology, Department of Computer Science and Engineering, Gothenburg, Sweden; Malmö University, Department of Computer Science and Media Technology, Malmö, Sweden

Deploying machine-learning, and in particular deep-learning, (ML/DL) solutions in industry-strength, production-quality contexts proves to be challenging. This requires a structured engineering approach to constructing and evolving systems that contain ML/DL components. In this paper, we provide a conceptualization of the typical evolution patterns that companies experience when employing ML/DL, as well as a framework for integrating ML/DL components in systems consisting of multiple types of components. In addition, we provide an overview of the engineering challenges surrounding AI/ML/DL solutions and, based on that, we provide a research agenda and overview of open items that need to be addressed by the research community at large. Copyright © 2020, The Authors. All rights reserved.

Keywords: Deep learning

Compositional Flows for 3D Molecule and Synthesis Pathway Co-design

arXiv, 2025

Authors: Shen, Tony; Seo, Seonghwan; Irwin, Ross; Didi, Kieran; Olsson, Simon; Kim, Woo Youn; Ester, Martin. Affiliations: School of Computing Science, Simon Fraser University, Canada; Department of Chemistry, KAIST, Republic of Korea; Molecular AI, Discovery Sciences, R&D, AstraZeneca; Department of Computer Science and Engineering, Chalmers University of Technology, Sweden; Department of Computer Science, University of Oxford, United Kingdom; NVIDIA, United States

Many generative applications, such as synthesis-based 3D molecular design, involve constructing compositional objects with continuous features. Here, we introduce Compositional Generative Flows (CGFlow), a novel framework that extends flow matching to generate objects in compositional steps while modeling continuous states. Our key insight is that modeling compositional state transitions can be formulated as a straightforward extension of the flow matching interpolation process. We further build upon the theoretical foundations of generative flow networks (GFlowNets), enabling reward-guided sampling of compositional structures. We apply CGFlow to synthesizable drug design by jointly designing the molecule’s synthetic pathway with its 3D binding pose. Our approach achieves state-of-the-art binding affinity on all 15 targets from the LIT-PCBA benchmark, and a 5.8× improvement in sampling efficiency compared to a 2D synthesis-based baseline. To the best of our knowledge, our method is also the first to achieve state-of-the-art performance in both Vina Dock (-9.38) and AiZynth success rate (62.2%) on the CrossDocked benchmark. © 2025, CC BY.

Keywords: Intellectual property core

Large-scale empirical study of electric vehicle usage patterns and charging infrastructure needs

npj Sustainable Mobility and Transport, 2025, vol. 2, no. 1, pp. 1-10

Authors: Weipeng Zhan; Junjun Deng; Zhenpo Wang; Yuan Liao; Sonia Yeh. Affiliations: National Engineering Laboratory for Electric Vehicles, Beijing Institute of Technology, Beijing, China; Department of Space, Earth and Environment, Chalmers University of Technology, Gothenburg, Sweden; Department of Applied Mathematics and Computer Science, Technical University of Denmark, Lyngby, Denmark

As global electric vehicle (EV) adoption accelerates, granular analysis of empirical usage and charging patterns remains scarce. This study presents a unique large-scale empirical examination of 1.6 million EVs, including a broad array of vehicle types (private, taxi, rental, official, bus, and special-purpose vehicles), across seven major Chinese cities with over 854 million observations of driving and charging events. Our findings illuminate significant heterogeneity in EV usage, battery energy, and charging behavior across vehicle types, with notable differences between cities. Daytime high-power charging places high loads on the electricity grid across all vehicle types, particularly from service-oriented vehicles, including taxis, rental cars, and buses. The maximum loads are also highest in the city centers. Our study of large-scale EV usage offers critical insights for developing charging infrastructure, managing energy grids, and providing flexibility services, which are pivotal to the evolution of future transport ecosystems.

A more compact multi-id identity-based FHE scheme in the standard model and its applications

Science China (Information Sciences), 2019, vol. 62, no. 3, pp. 190-192

Authors: Xueqing WANG; Biao WANG; Bei LIANG; Rui XUE. Affiliations: State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences; School of Cyber Security, University of Chinese Academy of Sciences; Department of Computer Science and Engineering, Chalmers University of Technology

Dear editor, Fully homomorphic encryption (FHE) is a cryptographic primitive that allows anyone, even those without a secret key, to perform arbitrary computation on encrypted data. Since Gentry’s breakthrough realization of FHE in 2009 [1], research on FHE has flourished. Furthermore, López-Alt et al. [2] proposed a new notion of multi-

Keywords: IBE; CPA; A more compact multi-id identity-based FHE scheme in the standard model and its applications; IND