Motivation: Large-scale deep neural network models use a lot of training data, as is evident from the large datasets curated for their training, such as ImageNet. Thus it is necessary to use high performance computing (HPC)...
ISBN: 9798350371000; 9798350370997 (print)
The rise and proliferation of Artificial Intelligence (AI) technologies are bringing transformative changes to various sectors, signaling a new era of innovation in fields as diverse as medicine, manufacturing, and even day-to-day social interactions. Notable advancements are not just confined to textual understanding, as seen in models like GPT, but also extend to visual cognition through image recognition and more. Beyond surface interactions and predictions, AI finds profound applications in life-saving domains such as medical diagnostics and becomes an integral part of daily life through chatbot-based customer interactions. However, as the horizon of AI expands, a crucial yet often overlooked aspect emerges: the underlying mission-critical infrastructure required to support and deploy these models effectively. The intricacies of efficient communication systems, foundational for real-time AI model operations, take center stage in ensuring the seamless functioning of AI-driven applications. This paper explores the quintessential changes needed in communication paradigms to keep pace with the evolving AI landscape. Specifically, we highlight the pivotal role of multipath communication in enhancing the responsiveness and efficiency of AI applications [1]. As a case in point, we investigate its impact on mission-critical operations in robotics. Through experimentation and analysis, the results elucidate the substantial benefits of this approach, revealing a significant improvement in delay metrics. This work underscores the imperative of aligning communication systems with the ever-growing demands of AI, ensuring that infrastructural capabilities do not lag in the race for innovation.
ISBN: 9798400708435 (print)
Structured dense matrices result from boundary integral problems in electrostatics and geostatistics, and also Schur complements in sparse preconditioners such as multi-frontal methods. Exploiting the structure of such matrices can reduce the time for dense direct factorization from O(N^3) to O(N). The Hierarchically Semi-Separable (HSS) matrix is one such low-rank matrix format that can be factorized using a Cholesky-like algorithm called ULV factorization. The HSS-ULV algorithm is highly parallel because it removes the dependency on trailing sub-matrices at each HSS level. However, a key merge step that links two successive HSS levels remains a challenge for efficient parallelization. In this paper, we use the asynchronous runtime system PaRSEC with the HSS-ULV algorithm. We compare our work with STRUMPACK and LORAPO, both state-of-the-art implementations of dense direct low-rank factorization, and achieve up to 2x better factorization time for matrices arising from a diverse set of applications on up to 128 nodes of Fugaku for similar or better accuracy for all the problems that we survey.
Breadth-first search (BFS) is a cornerstone in graph traversal, widely employed in areas such as social network analysis, routing algorithms, and biological network exploration. As the size of these graphs increases, ...
ISBN: 9789811359064 (print)
The proceedings contain 49 papers. The special focus in this conference is on Parallel and Distributed Computing, Applications and Technologies. The topics include: A real-time routing protocol in wireless sensor-actuator network; privacy preserving classification based on perturbation for network traffic; fault diagnosis of a wireless sensor network using a hybrid method; an optimization theory of home occupants’ access data for determining smart grid service; automatic classification of transformed protocols using deep learning; covert timing channel design for uniprocessor real-time systems; parallelization of the DIANA algorithm in OpenMP; Flash animation watermarking algorithm based on SWF tag attributes; efficient scheduling strategy for data collection in delay-tolerant wireless sensor networks with a mobile sink; analysis of massive e-learning processes: An approach based on big association rules mining; SGNet: Design of optimized DCNN for real-time face detection; A study on L1 data cache bypassing methods for high-performance GPUs; Memory contention aware power management for high performance GPUs; Dynamic selective warp scheduling for GPUs using L1 data Cache locality information; an efficient model and algorithm for privacy-preserving trajectory data publishing; what makes charitable crowdfunding projects successful: A research based on data mining and social capital theory; A SwarmESB based architecture for an European healthcare insurance system in compliance with GDPR; a study on deriving and simulating pre-risk on complex gas facilities for preventing accidents; body gesture modeling for psychology analysis in job interview based on deep spatio-temporal approach; green vs revenue: Data center profit maximization under green degree constraints; evaluation for two bloom filters’ configuration.
The rapid rise in spatial data volumes from diverse sources necessitates efficient spatial data processing capability. Although most relational databases support spatial extensions of SQL query features, they offer lim...
ISBN: 9798350364613; 9798350364606 (print)
The aerospace industry is one of the largest users of numerical simulation, which is an essential tool in the field of aerodynamic engineering, where many fluid dynamics simulations are involved. In order to obtain the most accurate solutions, some of these simulations use unstructured finite volume solvers that cope with irregular meshes by using explicit time-adaptive integration methods. Modern parallel implementations of these solvers rely on task-based runtime systems to perform fine-grained load balancing and to avoid unnecessary synchronizations. Although such implementations greatly improve performance compared to classical fork-join MPI+OpenMP variants, it remains a challenge to keep all cores busy throughout the simulation loop. In this article, we first investigate the origins of this lack of parallelism. We emphasize that the irregular structure of the task graph plays a major role in the inefficiency of the computation distribution. Our main contribution is to improve the shape of the task graph by using a new mesh partitioning strategy. The originality of our approach is to take the temporal level of mesh cells into account during the mesh partitioning phase. We evaluate our approach by integrating our solution in an ArianeGroup production code used by Airbus. We show that our partitioning method leads to a more balanced task graph. The resulting task scheduling is up to two times faster for meshes ranging from 200,000 to 12,000,000 components.
This paper presents HPC-in-Containers, a novel containerized parallel computing environment using Docker. It is designed to facilitate learning parallel programming concepts, where users do not have to deploy a multic...
The k-winners-take-all (k-WTA) network is a model based on competition. The extant literature on k-WTA models only deals with static undirected connected graphs. In the actual application scenario, the static undirect...
The current Single-User Key Derivation (SKD) caters to individual management of blockchain's tree-structured assets but falls short for threshold signatures aimed at multi-party control of blockchain assets. We in...