For high-performance computing in a WAN environment, the geographical locations of national supercomputing centers are scattered and the network topology is complex, so it is difficult to form a unified view of *** aggregate the widely dispersed storage resources of national supercomputing centers in China, we have previously proposed a global virtual data space named GVDS in the project of “high-performance computing Virtual Data Space”, a part of the National Key Research and Development Program of *** GVDS enables large-scale applications of high-performance computing to run efficiently across ***, the applications running on the GVDS are often data-intensive, requiring large amounts of data from multiple supercomputing centers across *** this regard, the GVDS suffers from performance bottlenecks in data migration and access across *** solve the above-mentioned problem, this paper proposes a performance optimization framework of GVDS including the multitask-oriented data migration method and the request access-aware IO proxy resource allocation *** a WAN environment, the framework proposed in this paper can make an efficient migration decision based on the amount of migrated data and the number of multiple data sources, guaranteeing lower average migration latency when multiple data migration tasks are running in *** addition, it can ensure that the thread resource of the IO proxy node is fairly allocated among different types of requests (the IO proxy is a module of GVDS), so as to improve the application’s performance across *** experimental results show that the framework can effectively reduce the average data access delay of GVDS while greatly improving the performance of the application.
Retinopathy of Prematurity is a disease that affects premature infants with low birth weight. The disease may lead to blindness unless timely treatment is provided. Because of the high birth rate premature babie...
Faceted search on web pages needs exact facets. However, it is difficult to extract facets exactly from web pages because the pages are unstructured and lack facet information. Therefore, facet extraction is a ...
Graphics Processing Units (GPUs) are designed to process numerically intensive applications and are customized with additional computational power to render 3D applications efficiently. Shaders are computer programs that run on the graphics rendering pipeline and specify how each pixel is processed and rendered. The shader parameters considered here include geometric complexity, texture resolution, and per-pixel lighting. These shaders cause GPUs to consume more power during rendering. In this paper, GPU power usage is first investigated at run time for a rendering configuration to define shader parameters on a 3D scene, and then a power prediction model is proposed for different shader calls on the GPU to generate a power-aware, dynamically reconfigurable rendering model. In the implementation, we integrate the TensorFlow management library routines and a dynamic voltage and frequency scheduling (DVFS) mechanism to improve rendering efficiency and reach optimal power savings. To analyze the power efficiency of GPUs, shader parameters that contribute most to power consumption, such as geometric complexity, texture resolution, per-pixel lighting, and filtering, are tested with common GPU workloads during rendering, and frequency and voltage are then altered based on frame rate monitoring. In addition, the frame rate is monitored and maintained in such a way that GPU frequency is adjusted in parallel if the frame rate is within the acceptable range, so that unnecessary power usage is reduced. During the experimental tests, a lower limit of 30 frames per second and an upper limit of 60 frames per second are configured. The GPU frequency is reduced if the current frame rate is greater than the configured upper limit, and is increased if the frame rate is below the configured lower limit. Compared with the previous state of the art, results show power savings around 40% and imp
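The frame-rate rule described in this abstract (lower the GPU frequency above 60 frames per second, raise it below 30) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `FrameRateGovernor`, the frequency bounds, and the 100 MHz step are hypothetical values chosen for illustration, and the sketch assumes the frequency is simply held while the frame rate stays inside the band.

```python
from dataclasses import dataclass


@dataclass
class FrameRateGovernor:
    """Hypothetical frame-rate-driven frequency controller (illustration only)."""

    freq_mhz: int             # current GPU clock, in MHz
    min_mhz: int = 300        # assumed hardware floor
    max_mhz: int = 1500       # assumed hardware ceiling
    step_mhz: int = 100       # assumed adjustment step
    lower_fps: float = 30.0   # lower limit stated in the abstract
    upper_fps: float = 60.0   # upper limit stated in the abstract

    def update(self, fps: float) -> int:
        """Return the clock to use after observing one frame-rate sample."""
        if fps > self.upper_fps:
            # Rendering faster than needed: step the clock down to save power.
            self.freq_mhz = max(self.min_mhz, self.freq_mhz - self.step_mhz)
        elif fps < self.lower_fps:
            # Rendering too slowly: step the clock up to recover frame rate.
            self.freq_mhz = min(self.max_mhz, self.freq_mhz + self.step_mhz)
        # Inside the band the clock is left unchanged (an assumption).
        return self.freq_mhz


if __name__ == "__main__":
    governor = FrameRateGovernor(freq_mhz=1000)
    for sample in (75.0, 45.0, 20.0):
        print(sample, governor.update(sample))
```

A real controller would apply the returned clock through a vendor interface and would usually smooth the frame-rate samples first; both are outside what the abstract specifies.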
In this paper, we present a scalable parallel framework, which employs grid computing technologies, for solving computationally expensive and intractable design problems. Using an aerodynamic airfoil design optimizati...
This paper tackles the high computational/space complexity associated with multi-head self-attention (MHSA) in vanilla vision *** this end, we propose hierarchical MHSA (H-MHSA), a novel approach that computes self-attention in a hierarchical ***, we first divide the input image into patches as commonly done, and each patch is viewed as a ***, the proposed H-MHSA learns token relationships within local patches, serving as local relationship ***, the small patches are merged into larger ones, and H-MHSA models the global dependencies for the small number of the merged *** last, the local and global attentive features are aggregated to obtain features with powerful representation *** we only calculate attention for a limited number of tokens at each step, the computational load is reduced ***, H-MHSA can efficiently model global relationships among tokens without sacrificing fine-grained *** the H-MHSA module incorporated, we build a family of hierarchical-attention-based transformer networks, namely *** demonstrate the superiority of HAT-Net in scene understanding, we conduct extensive experiments on fundamental vision tasks, including image classification, semantic segmentation, object detection and instance ***, HAT-Net provides a new perspective for vision *** and pretrained models are available at https://***/yun-liu/HAT-Net.
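The claim that attending to a limited number of tokens at each step reduces the computational load can be made concrete with a rough count of query-key pairs. The sketch below is only an illustration under stated assumptions, not the H-MHSA implementation: the function name `attention_pairs`, the non-overlapping windows, and the example sizes (a 56 x 56 token grid, 7 x 7 windows, 49 tokens merged into one) are all hypothetical.

```python
def attention_pairs(num_tokens: int, window: int, merge: int) -> tuple[int, int]:
    """Count query-key pairs for full attention vs. a two-step hierarchy.

    num_tokens: total number of tokens in the image
    window:     tokens per local window (assumed non-overlapping)
    merge:      number of small tokens merged into one larger token
    """
    if num_tokens % window or num_tokens % merge:
        raise ValueError("num_tokens must be divisible by window and merge")

    # Full self-attention compares every token with every token.
    full = num_tokens ** 2

    # Local step: attention is computed independently inside each window.
    local = (num_tokens // window) * window ** 2

    # Global step: attention only among the merged (coarser) tokens.
    merged_tokens = num_tokens // merge
    global_step = merged_tokens ** 2

    return full, local + global_step


if __name__ == "__main__":
    full, hierarchical = attention_pairs(num_tokens=56 * 56, window=49, merge=49)
    print(full, hierarchical)  # 9834496 157760
```

Under these assumed sizes the hierarchical count is roughly 60 times smaller than the full count; the actual savings depend on the window and merge sizes used in a given network.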
This paper addresses the heavy load placed on thin clients in cloud storage sharing by traditional mutual authentication schemes, which involve inherent public-key operations and many rounds of interaction, and proposes an Identity-Bas...
Large-scale model training is typically slow and requires high-performance workstations and Distributed Deep Learning (DDL). DDL models trained on a massive volume of data can outperform single accelerat...
Commonsense knowledge representation and reasoning is key for tasks such as artificial intelligence and natural language understanding. Since commonsense consists of information that humans take for granted, gathering...
Serious games have recently shown great potential to be adopted in many applications, such as training and education. However, one critical challenge in developing serious games is the authoring of a large set of scen...