Search results - Inner Mongolia University Library

IEEE International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)

Authors: Mingyuan An Yang Wang Weiping Wang Ninghui Sun Chinese Academy of Sciences Beijing China Institute of Computing Technology Chinese Academy of Sciences Beijing China University of the Chinese Academy of Sciences Beijing Beijing CN Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing China

To obtain the efficiency of a DBMS, HadoopDB combines Hadoop with a DBMS and claims superior performance over Hadoop. However, HadoopDB's approach simply places MapReduce on top of unmodified single-machine DBMSs, which has several obvious weaknesses. In essence, HadoopDB is a parallel DBMS with fault tolerance, and it incurs unnecessary overhead due to the DBMS legacy. Instead of augmenting a DBMS with Hadoop techniques, we propose a new system architecture that integrates modified DBMS engines into Hadoop as a read-only execution layer, where the DBMS provides efficient read-only operators rather than managing the data. Besides the efficiency obtained from the DBMS engine, there are other advantages. The modified DBMS engine can directly process data from HDFS (Hadoop Distributed File System) files at the block level, which means that data replication is handled naturally by HDFS and block-level parallelism is easily achieved. A global index access mechanism is added in accordance with the MapReduce paradigm. Data loading speed is also guaranteed by writing the data directly into HDFS with simplified logic. Experiments show that our system outperforms both the original Hadoop and a HadoopDB-style system.

Keywords: Engines; Fault tolerance; Fault tolerant systems; Indexes; Loading; Parallel processing

Optimizing MPI Alltoall Communication of Large Messages in Multicore Clusters

IEEE International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)

Authors: Qiang Li Zhigang Huo Ninghui Sun Graduate University of Chinese Academy of Sciences Beijing China Key Laboratory of Computer System and Architecture Chinese Academy of Sciences Beijing China Institute of Computing Technology Chinese Academy of Sciences Beijing China Institute of Computing Technology Chinese Academy of Sciences Beijing CN

MPI Alltoall communication is widely used in many high performance computing (HPC) applications. In Alltoall communication, each process sends a distinct message to all other participating processes. In multicore clusters, processes within a node simultaneously contend for the same network resource of the node during Alltoall communication. However, Alltoall communication of large messages requires many small synchronization messages. Under contention, their latency is orders of magnitude larger than without contention. As a result, the synchronization overhead increases significantly and accounts for a large proportion of the whole latency of Alltoall communication. In this paper, we analyse the considerable overhead of synchronization messages. Based on the analysis, an optimization is presented to reduce the number of synchronization messages from 3N to 2√N. Evaluations on a 240-core cluster show that the performance is improved by an almost constant ratio, which is mainly determined by message size and is independent of system scale. The performance of Alltoall communication is improved by 25% for 32K and 64K byte messages. For an FFT application, performance is improved by 20%.
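The communication pattern described above can be pictured with a small simulation. This is a minimal sketch, not the paper's implementation: it models Alltoall as a transpose of per-process send buffers and simply evaluates the two message counts quoted in the abstract. The function names are hypothetical, and the abstract does not define N, so it is treated here as a plain number.

```python
import math


def alltoall(send_buffers):
    """Simulate Alltoall: process i sends send_buffers[i][j] to process j.

    Returns recv_buffers, where recv_buffers[j][i] is what process j
    received from process i (a transpose of the send buffers).
    """
    n = len(send_buffers)
    return [[send_buffers[i][j] for i in range(n)] for j in range(n)]


def sync_message_counts(n):
    """Evaluate the two counts quoted in the abstract: 3N and 2*sqrt(N).

    What N stands for is not defined in the abstract; this only plugs in a number.
    """
    return 3 * n, 2 * math.sqrt(n)


# Four simulated processes; "i->j" is the distinct payload from i to j.
send = [[f"{i}->{j}" for j in range(4)] for i in range(4)]
recv = alltoall(send)
```

In this toy run, `recv[2]` holds the four messages addressed to process 2, one from each sender.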

Keywords: Synchronization; Protocols; Multicore processing; Receivers; Bandwidth; Program processors; Benchmark testing

Trajectory Design for Energy Harvesting UAV Networks: A Foraging Approach

IEEE Conference on Wireless Communications and Networking

Authors: Xuanlin Liu Mingzhe Chen Sihua Wang Walid Saad Changchuan Yin Beijing Key Laboratory of Network System Architecture and Convergence Beijing Laboratory of Advanced Information Network Beijing University of Posts and Telecommunications Beijing China Department of Electrical Engineering Princeton University Princeton NJ USA Wireless@VT Bradley Department of Electrical and Computer Engineering Virginia Tech Blacksburg VA USA

ISBN (digital): 9781728131061

ISBN (print): 9781728131078

In this paper, the problem of trajectory design for energy harvesting unmanned aerial vehicles (UAVs) is studied. In the considered model, the UAV acts as a moving base station to serve the ground users, while collecting energy from the charging stations located at the center of a user group. Meanwhile, to serve ground users and harvest energy, the UAV must be examined and repaired regularly. Consequently, it is necessary to optimize the trajectory design of the UAV while jointly considering the maintenance costs, the number of users served by the UAV, and the energy consumption and harvesting. To capture the relationship among these factors, we first model the completion of service and the harvested energy as the reward, and the energy consumption during the deployment as the cost. Then, the deployment profitability is defined as the ratio of the reward to the cost of the UAV trajectory. Based on this definition, the trajectory design problem is formulated as an optimization problem whose goal is to maximize the deployment profitability of the UAV. To solve this problem, a foraging algorithm is proposed to find the optimal trajectory so as to maximize the deployment profitability. The proposed algorithm can find the optimal trajectory for the UAV with polynomial time complexity. Fundamental analysis shows that the proposed algorithm can achieve the maximal deployment profitability. Simulation results show that the proposed algorithm can effectively reduce the operation time and achieve up to a 25.6% gain in deployment profitability compared to a Q-learning algorithm.
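The objective in this abstract can be illustrated with a toy calculation. The sketch below assumes that "reward to the cost" means a plain ratio, and it picks the best candidate by exhaustive comparison; it is not the foraging algorithm the paper proposes. The function names and all numbers are hypothetical.

```python
def deployment_profitability(reward, cost):
    """Reward-to-cost ratio (assumed reading of the abstract's definition)."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return reward / cost


def best_trajectory(candidates):
    """Return the name of the candidate with the highest ratio.

    candidates maps a trajectory name to a (reward, cost) pair.
    This is a brute-force comparison, not the paper's foraging algorithm.
    """
    return max(
        candidates,
        key=lambda name: deployment_profitability(*candidates[name]),
    )


# Hypothetical (reward, cost) pairs, not taken from the paper.
candidates = {"A": (10.0, 4.0), "B": (9.0, 3.0), "C": (12.0, 6.0)}
choice = best_trajectory(candidates)
```

Note that the candidate with the largest reward ("C") is not the one selected, because its cost is proportionally higher.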

A Synchronized Variable Frequency Clock Scheme in Chip Multiprocessors

2008 IEEE International Symposium on Circuits and Systems (ISCAS 2008), vol. 10

Authors: Qifei Fan Ge Zhang Weiwu Hu Department of Computer Science and Technology University of Science and Technology Hefei China Key Laboratory of Computer System and Architecture Chinese Academy of Sciences Beijing China

Dynamic voltage/frequency scaling (DVFS) has been widely applied to reduce the power dissipation of multi-core processors. However, when DVFS is applied, signals must be synchronized between asynchronous clock domains at an overhead of several cycles, which incurs a performance penalty, and the circuit cannot work during frequency scaling. This paper proposes a novel variable frequency clock scheme for chip multiprocessors. In our scheme, processor cores running at different frequencies can communicate with each other without the overhead of synchronizing signals. Simulation results show that our scheme improves the energy-delay product (EDP) by 16.8%, with only 3.6% performance degradation.
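The two reported figures can be related through the usual definition of the energy-delay product, EDP = energy × delay. The sketch below is a back-of-the-envelope reading under two labeled assumptions: that "16.8% EDP improvement" means the new EDP is (1 - 0.168) times the baseline, and that "3.6% performance degradation" means the delay grows by a factor of 1.036. Neither interpretation is confirmed by the abstract, and the function names are hypothetical.

```python
def edp(energy, delay):
    """Energy-delay product: lower is better."""
    return energy * delay


def implied_energy_ratio(edp_improvement, slowdown):
    """Energy ratio (new / baseline) implied by an EDP gain and a slowdown.

    Assumes new_edp = (1 - edp_improvement) * old_edp and
    new_delay = (1 + slowdown) * old_delay, so
    new_energy / old_energy = (1 - edp_improvement) / (1 + slowdown).
    """
    return (1.0 - edp_improvement) / (1.0 + slowdown)


# Figures quoted in the abstract, under the assumptions stated above.
ratio = implied_energy_ratio(0.168, 0.036)
```

Under these assumptions the ratio comes out at about 0.80, that is, roughly a fifth less energy than the baseline.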

Keywords: Frequency synchronization; Clocks; Phase locked loops; Circuits; Frequency conversion; Multicore processing; Voltage control; Degradation; Dynamic voltage scaling; Network-on-a-chip

Capacity-shared heterogeneous CMP cache

Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2008, vol. 45, no. 5, pp. 877-885

Authors: Gao, Xiang Zhang, Longbing Hu, Weiwu Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing 100190 China Department of Computer Science and Technology University of Science and Technology of China Hefei 230027 China

The characteristics of advanced integrated circuit technologies require architects to look for new ways to utilize large numbers of gates and mitigate the effects of high interconnect delays. Chip multiprocessors (CMPs) exploit increasing transistor counts by placing multiple processors on a single die. As CMPs have become the trend in high performance microprocessors, the target workloads have become more and more diversified. Due to the wire delay problem and the diversity of applications, neither private nor shared caches can provide both large capacity and fast access in CMPs. A novel CMP cache design, the heterogeneous CMP cache (HCC), is presented, in which chips are constructed from tiles of two different categories. The L2 caches of private tiles provide the lowest hit latency, and the L2 cache of shared tiles increases the effective cache capacity for shared data. By incorporating indirect-index cache technology to share capacity between different hierarchies, HCC provides an on-chip memory subsystem that is both capacity-effective and fast to access. Detailed full-system simulations are used to analyze HCC performance for various programs, including SPEC CPU2000, SPLASH2 and commercial workloads. The results show that HCC improves performance by 16% for single-threaded benchmarks and 9% for multi-threaded benchmarks. HCC is easy to implement, and the design ideas will be used in future multi-core processors of the Godson series.

Keywords: Microprocessor chips

Joint LED selection and precoding optimization for multiple-user multiple-cell VLC systems

arXiv, 2021

Authors: Yang, Yang Yang, Yujie Chen, Mingzhe Feng, Chunyan Xia, Hailun Cui, Shuguang Poor, H. Vincent The Beijing Key Laboratory of Network System Architecture and Convergence School of Information and Communication Engineering Beijing University of Posts and Telecommunications Beijing 100876 China The Department of Electrical and Computer Engineering Princeton University Princeton NJ 08544 United States The Chinese University of Hong Kong Shenzhen 518172 China

This paper proposes a hybrid dimming scheme based on joint LED selection and precoding design (TASP-HD) for multiple-user (MU) multiple-cell (MC) visible light communications (VLC) systems. In TASP-HD, both the LED selection and the precoding of each cell can be dynamically adjusted to reduce intra- and inter-cell interference while satisfying illumination constraints. First, an MU-MC-VLC system model is established, and then a sum-rate maximization problem under dimming level and illumination uniformity constraints is formulated. In this problem, the indices of the activated LEDs and the precoding matrices are optimized, which results in a complex non-convex mixed integer problem. To solve it, the original problem is separated into two subproblems. The first subproblem, which maximizes the sum-rate of users by optimizing the LED selection for a given precoding matrix, is a mixed integer problem solved by the penalty method. With the optimized LED selection matrix, the second subproblem, which maximizes the sum-rate by optimizing the precoding matrix, is solved by the Lagrangian dual method. Finally, these two subproblems are solved iteratively to obtain a convergent solution. Simulation results verify that in a typical indoor scenario under a dimming level of 70%, the mean bandwidth efficiency of TASP-HD is 4.8 bit/s/Hz and 7.13 bit/s/Hz greater than that of AD and DD, respectively. © 2021, CC BY-SA.

Keywords: Light emitting diodes

Data Correlation-Aware Resource Management in Wireless Virtual Reality (VR): An Echo State Transfer Learning Approach

arXiv, 2019

Authors: Chen, Mingzhe Saad, Walid Yin, Changchuan Debbah, Mérouane Beijing Key Laboratory of Network System Architecture and Convergence Beijing University of Posts and Telecommunications Beijing 100876 China Wireless@VT Bradley Department of Electrical and Computer Engineering Virginia Tech Blacksburg VA United States Mathematical and Algorithmic Sciences Lab Huawei France R and D Paris

Providing seamless connectivity for wireless virtual reality (VR) users has emerged as a key challenge for future cloud-enabled cellular networks. In this paper, the problem of wireless VR resource management is investigated for a wireless VR network in which VR contents are sent by a cloud to cellular small base stations (SBSs). The SBSs collect tracking data from the VR users over the uplink in order to generate the VR content and transmit it to the end-users over downlink cellular links. For this model, the data requested or transmitted by the users can exhibit correlation, since the VR users may engage in the same immersive virtual environment with different locations and orientations. As such, the proposed resource management framework can factor in such spatial data correlation, so as to better manage uplink and downlink traffic. This potential spatial data correlation can be factored into the resource allocation problem to reduce the traffic load in both uplink and downlink. In the downlink, the cloud can transmit 360° contents or specific visible contents (e.g., user field of view) that are extracted from the original 360° contents to the users according to the users' data correlation so as to reduce the backhaul traffic load. In the uplink, each SBS can associate with the users that have similar tracking information so as to reduce the tracking data size. This data correlation-aware resource management problem is formulated as an optimization problem whose goal is to maximize the users' successful transmission probability, defined as the probability that the content transmission delay of each user satisfies an instantaneous VR delay target. To solve this problem, a machine learning algorithm that uses echo state networks (ESNs) with transfer learning is introduced. By smartly transferring information on the SBS's utility, the proposed transfer-based ESN algorithm can quickly cope with changes in the wireless networking environment due to users' conten

Keywords: Virtual reality

3-D projective moment invariants

Journal of Information and Computational Science, 2007, vol. 4, no. 2, pp. 821-828

Authors: Xu, Dong Li, Hua Key Laboratory of Intelligent Information Processing Institute of Computing Technology Chinese Acad. of Sci. Beijing 100080 China Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Acad. of Sci. Beijing 100080 China National Research Center for Intelligent Computing Systems Institute of Computing Technology Chinese Acad. of Sci. Beijing 100080 China Graduate University Chinese Acad. of Sci. Beijing 100080 China

2-D projective moment invariants were first proposed by Suk and Flusser in [12]. We point out here that their paper contains a useless projective moment invariant which is identically zero. 3-D projective moment invariants are generated theoretically by investigating the properties of the signed volume of a tetrahedron. The main part is the selection of permutation-invariant cores for multiple integrals to generate independent and nonzero 3-D projective moment invariants. We conclude that, strictly speaking, projective moment invariants do not exist because of their convergence problem.
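The building block named in this abstract, the signed volume of a tetrahedron, has a standard closed form: one sixth of the determinant of the three edge vectors from one vertex. The sketch below computes only that quantity; how the paper combines such volumes into invariant cores is not described in the abstract and is not attempted here. The function name is hypothetical.

```python
def signed_volume(a, b, c, d):
    """Signed volume of the tetrahedron with vertices a, b, c, d.

    Computed as det([b - a, c - a, d - a]) / 6. The sign flips when
    two vertices are swapped, and the value is 0 for coplanar points.
    """
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    w = [d[i] - a[i] for i in range(3)]
    # Cofactor expansion of the 3x3 determinant along the first row.
    det = (
        u[0] * (v[1] * w[2] - v[2] * w[1])
        - u[1] * (v[0] * w[2] - v[2] * w[0])
        + u[2] * (v[0] * w[1] - v[1] * w[0])
    )
    return det / 6.0


# Unit tetrahedron at the origin.
origin, ex, ey, ez = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)
vol = signed_volume(origin, ex, ey, ez)
```

Swapping any two vertices, for example `signed_volume(origin, ey, ex, ez)`, negates the result, which is why the orientation of the point tuple matters.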

Keywords: Computer vision

Dependability evaluation system based on microprocessor function model

Jisuanji Xuebao/Chinese Journal of Computers, 2008, vol. 31, no. 3, pp. 391-399

Authors: Zhang, Shi-Jian Xu, Tong Zhang, Long-Bing Hu, Wei-Wu Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing 100080 China Graduate University Chinese Academy of Sciences Beijing 100039 China

With the widespread adoption of embedded microprocessor-based systems in safety-critical applications such as aircraft, spaceships and nuclear power plants, how to evaluate their fault-tolerant mechanisms rapidly, conveniently and at low cost is an important problem. The traditional method requires a detailed hardware protocol to perform the evaluation, which lengthens the evaluation period and increases the cost. A new dependability evaluation technique based on a microprocessor function model is proposed, which can evaluate fault-tolerant mechanisms more rapidly, more conveniently and more economically than conventional systems. As a case study, the new system evaluates three fault-tolerant techniques: the software redundancy technique, the assertion validation technique, and the instruction re-fetching and re-execution technique. The results show that the evaluation is reasonable.

Keywords: Fault tolerance

Cache adaptive write allocate policy

Jisuanji Yanjiu yu Fazhan/Computer Research and Development, 2007, vol. 44, no. 2, pp. 348-354

Authors: Huan, Dandan Li, Zusong Hu, Weiwu Liu, Zhiyong Key Laboratory of Computer System and Architecture Institute of Computing Technology Chinese Academy of Sciences Beijing 100080 China Graduate University Chinese Academy of Sciences Beijing 100049 China

Bandwidth has become the major bottleneck to performance improvement in modern microprocessors. By investigating cache store misses, a cache adaptive write allocate policy is proposed that significantly improves microprocessor bandwidth. The policy collects fully modified blocks in the miss queue. Fully modified blocks are written to lower level memory under a non-write allocate policy, which can switch to a write allocate policy adaptively. Compared with other cache store miss policies, the cache adaptive write allocate policy avoids unnecessary memory traffic, reduces cache pollution, and decreases the load and store queue full rate without increasing hardware overhead. Experimental results indicate that the policy improves memory bandwidth in the STREAM benchmarks by 62.6% on average. The performance of the SPEC CPU 2000 benchmarks is also improved: the average IPC speedup is 5.9%.
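The traffic saving described above can be pictured with a toy accounting model. The sketch assumes that a store miss under write allocate costs two block transfers (a fetch on the miss plus an eventual write-back), while a fully modified block written under non-write allocate costs one. The policy names, the function, and the per-block accounting are illustrative assumptions, not the paper's hardware design.

```python
def block_transfers(offsets_written, block_size, policy):
    """Count memory block transfers caused by one store-miss block.

    offsets_written is the set of byte offsets the stores covered.
    Toy model only: write allocate always fetches then writes back (2);
    the adaptive policy skips the fetch when every byte was overwritten (1)
    and falls back to write allocate otherwise (2).
    """
    fully_modified = offsets_written == set(range(block_size))
    if policy == "write_allocate":
        return 2
    if policy == "adaptive":
        return 1 if fully_modified else 2
    raise ValueError(f"unknown policy: {policy}")


# A 64-byte block overwritten completely, and one touched only partially.
full = set(range(64))
partial = {0, 1, 2, 3}
saved = block_transfers(full, 64, "write_allocate") - block_transfers(full, 64, "adaptive")
```

In this model the saving appears only for fully overwritten blocks, which matches the abstract's emphasis on collecting fully modified blocks before deciding how to write them.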

Keywords: Microprocessor chips
