Speech recognition is becoming prevalent in daily life. However, due to the similar semantic context of the entities and the overlap of Chinese pronunciation, the pronoun homophone, especially "他/她/它 (he/she/i...
Die-stacked dynamic random access memory (DRAM) caches are increasingly advocated to bridge the performance gap between the on-chip cache and the main *** fully realize their potential, it is essential to improve DRAM cache hit rate and lower its cache hit *** order to take advantage of the high hit rate of set-association and the low hit latency of direct-mapping at the same time, we propose a partial direct-mapped die-stacked DRAM cache called *** design is motivated by a key observation, i.e., applying a unified mapping policy to different types of blocks cannot achieve a high cache hit rate and low hit latency *** address this problem, P3DC classifies data blocks into leading blocks and following blocks, and places them at static positions and dynamic positions, respectively, in a unified set-associative *** also propose a replacement policy to balance the miss penalty and the temporal locality of different *** addition, P3DC provides a policy to mitigate cache thrashing due to block type *** results demonstrate that P3DC can reduce the cache hit latency by 20.5% while achieving a similar cache hit rate compared with typical set-associative caches. P3DC improves the instructions per cycle (IPC) by up to 66% (12% on average) compared with the state-of-the-art direct-mapped cache, BEAR, and by up to 19% (6% on average) compared with the tag-data decoupled set-associative cache, DEC-A8.
Entity linking refers to linking a string in a text to corresponding entities in a knowledge base through candidate entity generation and candidate entity *** is of great significance to some NLP (natural language processing) tasks, such as question *** English entity linking, Chinese entity linking requires more consideration due to the lack of spacing and capitalization in text sequences and the ambiguity of characters and words, which is more evident in certain *** Chinese domains, such as industry, the generated candidate entities are usually composed of long strings and are heavily *** addition, the meanings of the words that make up industrial entities are sometimes *** semantic space is a subspace of the general word embedding space, and thus each entity word needs to get its exact ***, we propose two schemes to achieve better Chinese entity ***, we implement an n-gram-based candidate entity generation method to increase the recall rate and reduce the nesting ***, we enhance the corresponding candidate entity ranking mechanism by introducing sense *** the contradiction between the ambiguity of word vectors and the single sense of the industrial domain, we design a sense embedding model based on graph clustering, which adopts an unsupervised approach for word sense induction and learns sense representation in conjunction with *** test the embedding quality of our approach on classical datasets and demonstrate its disambiguation ability in general *** confirm that our method can better learn candidate entities' fundamental laws in the industrial domain and achieve better performance on entity linking through experiments.
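As a rough illustration of n-gram-based candidate entity generation on unsegmented Chinese text, the sketch below enumerates character n-grams, keeps those that appear in a dictionary of entity names, and drops any match nested inside a longer match. The toy knowledge base, the function names, and the longest-match rule are hypothetical stand-ins; the abstract does not specify the authors' actual procedure.

```python
def char_ngrams(text, max_n):
    """Yield (start, end, substring) for every character n-gram up to max_n."""
    for n in range(1, max_n + 1):
        for start in range(len(text) - n + 1):
            yield start, start + n, text[start:start + n]


def generate_candidates(text, kb_names, max_n=8):
    """Return KB names found in text, dropping matches nested in longer ones."""
    matches = [(s, e, g) for s, e, g in char_ngrams(text, max_n) if g in kb_names]
    # Visit longest matches first so shorter ones can be tested against them.
    matches.sort(key=lambda m: (-(m[1] - m[0]), m[0]))
    kept = []
    for s, e, g in matches:
        nested = any(ks <= s and e <= ke for ks, ke, _ in kept)
        if not nested:
            kept.append((s, e, g))
    return sorted(kept)


# Hypothetical toy knowledge base of industrial entity names (an assumption).
KB = {"数控", "机床", "数控机床", "主轴", "轴承", "主轴轴承"}
print(generate_candidates("数控机床的主轴轴承损坏", KB))
```

Enumerating every n-gram favors recall, since no word segmenter has to guess boundaries first; the nesting filter is one simple way to keep the candidate list short, at the cost of discarding shorter entities that might be the intended mention.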
Data race is one of the most important concurrent anomalies in multi-threaded *** constraint-based techniques are leveraged into race detection, which is able to find all the races that can be found by any other sound race ***, this constraint-based approach has serious limitations on helping programmers analyze and understand data ***, it may report a large number of false positives due to the unrecognized dataflow propagation of the ***, it recommends a wide range of thread context switches to schedule the reported race (including the false one) whenever this race is exposed during the constraint-solving *** ad hoc recommendation imposes too many context switches, which complicates the data race *** address these two limitations in the state-of-the-art constraint-based race detection, this paper proposes DFTracker, an improved constraint-based race detector to recommend each data race with minimal thread context ***, we reduce the false positives by analyzing and tracking the dataflow in the *** this means, DFTracker thus reduces the unnecessary analysis of false race *** further propose a novel algorithm to recommend an effective race schedule with minimal thread context switches for each data *** experimental results on the real applications demonstrate that 1) without removing any true data race, DFTracker effectively prunes false positives by 68% in comparison with the state-of-the-art constraint-based race detector; 2) DFTracker recommends as low as 2.6-8.3 (4.7 on average) thread context switches per data race in the real world, which is 81.6% fewer context switches per data race than the state-of-the-art constraint-based race ***, DFTracker can be used as an effective tool to understand the data race for programmers.
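To make the "thread context switches per data race" metric concrete, the sketch below counts switches in a schedule written as a list of (thread, operation) events and compares two schedules that expose the same hypothetical race on `x`. The event format and both schedules are assumptions made for illustration; this is not DFTracker's representation or its scheduling algorithm.

```python
def count_context_switches(schedule):
    """Count adjacent event pairs that are executed by different threads."""
    return sum(1 for prev, cur in zip(schedule, schedule[1:]) if prev[0] != cur[0])


# Hypothetical events. Both schedules keep each thread's program order and
# place T1's write to x immediately before T2's read of x.
fragmented = [
    ("T1", "lock m"), ("T2", "y = 1"), ("T1", "unlock m"),
    ("T2", "z = 2"), ("T1", "write x"), ("T2", "read x"),
]
compact = [
    ("T2", "y = 1"), ("T2", "z = 2"), ("T1", "lock m"),
    ("T1", "unlock m"), ("T1", "write x"), ("T2", "read x"),
]

print(count_context_switches(fragmented))  # 5
print(count_context_switches(compact))     # 2
```

The second schedule reaches the same racing pair with fewer interleavings to follow, which is the sense in which a schedule with fewer context switches is easier for a programmer to read.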
Authors:
Zhong, Wenjie; Sun, Tao; Zhou, Jian-Tao; Wang, Zhuowei; Song, Xiaoyu
Inner Mongolia University, College of Computer Science, the Engineering Research Center of Ecological Big Data (Ministry of Education), the Inner Mongolia Engineering Laboratory for Cloud Computing and Service Software, the Inner Mongolia Engineering Laboratory for Big Data Analysis Technology, Hohhot 010000, China
Guangdong University of Technology, School of Computer Science and Technology, Guangzhou 510006, China
Portland State University, Department of Electrical and Computer Engineering, Portland, OR 97207, United States
Colored Petri nets (CPNs) provide descriptions of the concurrent behaviors for software and hardware. Model checking based on CPNs is an effective method to simulate and verify the concurrent behavior in system design...
Insect fine-grained image classification is an application scenario in fine-grained image classification. It not only has the characteristics of small inter-class differences and large intra-class differences, but als...
Graph processing has been widely used in many scenarios, from scientific computing to artificial *** processing exhibits irregular computational parallelism and random memory accesses, unlike traditional ***, running graph processing workloads on conventional architectures (e.g., CPUs and GPUs) often shows a significantly low compute-memory ratio with few performance benefits, which can be, in many cases, even slower than a specialized single-thread graph *** domain-specific hardware designs are essential for graph processing, it is still challenging to transform the hardware capability to performance boost without coupled software *** article presents a graph processing ecosystem from hardware to *** start by introducing a series of hardware accelerators as the foundation of this ***, the codesigned parallel graph systems and their distributed techniques are presented to support graph ***, we introduce our efforts on novel graph applications and hardware *** results show that various graph applications can be efficiently accelerated in this graph processing ecosystem.
Graph neural networks (GNNs) have gained traction and have been applied to various graph-based data analysis tasks due to their high ***, a major concern is their robustness, particularly when faced with graph data that has been deliberately or accidentally polluted with *** presents a challenge in learning robust GNNs under noisy *** address this issue, we propose a novel framework called Soft-GNN, which mitigates the influence of label noise by adapting the data utilized in *** approach employs a dynamic data utilization strategy that estimates adaptive weights based on prediction deviation, local deviation, and global *** better utilizing significant training samples and reducing the impact of label noise through dynamic data selection, GNNs are trained to be more *** evaluate the performance, robustness, generality, and complexity of our model on five real-world datasets, and our experimental results demonstrate the superiority of our approach over existing methods.
Computer vision (CV) algorithms have been extensively used for a myriad of applications *** the multimedia data are generally well-formatted and regular, it is beneficial to leverage the massive parallel processing power of the underlying platform to improve the performance of CV *** Instruction Multiple Data (SIMD) instructions, capable of conducting the same operation on multiple data items in a single instruction, are extensively employed to improve the efficiency of CV *** this paper, we evaluate the power and effectiveness of the RISC-V vector extension (RV-V) on typical CV algorithms, such as Gray Scale, Mean Filter, and Edge *** our examinations, we show that compared with the baseline OpenCV implementation using scalar instructions, the equivalent implementations using RV-V (version 0.8) can reduce the instruction count of the same CV algorithm by up to 24x when processing the same input ***, the actual performance improvement measured by cycle counts is closely tied to the specific implementation of the underlying RV-V *** our evaluation, by using the vector co-processor (with eight execution lanes) of the Xuantie C906, vector-version CV algorithms exhibit, on average, up to 2.98x performance speedup compared with their scalar counterparts.
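The idea of one instruction operating on several data items can be sketched with a toy model of grayscale conversion: a scalar loop issues one modeled operation per pixel, while a strip-mined loop issues one per strip of up to eight pixels. This is plain Python, not RV-V code; the fixed-point luma weights (77, 150, 29) and the one-operation-per-strip counting are assumptions chosen for illustration, and only the lane count of eight comes from the abstract.

```python
LANES = 8  # eight execution lanes, as in the abstract; the rest is a toy model


def gray(r, g, b):
    """Fixed-point luma approximation; the weights are assumed and sum to 256."""
    return (77 * r + 150 * g + 29 * b) >> 8


def grayscale_scalar(pixels):
    """Convert pixel by pixel; count one modeled operation per pixel."""
    out = [gray(r, g, b) for r, g, b in pixels]
    return out, len(pixels)


def grayscale_strips(pixels, lanes=LANES):
    """Convert in strips of up to `lanes` pixels; count one operation per strip."""
    out, ops = [], 0
    for start in range(0, len(pixels), lanes):
        strip = pixels[start:start + lanes]  # the final strip may be shorter
        out.extend(gray(r, g, b) for r, g, b in strip)
        ops += 1
    return out, ops


pixels = [(i % 256, (2 * i) % 256, (3 * i) % 256) for i in range(20)]
scalar_out, scalar_ops = grayscale_scalar(pixels)
strip_out, strip_ops = grayscale_strips(pixels)
print(scalar_out == strip_out, scalar_ops, strip_ops)  # True 20 3
```

The model only shows why the operation count drops when work is grouped into strips. Real instruction and cycle counts depend on the vector hardware and its configuration, which this sketch does not attempt to predict.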
The key-value separation is renowned for its significant mitigation of the write amplification inherent in traditional LSM trees. However, KV separation potentially increases performance overhead in the management of ...