Recent studies suggest that plant disease identification via computational approaches is vital for agricultural production. However, there is still a large gap that needs to be bridged while the training data is imbal...
详细信息
Adam has become one of the most favored optimizers in deep learning problems. Despite its success in practice, numerous mysteries persist regarding its theoretical understanding. In this paper, we study the implicit b...
Modern recommendation systems are widely used in modern data *** random and sparse embedding lookup operations are the main performance bottleneck for processing recommendation systems on traditional platforms as they...
详细信息
Modern recommendation systems are widely used in modern data *** random and sparse embedding lookup operations are the main performance bottleneck for processing recommendation systems on traditional platforms as they induce abundant data movements between computing units and ***-based processing-in-memory(PIM)can resolve this problem by processing embedding vectors where they are ***,the embedding table can easily exceed the capacity limit of a monolithic ReRAM-based PIM chip,which induces off-chip accesses that may offset the PIM ***,we deploy the decomposed model on-chip and leverage the high computing efficiency of ReRAM to compensate for the decompression performance *** this paper,we propose ARCHER,a ReRAM-based PIM architecture that implements fully yon-chip recommendations under resource ***,we make a full analysis of the computation pattern and access pattern on the decomposed *** on the computation pattern,we unify the operations of each layer of the decomposed model in multiply-and-accumulate *** on the access observation,we propose a hierarchical mapping schema and a specialized hardware design to maximize resource *** the unified computation and mapping strategy,we can coordinatethe inter-processing elements *** evaluation shows that ARCHER outperforms the state-of-the-art GPU-based DLRM system,the state-of-the-art near-memory processing recommendation system RecNMP,and the ReRAM-based recommendation accelerator REREC by 15.79×,2.21×,and 1.21× in terms of performance and 56.06×,6.45×,and 1.71× in terms of energy savings,respectively.
With the continuous development of the economic level of the society, the game industry is getting more and more extensive development among young people. As one of the mainstreams in the game industry, FPS games are ...
详细信息
The scheduling of final exams at a university is a problem which can be improved with artificial intelligence techniques. In this paper we explain and compare two algorithms used to solve the exam scheduling problem a...
详细信息
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and testing data by adapting a given model w.r.t. any testing sample. This task is particularly important when the test environ...
详细信息
Test-time adaptation (TTA) seeks to tackle potential distribution shifts between training and testing data by adapting a given model w.r.t. any testing sample. This task is particularly important when the test environment changes frequently. Although some recent attempts have been made to handle this task, we still face two key challenges: 1) prior methods have to perform backpropagation for each test sample, resulting in unbearable optimization costs to many applications;2) while existing TTA solutions can significantly improve the test performance on out-of-distribution data, they often suffer from severe performance degradation on in-distribution data after TTA (known as catastrophic forgetting). To this end, we have proposed an Efficient Anti-Forgetting Test-Time Adaptation (EATA) method which develops an active sample selection criterion to identify reliable and non-redundant samples for test-time entropy minimization. To alleviate forgetting, EATA introduces a Fisher regularizer estimated from test samples to constrain important model parameters from drastic changes. However, in EATA, the adopted entropy loss consistently assigns higher confidence to predictions even when the samples are underlying uncertain, leading to overconfident predictions that underestimate the data uncertainty. To tackle this, we further propose EATA with Calibration (EATA-C) to separately exploit the reducible model uncertainty and the inherent data uncertainty for calibrated TTA. Specifically, we compare the divergence between predictions from the full network and its sub-networks to measure the reducible model uncertainty, on which we propose a test-time uncertainty reduction strategy with divergence minimization loss to encourage consistent predictions instead of overconfident ones. To further re-calibrate predicting confidence on different samples, we utilize the disagreement among predicted labels as an indicator of the data uncertainty. Based on this, we devise a min-max entropy
This research investigates the application of deep learning techniques to sentiment analysis, a field focused on mining online platforms for subjective assessments. By combining deep neural networks (DNNs) with partic...
详细信息
Lightweight cryptography aims to design secure and efficient cryptographic algorithms for resource-constrained devices. Traditional cryptographic algorithms may not be readily usable in resource-constrained environmen...
详细信息
Transfer optimization enables data-efficient optimization of a target task by leveraging experiential priors from related source tasks. This is especially useful in multiobjective optimization settings where a set of ...
详细信息
Graphs that are used to model real-world entities with vertices and relationships among entities with edges,have proven to be a powerful tool for describing real-world problems in *** most real-world scenarios,entitie...
详细信息
Graphs that are used to model real-world entities with vertices and relationships among entities with edges,have proven to be a powerful tool for describing real-world problems in *** most real-world scenarios,entities and their relationships are subject to constant *** that record such changes are called dynamic *** recent years,the widespread application scenarios of dynamic graphs have stimulated extensive research on dynamic graph processing systems that continuously ingest graph updates and produce up-to-date graph analytics *** the scale of dynamic graphs becomes larger,higher performance requirements are demanded to dynamic graph processing *** the massive parallel processing power and high memory bandwidth,GPUs become mainstream vehicles to accelerate dynamic graph processing ***-based dynamic graph processing systems mainly address two challenges:maintaining the graph data when updates occur(i.e.,graph updating)and producing analytics results in time(i.e.,graph computing).In this paper,we survey GPU-based dynamic graph processing systems and review their methods on addressing both graph updating and graph *** comprehensively discuss existing dynamic graph processing systems on GPUs,we first introduce the terminologies of dynamic graph processing and then develop a taxonomy to describe the methods employed for graph updating and graph *** addition,we discuss the challenges and future research directions of dynamic graph processing on GPUs.
暂无评论