ISBN (Print): 9781728108582
The Dirichlet process mixture model (DPMM), one of the nonparametric Bayesian mixture models, is receiving increasing attention from the statistical learning community and has demonstrated great potential in clustering analysis. Because computational complexity increases as the numbers of observations and features grow, serial DPMM algorithms require long processing times and cannot handle large volumes of data on a single machine. To improve computational efficiency, several parallel methods were proposed and implemented in the C++ and Julia programming languages by different authors, and made publicly available on GitHub or published as Julia packages for users to download. However, multi-core and multi-node scalability has not been thoroughly evaluated and compared across implementations, even among multiple implementations of the same proposed distributed DPMM method. We selected two recent Julia implementations of the parallel sampler via sub-cluster splits proposed by Chang and Fisher and performed a scalability comparison on supercomputer clusters. This paper presents insights on the applicability of both implementations as the number of dimensions of the feature space increases, and provides potential strategies for improving multi-node scalability.
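As a side note on the scalability comparison described above, the snippet below is a minimal Python sketch of how speedup and parallel efficiency can be derived from wall-clock timings collected at different core or node counts; the function name and the timing values are hypothetical placeholders, not measurements from either Julia implementation.

    # Hypothetical helper for multi-core/multi-node scalability analysis: given
    # wall-clock times measured at different worker counts, compute speedup and
    # parallel efficiency relative to the smallest configuration measured.
    def scaling_metrics(times_by_workers):
        """times_by_workers: dict mapping worker count -> wall-clock seconds."""
        base_workers = min(times_by_workers)
        base_time = times_by_workers[base_workers]
        metrics = {}
        for workers, t in sorted(times_by_workers.items()):
            speedup = base_time / t
            efficiency = speedup * base_workers / workers
            metrics[workers] = (speedup, efficiency)
        return metrics

    if __name__ == "__main__":
        # Placeholder timings for one sampler run at increasing core counts.
        example = {1: 960.0, 8: 140.0, 16: 82.0, 32: 55.0}
        for workers, (s, e) in scaling_metrics(example).items():
            print(f"{workers:>3} workers: speedup {s:5.2f}, efficiency {e:5.2f}")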
ISBN (Print): 9781728148038
Multi-view shape descriptors obtained from various 2D images are commonly adopted in 3D shape retrieval. One major challenge is that significant shape information is discarded during 2D view rendering through projection. In this paper, we propose a convolutional neural network based method, the Neighbor-Center Enhanced Network, to enhance each 2D view using its neighboring ones. By exploiting cross-view correlations, the Neighbor-Center Enhanced Network learns how adjacent views can be maximally incorporated into an enhanced 2D representation that effectively describes shapes. We observe that a very small number of enhanced 2D views, e.g., six, is already sufficient for panoramic shape description. Thus, by simply aggregating features from six enhanced 2D views, we arrive at a highly compact yet discriminative shape descriptor. The proposed shape descriptor significantly outperforms state-of-the-art 3D shape retrieval methods on the ModelNet and ShapeNet-Core55 benchmarks, and also exhibits robustness against object occlusion.
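The following is a minimal PyTorch sketch of the general idea of enhancing a center view with its neighboring views and pooling a handful of enhanced views into one compact descriptor. The fusion layer (concatenating the center feature with the mean neighbor feature, followed by a small MLP) and all dimensions are illustrative assumptions, not the authors' Neighbor-Center Enhanced Network.

    import torch
    import torch.nn as nn

    class NeighborFusion(nn.Module):
        """Illustrative fusion of a center-view feature with its neighbor features."""
        def __init__(self, feat_dim=512):
            super().__init__()
            self.fuse = nn.Sequential(
                nn.Linear(2 * feat_dim, feat_dim),
                nn.ReLU(inplace=True),
            )

        def forward(self, center, neighbors):
            # center: (B, D); neighbors: (B, K, D) features of adjacent views.
            context = neighbors.mean(dim=1)
            return self.fuse(torch.cat([center, context], dim=-1))

    B, V, D = 4, 6, 512                       # batch, number of views, feature dim
    views = torch.randn(B, V, D)              # per-view features from a 2D CNN backbone
    fusion = NeighborFusion(D)
    enhanced = []
    for i in range(V):
        neighbors = torch.cat([views[:, :i], views[:, i + 1:]], dim=1)
        enhanced.append(fusion(views[:, i], neighbors))
    descriptor = torch.stack(enhanced, dim=1).max(dim=1).values  # (B, D) shape descriptor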
As big data, medical digitalization, and wearable devices continue to evolve, these technologies are driving the advancement of clinical medicine, genomics, and wearable health while also posing a risk of privacy brea...
Web-based 3D medical data computing and visual synchronization can provide clinical users with critical information for distributed diagnosis and treatment. However, due to the limited internet bandwidth, high computa...
ISBN (Print): 9781538678848
An energy-efficient memory-centric convolutional neural network (CNN) processor architecture is proposed for smart devices such as wearable or internet of things (IoT) devices. To achieve energy-efficient processing, it has two key features. First, 1-D shift convolution PEs with a fully distributed memory architecture achieve 3.1 TOPS/W energy efficiency. Compared with a conventional architecture, even though it has 1024 massively parallel MAC units, it achieves high energy efficiency by scaling the voltage down to 0.46 V thanks to its fully locally routed design. Second, a fully configurable 2-D mesh core-to-core interconnection supports various input feature sizes to maximize utilization. The proposed architecture is evaluated on a 16 mm² chip fabricated in a 65 nm CMOS process, and it performs real-time face recognition with only 9.4 mW at 10 MHz and 0.48 V.
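To illustrate the dataflow behind 1-D shift convolution in software terms, the sketch below computes a 1-D convolution by shifting the input row and accumulating weighted copies, which mirrors the shift-and-accumulate operation a row of PEs can perform using only local data; this is a hedged functional model in Python/NumPy, not a description of the chip's actual hardware.

    import numpy as np

    def shift_conv_1d(row, taps):
        """Valid 1-D correlation of `row` with `taps` via shift-and-accumulate."""
        out_len = len(row) - len(taps) + 1
        acc = np.zeros(out_len)
        for k, w in enumerate(taps):
            acc += w * row[k:k + out_len]   # shift input by k, accumulate weighted copy
        return acc

    row = np.arange(8, dtype=float)
    taps = np.array([1.0, 0.0, -1.0])       # e.g., a simple horizontal edge filter
    print(shift_conv_1d(row, taps))
    print(np.convolve(row, taps[::-1], mode="valid"))  # matches the reference result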
Image segmentation plays a very important role in three-dimensional volume data processing. In three-dimensional segmentation, the curve evolution method based on the geometric active contour model and the level set method is...
In today's era of information explosion, efficient methods for massive data processing are becoming a research hotspot. MapReduce is a popular parallel programming model and is widely used in large-scale data parallel computing...
ISBN (Digital): 9781728163956
ISBN (Print): 9781728163963
Federated learning (FL) refers to the learning paradigm that trains machine learning models directly in decentralized systems of smart edge devices without transmitting the raw data, which avoids heavy communication costs and privacy concerns. Given the typically heterogeneous data distributions in such settings, the popular FL algorithm Federated Averaging (FedAvg) suffers from weight divergence and thus cannot achieve competitive performance for the global model (denoted as the initial performance in FL) compared to centralized methods. In this paper, we propose a local continual training strategy to address this problem. Importance weights are evaluated on a small proxy dataset on the central server and then used to constrain the local training. With this additional term, we alleviate weight divergence and continually integrate the knowledge from different local clients into the global model, which ensures better generalization ability. Experiments on various FL settings demonstrate that our method significantly improves the initial performance of federated models with few extra communication costs.
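A hedged sketch of the kind of importance-weighted constraint described above: the local objective adds a penalty that discourages parameters with high server-estimated importance from drifting away from the global model. The quadratic form, the coefficient `lam`, and the toy linear model are assumptions for illustration, not the paper's exact formulation.

    import torch

    def local_loss(task_loss, model, global_params, importance, lam=0.1):
        """task_loss: scalar tensor; global_params/importance: dicts keyed by parameter name."""
        penalty = 0.0
        for name, p in model.named_parameters():
            # Penalize drift from the global weights, scaled by per-parameter importance.
            penalty = penalty + (importance[name] * (p - global_params[name]) ** 2).sum()
        return task_loss + lam * penalty

    if __name__ == "__main__":
        model = torch.nn.Linear(4, 2)
        global_params = {n: p.detach().clone() for n, p in model.named_parameters()}
        importance = {n: torch.ones_like(p) for n, p in model.named_parameters()}
        x, y = torch.randn(8, 4), torch.randint(0, 2, (8,))
        task = torch.nn.functional.cross_entropy(model(x), y)
        loss = local_loss(task, model, global_params, importance)
        loss.backward()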
ISBN (Digital): 9781728190747
ISBN (Print): 9781728183824
Temperature is one of the major ecological factors that affect the safe storage of grain. In this paper, we propose a deep spatio-temporal attention model to predict stored grain temperature, which exploits the historical temperature data of stored grain and the meteorological data of the region. In the proposed model, we use the Sobel operator to extract local spatial factors, and leverage the attention mechanism to obtain the global spatial factors of the grain temperature data and temporal information. In addition, a convolutional neural network (CNN) is used to learn features of external meteorological factors. Finally, the spatial factors of the grain pile and the external meteorological factors are combined to predict future grain temperature using long short-term memory (LSTM) based encoder and decoder models. Experimental results show that the proposed model achieves higher prediction accuracy compared with traditional methods.
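As an illustration of the first step described above, the short sketch below applies the Sobel operator to a 2-D grain-temperature slice to obtain local spatial gradient features; the synthetic temperature grid is a placeholder, and the attention, meteorological CNN, and LSTM encoder-decoder stages of the full model are not reproduced here.

    import numpy as np
    from scipy import ndimage

    temperature_grid = 20.0 + np.random.rand(16, 16)     # placeholder grain-pile slice
    grad_x = ndimage.sobel(temperature_grid, axis=1)      # horizontal temperature gradient
    grad_y = ndimage.sobel(temperature_grid, axis=0)      # vertical temperature gradient
    local_features = np.stack([grad_x, grad_y], axis=0)   # (2, 16, 16) local spatial-factor map
    print(local_features.shape)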
ISBN (Print): 9781728138855
Convolutional neural networks (CNNs) have gained global recognition in advancing the field of artificial intelligence and have had great success in a wide array of applications including computer vision, speech, and natural language processing. However, due to the rise of big data and the increased complexity of tasks, the efficiency of training CNNs has been severely impacted. To achieve state-of-the-art results, CNNs require tens to hundreds of millions of parameters to be fine-tuned, resulting in extensive training time and high computational cost. To overcome these obstacles, this work takes advantage of distributed frameworks and cloud computing to develop a parallel CNN algorithm. A close examination of the implementation of MapReduce-based CNNs, as well as how the proposed algorithm accelerates learning, is discussed and demonstrated through experiments. Results reveal high classification accuracy and improvements in speedup, scaleup, and sizeup compared with the standard algorithm.
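As a toy illustration of the data-parallel idea behind MapReduce-based training, the sketch below has "mappers" compute gradients on disjoint data shards and a "reducer" average them before each global update; linear regression stands in for the CNN, and the shard layout and learning rate are illustrative assumptions rather than the paper's algorithm.

    import numpy as np

    def map_gradients(shard_x, shard_y, weights):
        """Mapper: mean-squared-error gradient computed on one data shard."""
        error = shard_x @ weights - shard_y
        return shard_x.T @ error / len(shard_y)

    def reduce_gradients(gradients):
        """Reducer: average the per-shard gradients into one global gradient."""
        return np.mean(gradients, axis=0)

    rng = np.random.default_rng(0)
    X, true_w = rng.normal(size=(1000, 5)), np.arange(1.0, 6.0)
    y = X @ true_w
    weights = np.zeros(5)
    shards = np.array_split(np.arange(1000), 4)           # four "mapper" partitions
    for _ in range(200):
        grads = [map_gradients(X[idx], y[idx], weights) for idx in shards]
        weights -= 0.1 * reduce_gradients(grads)
    print(np.round(weights, 2))                            # approaches [1. 2. 3. 4. 5.]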