ISBN: 9798350317152 (digital)
ISBN: 9798350317169 (print)
Effective management of trajectory data relies heavily on fundamental spatio-temporal queries. The surge in trajectory data, with its dynamic spatio-temporal properties, poses notable management challenges. Existing systems provide neither fine-grained trajectory representations nor an efficient architecture for processing queries, which leads to significant computational overhead. This paper introduces TMan to address these challenges. First, TMan presents two innovative index structures that precisely capture the spatio-temporal characteristics of trajectory data. Compared to the state-of-the-art indexes, our indexes for temporal range and spatial range queries can reduce the number of retrievals by up to 77% and 83%, respectively. Next, TMan devises concise and effective encoding methods for these indexes. Leveraging these indexes, TMan provides a distributed storage structure and an index caching mechanism for efficiently managing trajectories in key-value data stores. Moreover, TMan introduces a parallel query processing approach incorporating a push-down strategy to enhance the efficiency of fundamental queries. Extensive experimental results demonstrate that TMan's index structures and architecture outperform the baselines.
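The abstract does not describe TMan's index encodings, so the following is only a generic illustration of how a spatio-temporal index can be flattened into sortable keys for a key-value store. It uses a plain Z-order (Morton) bit interleaving, which is a stand-in technique and not TMan's method; the names `interleave_bits`, `row_key`, `time_bin`, `cell_x`, and `cell_y` are hypothetical.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Z-order (Morton) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)       # x fills the even bit positions
        code |= ((y >> i) & 1) << (2 * i + 1)   # y fills the odd bit positions
    return code


def row_key(time_bin: int, cell_x: int, cell_y: int) -> str:
    """Build a sortable key: zero-padded time bin first, then the spatial cell code.

    Assumption: keys are compared lexicographically, as in many key-value stores,
    so zero padding keeps numeric order and string order aligned.
    """
    return f"{time_bin:010d}:{interleave_bits(cell_x, cell_y):010d}"


if __name__ == "__main__":
    print(interleave_bits(3, 5))
    print(row_key(7, 3, 5))
```

With keys of this shape, a temporal range query becomes a scan over a contiguous key range, and cells that are close in space tend to receive nearby codes within each time bin.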
ISBN: 9798350386660 (digital)
ISBN: 9798350386677 (print)
The Super-Resolution (SR) task is to generate high-resolution (HR) images or videos from low-resolution (LR) ones. At present, Single Image Super-Resolution (SISR) methods have achieved superior performance. However, Video Super-Resolution (VSR) methods still suffer from edge blurring, missing high-frequency details, and motion artifacts. This paper proposes HOF-HRPN, an end-to-end HR feature projection VSR network that consists of a High-Resolution Optical Flow (HOF) network and a High-Resolution Projection Network (HRPN). The HOF network estimates the HR optical flow and uses it to compensate the neighboring frames, achieving accurate frame alignment. The HRPN is composed of a multi-frame feature projection channel and a single-frame SR channel in parallel. It exploits the complementary strengths of single-frame SR and VSR, which recover missing high-frequency details from intra-frame and inter-frame information, respectively, and fuses the high-frequency details obtained from multi-scale LR projection learning with the single-frame SR results. Extensive comparative experiments on public datasets verify that HOF-HRPN is robust and can recover accurate pixel values, clear edges, and rich textures.
ISBN: 9798350359312 (digital)
ISBN: 9798350359329 (print)
Data compression plays a key role in efficient data storage, transmission, and processing. With the fast development of deep learning techniques, deep neural networks have been used in this field to achieve a higher compression rate. Deep learning-based general-purpose lossless compression techniques are formulated as an autoregressive sequential prediction problem. These methods are state-of-the-art in terms of compression ratio but not practical due to runtime and resource constraints. Recent advances in lossless image compression using non-autoregressive methods for probability modeling prove to be a faster and more practical approach. In this paper, we propose ByteZip, a lossless compression method based on the non-autoregressive approach for known or defined structured byte streams. ByteZip involves hierarchical probabilistic modeling using autoencoders and density mixture models. This approach reduces the overhead of sequential processing. The goal is to design a practical lossless compressor with faster compression and decompression along with a competitive compression ratio. Experiments show that the proposed approach achieves a 64× higher compression speed than the state-of-the-art transformer-based model TRACE with an overhead of only 5% less size reduction on average. Our approach outperforms general-purpose compressors such as Gzip (23% more size reduction on average) and 7z (16% more size reduction on average).
ISBN: 9798350362480 (digital)
ISBN: 9798350362497 (print)
k-anonymity [1], [2] aims to ensure that an individual's data cannot be distinguished from that of at least (k−1) others in the same database, regardless of additional knowledge. However, this process involves modifying data, resulting in information loss (IL). Finding the optimal solution that minimizes this loss is NP-hard [3], [4], which has prompted the development of heuristics. Although some of these heuristics run in quadratic time, they remain impractical for large databases, leading to proposals for distributed memory environments. In these environments, some methods use horizontal partitioning of the database to reduce execution time, with each processor handling multiple records simultaneously. However, as the number of processors increases, the size of the subset handled by each processor decreases, which increases information loss, especially as k grows. This paper addresses the problem of minimizing information loss when anonymizing large databases. We propose an approach that exploits parallelism for cluster computing using horizontal partitioning with overlaps (i.e., partitions share common rows) to reduce the information loss incurred when anonymizing databases. After anonymization, the anonymous subsets are aggregated into a global anonymized database by removing duplicate records. Our proposed algorithm employs parallel hierarchical aggregation that chooses the better version of an anonymized record among the different versions produced where partitions overlap. Experimental results show that our approach is approximately 80× faster and incurs less information loss than the centralized GkAA [5], [6] algorithm.
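The two basic notions in this abstract, the k-anonymity property and horizontal partitioning with overlaps, can be illustrated with a minimal sketch. This is not the paper's algorithm: the function names, the choice of contiguous chunks, and the `overlap` parameter are assumptions made for illustration only.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values occurs at least k times."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())


def overlapping_partitions(records, parts, overlap):
    """Split records into contiguous chunks; each chunk also takes `overlap` rows
    from the start of the next chunk, so neighbouring partitions share rows.
    (Hypothetical layout; the paper's partitioning scheme may differ.)
    """
    size = -(-len(records) // parts)  # ceiling division
    return [records[i * size:(i + 1) * size + overlap] for i in range(parts)]


if __name__ == "__main__":
    rows = [
        {"age": "30-39", "zip": "130**", "disease": "flu"},
        {"age": "30-39", "zip": "130**", "disease": "asthma"},
        {"age": "40-49", "zip": "148**", "disease": "flu"},
        {"age": "40-49", "zip": "148**", "disease": "cancer"},
    ]
    print(is_k_anonymous(rows, ["age", "zip"], 2))
    print(overlapping_partitions(list(range(6)), 2, 1))
```

A row that falls in the overlap is anonymized once per partition that contains it, which is why an aggregation step has to pick one version and drop the duplicates.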
ISBN: 9789811945465; 9789811945458 (print)
To meet the intelligence requirements of industrial systems and assist in the automatic recognition of cutter wear, this paper proposes an image-based automatic detection method for cutter ring edge wear on shield machines. The method has four main steps: (1) the original cutter images are preprocessed by graying and thresholding, using gray-level characteristics to suppress the background and produce an image with only two gray values; (2) the cutter ring edge clusters are refined with the DBSCAN clustering algorithm, and the edge pixel clusters are retained; (3) a ring edge extraction method based on structural constraints is proposed, in which interior pixels are removed by orthogonal bidirectional projection to obtain a preliminary edge extraction result; (4) the circular edge of the cutter image is obtained by polynomial fitting of the remaining pixels in polar coordinates. Finally, measured against the actual size of the cutter, the radius error is less than 3%. The experimental results show that this method can automatically and accurately detect the actual cutter wear of a shield machine, providing an effective solution for the intelligent detection of cutter wear.
ISBN: 9781450388634 (print)
Deep learning is now widely used to solve natural language processing (NLP) problems. Embedding matrices are commonly used in deep NLP models for automatic feature learning. However, the sparsity of embedding matrices makes it challenging to train NLP models efficiently with data parallelism. When training with synchronous optimization methods, aggregating sparse gradients incurs high communication cost and limits the scalability of distributed training. In this paper, we combine Model Average (MA) with synchronous optimization methods and propose HMA, a hybrid training method for deep NLP models. Furthermore, we implement HMA in the Horovod+TensorFlow training framework and conduct an experimental evaluation with representative NLP models. For NLP models with a large number of sparse parameters, HMA saves over 30% of wall-clock time compared with the state-of-the-art distributed training framework, while maintaining the same final training loss.
Most supercomputers adopt a data forwarding architecture to achieve storage scalability. However, this results in a significant reduction in single-process bandwidth compared to direct file system access. Moreover, because a majority of applications use only a single process for writing and reading data, the low single-process performance also adds time overhead for these applications. This paper proposes DFBUFFER, a user-space forwarding mechanism with two performance optimization methods: user-space multi-threaded request processing and a per-file data write buffer. The DFBUFFER client is embedded in the application as a library, which reduces software overhead, and the server implements multi-threaded I/O request processing to improve bandwidth efficiency. The data write buffer handles write requests asynchronously, which increases the write bandwidth of compute nodes. We evaluate DFBUFFER on the Sunway exascale prototype system. The results indicate that in the regular mode of DFBUFFER, both write and read latency are reduced, and the single-process write bandwidth and large-block read bandwidth are increased by 1.8 times and 2.8 times, respectively. The DFBUFFER buffer mode increases the write bandwidth of a single process by 0.8 times over the regular mode. Although the performance advantage of the regular mode gradually weakens as the number of concurrent processes increases, the buffer mode still improves write bandwidth: for an application with 64 I/O processes, it is increased by 0.2 times.
As the number of high-rise buildings grows, cleaning exterior glass at height remains extremely difficult, and manual cleaning carries high safety risks and low efficiency. The trend is to use automated products such as cleaning robots to take on some of this difficult work. Path planning and image recognition are key issues for glass cleaning robots to achieve efficient cleaning. This paper proposes a path planning and image recognition method for glass cleaning robots based on hybrid path planning and convolutional neural networks. The cleaning robot adopts a hybrid of contour-parallel and method-parallel paths to achieve full-coverage path planning, uses a portable camera to photograph the glass, and uses a convolutional neural network model to determine how dirty the glass is. The result serves as the basis for selecting different cleaning processes, effectively improving cleaning efficiency and resource utilisation. By combining these two methods, we design an intelligent control algorithm for a glass cleaning robot that brings together artificial intelligence and path planning generation borrowed from 3D printing, which helps to improve the cleaning effect of the robot.
ISBN: 9781713871088 (print)
The Federated Averaging (FedAvg) algorithm, which alternates between a few local stochastic gradient updates at client nodes and a model averaging update at the server, is perhaps the most commonly used method in Federated Learning. Notwithstanding its simplicity, several empirical studies have illustrated that the model output by FedAvg generalizes well to new unseen tasks after a few fine-tuning steps. This surprising performance of such a simple method, however, is not fully understood from a theoretical point of view. In this paper, we formally investigate this phenomenon in the multi-task linear regression setting. We show that the reason behind the generalizability of the FedAvg output is FedAvg's power in learning the common data representation among the clients' tasks, by leveraging the diversity among client data distributions via multiple local updates between communication rounds. We formally establish the iteration complexity required by the clients for proving such a result in the setting where the underlying shared representation is a linear map. To the best of our knowledge, this is the first result showing that FedAvg learns an expressive representation in any setting. Moreover, we show that multiple local updates between communication rounds are necessary for representation learning, as distributed gradient methods that make only one local update between rounds provably cannot recover the ground-truth representation in the linear setting, and empirically yield neural network representations that generalize drastically worse to new clients than those learned by FedAvg trained on heterogeneous image classification datasets.
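The alternation described in the first sentence (a few local gradient updates per client, then averaging at the server) can be sketched on a toy problem. The example below is an illustration under simplifying assumptions and not the paper's experimental setup: the model is a single scalar weight, the local updates use full-batch gradients instead of stochastic ones so the run is deterministic, and all names, data, and hyperparameters are made up.

```python
def local_update(w, data, lr, steps):
    """Run `steps` gradient steps on one client's squared error for the model y = w * x."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fedavg(clients, rounds=20, local_steps=5, lr=0.05, w0=0.0):
    """Alternate local updates on every client with a plain average at the server."""
    w = w0
    for _ in range(rounds):
        local_models = [local_update(w, data, lr, local_steps) for data in clients]
        w = sum(local_models) / len(local_models)  # server-side model averaging
    return w


# Two toy clients with different inputs but the same underlying weight (assumed data).
TRUE_W = 3.0
clients = [
    [(1.0, TRUE_W * 1.0), (2.0, TRUE_W * 2.0)],
    [(0.5, TRUE_W * 0.5), (1.5, TRUE_W * 1.5)],
]

if __name__ == "__main__":
    print(fedavg(clients))  # approaches TRUE_W
```

Setting `local_steps=1` turns the loop into a distributed gradient method with one local update per round, the baseline that the abstract contrasts with FedAvg. This scalar toy has no shared representation to learn, so it cannot reproduce that result.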
ISBN: 9798350367492 (digital)
ISBN: 9798350367508 (print)
The Memetic Algorithm (MA), introduced by Pablo Moscato in 1989, integrates Evolutionary Algorithms with local search methods, enhancing its effectiveness in solving complex optimization problems. This paper provides a comprehensive survey of MA research published in 2019, reviewing 75 selected papers from an initial pool of 112 identified through Google Scholar. The selected papers were categorized into five types: optimization problems (40 papers), image processing (10 papers), parallel processing (5 papers), gene/DNA datasets (4 papers), and other applications (16 papers). The survey highlights MA's versatility and effectiveness across various domains, particularly its potential for solving complex optimization problems. Key findings include the adaptability of MA to diverse applications, its ongoing relevance in addressing challenging issues, and promising opportunities for combining MA with other algorithms to enhance performance. The paper also emphasizes the significance of MA in fields such as image processing, where it improves pattern recognition and image enhancement, and in bioinformatics, where it optimizes gene selection and genetic algorithms. Despite the extensive study of MA, there remains a significant research gap in non-English literature, particularly in Bahasa, limiting accessibility for Indonesian researchers. This survey aims to bridge this gap by providing valuable insights and encouraging further exploration and application of MA to solve increasingly complex problems. It offers a comprehensive overview that underscores the importance of MA and its potential for future research and innovation.
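The combination named in the first sentence, an evolutionary loop with a local search step applied to individuals, can be sketched on the toy OneMax problem (maximize the number of 1 bits). This is a generic illustration and is not taken from any of the surveyed papers; the operators, parameters, and function names are assumptions. On OneMax the local search alone already reaches the optimum, so the sketch shows the structure of the algorithm, not its benefit.

```python
import random


def fitness(bits):
    """OneMax: count the 1 bits."""
    return sum(bits)


def local_search(bits):
    """One pass of bit-flip hill climbing: keep each flip only if it improves fitness."""
    best = list(bits)
    for i in range(len(best)):
        candidate = best[:]
        candidate[i] ^= 1
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


def memetic(n_bits=20, pop_size=10, generations=5, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    pop = [local_search(ind) for ind in pop]           # refine the initial population
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_bits)
            child = a[:cut] + b[cut:]                  # one-point crossover
            child[rng.randrange(n_bits)] ^= 1          # single-bit mutation
            children.append(local_search(child))       # the "memetic" refinement step
        pop = parents + children
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = memetic()
    print(fitness(best), "of", len(best))
```

Replacing `fitness` and `local_search` with problem-specific versions is the usual way such a skeleton is adapted to a real optimization task.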