ISBN (digital): 9798331529246
ISBN (print): 9798331529253
With the rapid development of the tourism industry, traditional modes of tourism are undergoing a significant transformation, and online tourism is gradually becoming a new highlight of the market. However, in the face of big data, online tourism platforms come under tremendous pressure in storage, computation, and management, and information overload seriously degrades the user experience. There is therefore an urgent need for innovative platforms and recommendation algorithms to optimize the screening and recommendation of tourism information. This paper proposes an optimization scheme based on the Spark cloud computing platform to improve the efficiency of tourism data processing. First, Spark's distributed computing and memory management capabilities are used to store and process tourism data in a distributed manner, accelerating the recommendation algorithm. Second, a Python crawler collects real tourism data, including user ratings of and comments on attractions, to evaluate the effectiveness of the improved algorithm. Finally, a weighted algorithm combining the LDA topic model is proposed: it analyzes the topic distribution of comment texts, weights the comments by user ratings, and computes predicted scores for the target user's unrated attractions, thereby achieving accurate attraction recommendations. Experimental results show that the proposed algorithm demonstrates a significant advantage in prediction accuracy.
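The weighted prediction step described in the abstract can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code: the use of cosine similarity between LDA topic distributions and the fallback to the mean rating are assumptions made here for the sketch.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity between two topic distributions."""
    num = sum(a * b for a, b in zip(u, v))
    den = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def predict_score(target_topics, rated):
    """Predict a score for an unrated attraction as a similarity-weighted
    average of the user's ratings on attractions with known topic mixes.

    target_topics: LDA topic distribution of the unrated attraction's comments
    rated: list of (topic_distribution, rating) pairs for rated attractions
    """
    weighted = [(cosine(target_topics, topics), rating) for topics, rating in rated]
    total = sum(w for w, _ in weighted)
    if total == 0:
        # No topical overlap at all: fall back to the user's mean rating.
        return sum(r for _, r in rated) / len(rated)
    return sum(w * r for w, r in weighted) / total
```

A rating on a topically identical attraction dominates the prediction, while topically unrelated ratings contribute nothing.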
The industry and academia have proposed many distributed graph processing systems. However, the existing systems are not friendly enough for users like data analysts and algorithm engineers. On the one hand, the progr...
Outlier detection on data streams identifies unusual states to sense and alarm potential risks and faults of the target systems in both the cyber and the physical world. As different parameter settings of machine learning algorithms can result in dramatically different performance, automatic parameter selection is also of great importance when deploying outlier detection algorithms on data streams. However, current canonical parameter selection methods suffer from two key challenges: (i) data streams generally evolve over time, but these existing methods use a fixed training set, which fails to handle this evolving environment and often results in suboptimal parameter recommendations; (ii) the stream is infinite, so any parameter selection method taking the entire stream as input is infeasible. In light of these limitations, this paper introduces a Dynamic Parameter Selection method for outlier detection on data Streams (DPSS for short). DPSS uses Gaussian process regression to model the relationship between parameters and detection performance and uses Bayesian optimization to explore the optimal parameter setting. For each new subsequence, DPSS updates the recommended parameter setting to suit the evolving characteristics. Moreover, DPSS only uses historical calculations to guide the sampling of parameter settings and to adjust the Gaussian process regression results. DPSS can be employed as an auxiliary plug-in tool to improve the detection performance of outlier detection methods. Extensive experiments show that our method can significantly improve the F-score of outlier detectors on data streams compared to its counterparts and achieves better parameter selection performance than other state-of-the-art parameter selection approaches. DPSS also achieves better time and memory efficiency than its competitors.
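The per-subsequence selection loop can be sketched without the full machinery. In this dependency-free sketch, a kernel-weighted estimate over historical (parameter, F-score) pairs stands in for the Gaussian-process posterior, and an upper-confidence-bound rule stands in for the Bayesian-optimization acquisition; the bandwidth and beta values are illustrative assumptions, not DPSS's actual settings.

```python
import math

def kernel_estimate(x, history, bandwidth=0.5):
    """Kernel-weighted mean and uncertainty of past F-scores at parameter x,
    a stand-in for the GP posterior: sparse nearby history -> high uncertainty."""
    if not history:
        return 0.0, 1.0
    weights = [math.exp(-((x - xi) / bandwidth) ** 2) for xi, _ in history]
    total = sum(weights)
    if total < 1e-12:
        return 0.0, 1.0
    mean = sum(w * yi for w, (_, yi) in zip(weights, history)) / total
    uncertainty = 1.0 / (1.0 + total)
    return mean, uncertainty

def select_parameter(history, candidates, beta=1.0):
    """For a new subsequence, pick the candidate parameter maximising an
    upper-confidence-bound score (exploit high mean, explore high uncertainty)."""
    def ucb(x):
        mean, unc = kernel_estimate(x, history)
        return mean + beta * unc
    return max(candidates, key=ucb)
```

After detecting on the new subsequence, the observed (parameter, F-score) pair would be appended to `history`, so the recommendation tracks the evolving stream.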
An improved Vandermonde decoding PE for distributed storage systems is proposed in this paper. The PE first gives a decoding algorithm based on matrix partitioning and the Lagrange interpolation method to split the complex mat...
ISBN (print): 9783031125973; 9783031125966
Locality-sensitive hashing (LSH) is an established method for fast data indexing and approximate similarity search, with useful parallelism properties. Although indexes and similarity measures are key to data clustering, little has been investigated about the benefits of LSH for this problem. Our proposition is that LSH can be extremely beneficial for parallelizing high-dimensional density-based clustering, e.g. DBSCAN, a versatile method able to detect clusters of different shapes and sizes. We contribute to filling the gap between the advancements in LSH and density-based clustering. We show how approximate DBSCAN clustering can be fused into the process of creating an LSH index and, through parallelization and fine-grained synchronization, how the available computing capacity can be utilized efficiently. The resulting method, ***, can support a wide range of applications with diverse distance functions, data distributions, and dimensionalities. We analyse its properties and evaluate our prototype implementation on a 36-core machine with 2-way hyper-threading on massive datasets with various numbers of dimensions. Our results show that *** effectively complements established state-of-the-art methods, achieving speed-ups of up to several orders of magnitude on higher-dimensional datasets, with high clustering accuracy tunable through the LSH parameters.
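The core LSH idea, hashing points so that nearby points tend to land in the same bucket and density computations stay local, can be sketched with random hyperplanes. This is a generic sign-of-projection LSH for cosine similarity, used here only to illustrate the bucketing step; the paper's actual hash family and its fusion with DBSCAN are not reproduced.

```python
import random
from collections import defaultdict

def make_hyperplanes(dim, n_planes, seed=0):
    """Random Gaussian hyperplanes defining the hash family."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_planes)]

def lsh_key(point, planes):
    """One bit per hyperplane: the sign of the dot product. Points with high
    cosine similarity tend to fall on the same side of most hyperplanes."""
    return tuple(1 if sum(p * w for p, w in zip(point, h)) >= 0 else 0
                 for h in planes)

def bucketize(points, planes):
    """Group point indices by hash key; density-based clustering can then
    restrict neighbourhood searches to each bucket (and its neighbours)."""
    buckets = defaultdict(list)
    for i, pt in enumerate(points):
        buckets[lsh_key(pt, planes)].append(i)
    return buckets
```

Because each point hashes independently, the bucketing step parallelizes trivially across cores, which is the property the abstract exploits.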
ISBN (digital): 9798350318609
ISBN (print): 9798350318616
One of the biggest problems in computer vision is getting a machine to automatically describe an image's content with a natural-language phrase. Using deep learning models, this research demonstrates two separate methods for generating picture captions. The first method employs VGG16 for feature extraction, followed by training a Long Short-Term Memory (LSTM) model. In the second approach, ResNet50 is used for feature extraction, and a combined Convolutional Neural Network-Long Short-Term Memory (CNN-LSTM) model is trained for caption generation. Both VGG16 and ResNet50 are popular convolutional neural networks for extracting features from images, while LSTM and CNN-LSTM models are suited to sequential data processing. By comparing these approaches, we evaluate their effectiveness in generating descriptive captions for images. Experimental results indicate the strengths and weaknesses of each method, providing insight into the interplay between feature extraction and captioning models in image understanding tasks.
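Both pipelines share the same inference-time structure: extract image features once, then decode a caption word by word. The decoding loop can be sketched framework-free; `next_word_probs` is a hypothetical stand-in for the trained LSTM / CNN-LSTM head, and the start/end token names are assumptions of this sketch.

```python
def greedy_caption(features, next_word_probs, max_len=10):
    """Greedy decoding: start from the start token, repeatedly take the most
    probable next word given the image features and the words emitted so far,
    and stop at the end token.

    next_word_probs(features, words) -> {word: probability} stands in for
    the trained decoder head of either captioning pipeline.
    """
    words = ["<start>"]
    for _ in range(max_len):
        probs = next_word_probs(features, words)
        best = max(probs, key=probs.get)
        if best == "<end>":
            break
        words.append(best)
    return " ".join(words[1:])  # drop the start token
```

Swapping greedy argmax for beam search is the usual refinement; the feature extractor (VGG16 or ResNet50) only changes what `features` contains, not this loop.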
ISBN (digital): 9798331521349
ISBN (print): 9798331521356
Effective load balancing is essential for ensuring the efficient operation of multi-core systems, particularly in environments with diverse task priorities and deadlines. The primary objective is to intelligently distribute tasks among cores to optimize system responsiveness, prioritizing the timely completion of urgent tasks. Through rigorous computer simulations, this paper explores various load-balancing techniques tailored for multi-core systems operating in mixed real-time environments. Notably, the focus is on strategies applicable to such systems, excluding hardware-level optimizations and specific real-time scheduling algorithms. The paper introduces an algorithm aimed at enhancing both load balancing and response time. This involves leveraging the LBPSA (Load Balancing-based Partitioned Scheduling Algorithm) for load balancing and implementing the TBS (Total Bandwidth Server) migration method to further improve response time. The insights and findings presented offer valuable guidance for researchers, system designers, and practitioners involved in load balancing within mixed real-time multi-core systems across a broad spectrum of applications. The LBPSA and TBS methods are presented as contributions of the paper, enhancing the existing body of knowledge in load-balancing techniques for multi-core systems.
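The abstract does not spell out LBPSA's internals, so as a generic illustration of the partitioned load-balancing idea it builds on, here is a worst-fit assignment: each task goes to the currently least-loaded core, with tasks considered in decreasing utilisation order. This is a textbook heuristic, not the paper's algorithm.

```python
import heapq

def partition_tasks(task_utils, n_cores):
    """Worst-fit partitioned assignment: repeatedly give the next task
    (largest utilisation first) to the least-loaded core.

    task_utils: per-task CPU utilisation in [0, 1]
    Returns {core: [task indices]}.
    """
    # Min-heap of (current load, core id, assigned task list).
    heap = [(0.0, core, []) for core in range(n_cores)]
    heapq.heapify(heap)
    for i, u in sorted(enumerate(task_utils), key=lambda t: -t[1]):
        load, core, tasks = heapq.heappop(heap)
        tasks.append(i)
        heapq.heappush(heap, (load + u, core, tasks))
    return {core: tasks for _, core, tasks in heap}
```

Sorting by decreasing utilisation first tightens the balance: one heavy task cannot be stranded on an already-loaded core. Response-time improvements such as the TBS migration step would operate on top of such a partition.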
Satellite imagery often covers diverse terrains such as forest, desert, and snow, and exhibits haze, fog, and thin clouds that require dehazing to make the imagery analysis-ready. Onboard processing of satellite imagery requires the algorithm's parameters to be fine-tuned to the type of terrain encountered. Based on the atmospheric light scattering model, single-image dehazing estimates the atmospheric light and the transmission map. This paper focuses on tuning an existing method, "Efficient Image Dehazing with Boundary Constraints and Contextual Regularization", for satellite imagery. A new image quality assessment method is introduced to enable fine-tuning of the algorithm's exponent. With the onset of onboard processing requirements, parallel implementations and faster image processing methods are explored to achieve small run-times.
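The atmospheric scattering model underlying such methods is I(x) = J(x)·t(x) + A·(1 − t(x)), where I is the observed intensity, J the haze-free scene radiance, A the atmospheric light, and t the transmission. Once A and t are estimated, recovery is a per-pixel inversion; the clamp value t_min below is an illustrative choice to avoid amplifying noise where transmission is near zero, not a parameter from the paper.

```python
def dehaze_pixel(I, A, t, t_min=0.1):
    """Invert the scattering model I = J*t + A*(1 - t) to recover the
    haze-free intensity J, clamping the transmission away from zero."""
    t = max(t, t_min)
    return (I - A * (1 - t)) / t
```

With J = 0.5, A = 1.0, t = 0.8 the model gives I = 0.6, and the inversion recovers J = 0.5 exactly.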
Home computers are now commonplace, and with the rise and prosperity of the short-video era of the Internet, people have ever higher requirements for the high-definition quality of pictures and videos. In addition, some people hope to use Internet technology to repair and sharpen old photos and sports videos. Therefore, to make motion-blurred images clearer, this paper studies an image restoration method based on feature fusion and the particle swarm optimization algorithm. We mainly use the analytic hierarchy process (AHP) together with experimental case analysis and comparison to restore blurred images in different motion states. The experimental results show that the proposed method's processing time for motion-blurred images is up to 13.1 s shorter than that of the other two methods.
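Particle swarm optimization itself is straightforward to sketch. The following is a plain, generic PSO minimizer, not the paper's variant; the inertia and attraction coefficients (0.7, 1.5, 1.5) are conventional textbook values assumed here. In the restoration setting, `f` would score a candidate blur-kernel parameterization.

```python
import random

def pso_minimize(f, dim, bounds, n_particles=20, iters=50, seed=0):
    """Plain PSO: each particle tracks its personal best, the swarm tracks a
    global best, and velocities blend inertia with pulls toward both."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (pbest[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (gbest[d] - pos[i][d]))
                # Keep particles inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

On a smooth objective such as the 2-D sphere function, this converges close to the optimum within a few dozen iterations.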
With the continuous improvement in model performance, deep learning models have been widely deployed and have achieved promising outcomes in various fields in recent years. However, due to the escalating volumes of training data and the complexity of application problems, it is becoming ever more challenging to design a better-performing neural network by hand. Analysing the evolution of typical neural network structures therefore offers an important reference for designing a network structure. In this paper, we select open-source models in SAR image processing for an empirical analysis of the evolution of neural network structures. We analyse the evolution of 239 open-source deep learning models in terms of framework, computing units, model computation amount, and the combined use of various computing units. The results reveal that preference and co-occurrence exist among computing units, while the average numbers of convolution, activation, and normalization layers have increased significantly over time. Model complexity shows an overall upward trend, and the characteristics of SAR images are increasingly taken into consideration during model structure design.