Today's deep learning models increasingly need to handle dynamic-shape tensors and computations whose shape information is unknown at compile time and varies over a nearly unbounded range at runtime. This shape dynamism poses major challenges for existing compilation pipelines, which were designed for static models and optimize tensor programs using exact shape values. This paper presents TSCompiler, an end-to-end compilation framework for dynamic shape models. TSCompiler first uses a symbolic shape propagation algorithm to recover symbolic shape information at compile time, which enables subsequent optimizations. TSCompiler then partitions the shape-annotated computation graph into multiple subgraphs and tunes the backbone operators of each subgraph within a hardware-aligned search space to find a collection of high-performance schedules. Based on dependence analysis, TSCompiler propagates the explored backbone schedule to other fusion groups within the same subgraph to generate a set of parameterized tensor programs for the fused cases. At runtime, TSCompiler uses an occupancy-targeted cost model to select among the pre-compiled tensor programs for varying tensor shapes. Extensive evaluations show that TSCompiler achieves state-of-the-art speedups for dynamic shape models. For example, it improves kernel efficiency by up to 3.97× on NVIDIA RTX3090 and 10.30× on NVIDIA A100, and achieves speedups of up to five orders of magnitude in end-to-end latency.
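To make the idea of compile-time symbolic shape propagation concrete, the following is a minimal sketch and not TSCompiler's actual algorithm. It assumes a tiny hypothetical graph format (`(name, op, args)` tuples), three illustrative operators (`matmul`, `add`, `relu`), and dimensions that are either known integers or symbolic names such as `"n"`. All function and variable names here are assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple, Union

Dim = Union[int, str]      # int = size known at compile time, str = symbolic name
Shape = Tuple[Dim, ...]


def unify(a: Dim, b: Dim) -> Dim:
    """Unify two dimensions under broadcasting-style rules (illustrative only)."""
    if a == b:
        return a
    if a == 1:
        return b
    if b == 1:
        return a
    if isinstance(a, int) and isinstance(b, int):
        raise ValueError(f"incompatible dimensions: {a} vs {b}")
    # At least one side is symbolic: keep the concrete size if there is one,
    # otherwise keep the first symbol (assumes they are equal at runtime).
    if isinstance(a, int):
        return a
    if isinstance(b, int):
        return b
    return a


def infer(op: str, in_shapes: Sequence[Shape]) -> Shape:
    """Shape rule for each hypothetical operator."""
    if op == "matmul":                     # (m, k) x (k, n) -> (m, n)
        (m, k1), (k2, n) = in_shapes
        unify(k1, k2)                      # raises if the inner sizes clash
        return (m, n)
    if op == "add":                        # elementwise, equal rank assumed
        a, b = in_shapes
        return tuple(unify(x, y) for x, y in zip(a, b))
    if op == "relu":                       # shape-preserving
        return in_shapes[0]
    raise NotImplementedError(op)


def propagate(inputs: Dict[str, Shape],
              nodes: List[Tuple[str, str, List[str]]]) -> Dict[str, Shape]:
    """Walk the nodes in topological order and annotate every value with a shape."""
    shapes = dict(inputs)
    for name, op, args in nodes:
        shapes[name] = infer(op, [shapes[a] for a in args])
    return shapes


# A linear layer whose batch dimension "n" is unknown until runtime.
inputs = {"x": ("n", 768), "w": (768, 3072), "b": (1, 3072)}
nodes = [("h", "matmul", ["x", "w"]),
         ("hb", "add", ["h", "b"]),
         ("y", "relu", ["hb"])]
shapes = propagate(inputs, nodes)
print(shapes["y"])
```

In this sketch the symbol `"n"` survives to the output shape, which is the kind of information a later stage could use to generate programs parameterized by `n` rather than by one fixed value.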
As a complex and widely studied problem in the financial field, stock trend forecasting uses a large amount of data and many related indicators; hence it is difficult to obtain sustainable and effective results only by relying on empirical *** in the field of machine learning have proved that random forest can form better judgements on this kind of problem, and it has an auxiliary role in the prediction of stock *** study uses historical trading data of four listed companies in the USA stock market, and the purpose of this study is to improve the performance of the random forest model in medium- and long-term stock trend *** study applies the exponential smoothing method to process the initial data, calculates the relevant technical indicators as the characteristics to be selected, and proposes the D-RF-RS method to optimize random *** the random forest is an ensemble learning model and is closely related to the decision tree, the D-RF-RS method uses a decision tree to screen the importance of features, and obtains the effective strong feature set of the model as ***, the parameter combination of the model is optimized through random parameter *** experimental results show that the average accuracy of random forest is increased by 0.17 after the above process optimization, which is 0.18 higher than the average accuracy of the light gradient boosting machine *** with the performance of the ROC curve and Precision–Recall curve, the stability of the model is also guaranteed, which further demonstrates the advantages of random forest in medium- and long-term trend prediction of the stock market.
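The exponential smoothing step mentioned above can be sketched as follows. This is simple (single) exponential smoothing, s_t = α·x_t + (1 − α)·s_{t−1}; the abstract does not state which variant or which smoothing factor the study uses, so the function name, the value `alpha=0.5`, and the sample prices are all assumptions for illustration.

```python
from typing import List, Sequence


def exponential_smoothing(prices: Sequence[float], alpha: float) -> List[float]:
    """Simple exponential smoothing: s_0 = x_0, s_t = alpha*x_t + (1-alpha)*s_{t-1}."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed: List[float] = []
    for price in prices:
        if not smoothed:
            smoothed.append(float(price))          # seed with the first observation
        else:
            smoothed.append(alpha * price + (1.0 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical closing prices, not data from the study.
closes = [10.0, 12.0, 11.0, 13.0]
smoothed = exponential_smoothing(closes, alpha=0.5)
print(smoothed)  # [10.0, 11.0, 11.0, 12.0]
```

A smaller `alpha` weights history more heavily and yields a smoother series. Technical indicators would then be computed on the smoothed prices rather than on the raw ones.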
Blockchain technology is resistant to tampering and forgery and provides traceability, among other properties, which makes it well suited to the storage of multimedia data. We therefore propose a novel method using matrix...
Task scheduling, which is important in cloud computing, is one of the most challenging issues in this area. Hence, an efficient and reliable task scheduling approach is needed to produce more efficient resource employ...
The proliferation of cooking videos on the internet creates a need to convert lengthy video content into concise text recipes. Many online platforms now host a large number of cooking videos, and viewers find it challenging to extract comprehensive recipes from lengthy visual content. Effective summarization is necessary to translate the abundance of culinary knowledge found in videos into text recipes that are easy to read and follow. This makes the cooking process easier for individuals who are searching for precise step-by-step cooking instructions. Such a system satisfies the needs of a broad spectrum of learners while also improving accessibility and ease of use. As there is a growing need for easy-to-follow recipes made from cooking videos, researchers are looking into automated summarization using advanced techniques. One such approach is presented in our work, which combines simple image-based models, audio processing, and GPT-based models to create a system that makes it easier to turn long culinary videos into in-depth recipe texts. A systematic workflow is adopted to achieve this objective. Initially, the focus is on frame summary generation, which employs a combination of two convolutional neural networks and a GPT-based model. A pre-trained CNN model called Inception-V3 is fine-tuned on a food image dataset for dish recognition, and another custom-made CNN is built with ingredient images for ingredient recognition. Then a GPT-based model is used to combine the results produced by the two CNN models, which gives the frame summary in the desired format. Subsequently, audio summary generation is tackled by performing speech-to-text conversion in Python. A GPT-based model is then used to generate a summary of the resulting textual representation of the audio in the desired format. Finally, to refine the summaries obtained from visual and auditory content, another GPT-based model is used
Iris biometrics allow contactless authentication, which has made them a widely deployed human recognition mechanism over the past couple of years. Susceptibility of iris identification systems remains a challenging task due to ...
In the realm of medical diagnostics, particularly in differential diagnosis, where differentiating between illnesses or ailments with comparable symptoms is essential, deep learning has gained importance. Recent devel...
Interpretable visual recognition is essential for decision-making in high-stakes situations. Recent advancements have automated the construction of interpretable models by leveraging Visual Language Models (VLMs) and ...
The self-cascade (SC) method is an effective technique for enhancing chaos and increasing complexity in chaos ***, the controllable self-cascade (CSC) method allows for more accurate control of the Lyapunov exponents of the discrete map. In this work, the SC and CSC systems of the original map are derived, which enhance the chaotic performance while preserving the fundamental dynamical characteristics of the original map. Chaotic sequences with higher Lyapunov exponents, corresponding to higher frequencies, are obtained in the SC and CSC systems. Meanwhile, the Lyapunov exponent can be linearly controlled with greater flexibility in the CSC system. The numerical simulation and theoretical analysis are verified on the CH32 platform.
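As a rough illustration of why self-cascading raises the Lyapunov exponent, the sketch below feeds a map's output back into the same map, x_{n+1} = f(f(x_n)), and estimates the exponent numerically. The abstract does not name the original map, so the logistic map with r = 4 is used here purely as a stand-in. The function names, the initial value, and the iteration counts are assumptions, and the CSC variant is not modelled.

```python
import math
from typing import Callable


def logistic(x: float, r: float = 4.0) -> float:
    """Stand-in original map f(x) = r*x*(1-x); not the map from the paper."""
    return r * x * (1.0 - x)


def logistic_deriv(x: float, r: float = 4.0) -> float:
    return r * (1.0 - 2.0 * x)


def self_cascade(x: float) -> float:
    """Self-cascaded map g(x) = f(f(x))."""
    return logistic(logistic(x))


def self_cascade_deriv(x: float) -> float:
    """Chain rule: g'(x) = f'(f(x)) * f'(x)."""
    return logistic_deriv(logistic(x)) * logistic_deriv(x)


def lyapunov(step: Callable[[float], float],
             deriv: Callable[[float], float],
             x0: float, n: int, burn_in: int) -> float:
    """Estimate the Lyapunov exponent as the orbit average of ln|map'(x)|."""
    x = x0
    for _ in range(burn_in):               # discard the transient
        x = step(x)
    total = 0.0
    for _ in range(n):
        # Floor the derivative magnitude to avoid log(0) at a critical point.
        total += math.log(max(abs(deriv(x)), 1e-300))
        x = step(x)
    return total / n


# One step of g covers two steps of f, so the counts are halved for g
# to make both estimates use the same stretch of the orbit.
lam_f = lyapunov(logistic, logistic_deriv, x0=0.2, n=20000, burn_in=200)
lam_sc = lyapunov(self_cascade, self_cascade_deriv, x0=0.2, n=10000, burn_in=100)
print(lam_f, lam_sc, lam_sc / lam_f)
```

For the logistic map at r = 4 the exponent is known to be ln 2, so `lam_f` should come out near that value and `lam_sc` near twice it. The doubling follows directly from the chain rule and holds for any differentiable map cascaded with itself.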
To address the problem of inaccurate prediction of slab quality in continuous casting, an algorithm based on particle swarm optimisation and differential evolution is proposed. The algorithm combines BP neural network...