Today's deep learning models face an increasing demand to handle dynamic shape tensors and computation whose shape information remains unknown at compile time and varies in a nearly infinite range at runtime. This shape dynamism brings tremendous challenges for existing compilation pipelines designed for static models which optimize tensor programs relying on exact shape values. This paper presents TSCompiler, an end-to-end compilation framework for dynamic shape models. TSCompiler first proposes a symbolic shape propagation algorithm to recover symbolic shape information at compile time to enable subsequent optimizations. TSCompiler then partitions the shape-annotated computation graph into multiple subgraphs and fine-tunes the backbone operators from the subgraph within a hardware-aligned search space to find a collection of high-performance schedules. TSCompiler can propagate the explored backbone schedule to other fusion groups within the same subgraph to generate a set of parameterized tensor programs for fused cases based on dependence analysis. At runtime, TSCompiler utilizes an occupancy-targeted cost model to select from pre-compiled tensor programs for varied tensor shapes. Extensive evaluations show that TSCompiler can achieve state-of-the-art speedups for dynamic shape models. For example, we can improve kernel efficiency by up to 3.97× on NVIDIA RTX3090, and 10.30× on NVIDIA A100 and achieve up to five orders of magnitude speedups on end-to-end latency.
Anomalies in packet length sequences caused by network topology structure and congestion greatly impact the performance of early network traffic classification. Additionally, insufficient differentiation of packet len...
In multi-label learning (MLL), it is extremely challenging to accurately annotate every appearing object due to expensive costs and limited knowledge. When facing such a challenge, a more practical and cheaper alternative should be single positive multi-label learning (SPMLL), where only one positive label needs to be provided per sample. Existing SPMLL methods usually assume unknown labels as negatives, which inevitably introduces false negatives as noisy labels. More seriously, binary cross entropy (BCE) loss is often used for training, which is notoriously not robust to noisy labels. To mitigate this issue, we customize an objective function for SPMLL by pushing only one pair of labels apart each time to suppress the domination of negative labels, which is the main culprit of fitting noisy labels in SPMLL. To further combat such noisy labels, we explore the high-rankness of the label matrix, which can also push apart different labels. By directly extending from SPMLL to MLL with full labels, a unified loss applicable to both settings is derived. As a byproduct, the proposed loss can alleviate the imbalance inherent in MLL. Experiments on real datasets demonstrate that the proposed loss not only performs more robustly to noisy labels for SPMLL but also works well for full labels. Besides, we empirically discover that high-rankness can mitigate the dramatic performance drop in SPMLL. Most surprisingly, even without any regularization or fine-tuned label correction, only adopting our loss defeats state-of-the-art SPMLL methods on CUB, a dataset that severely lacks labels.
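The "assume unknown labels are negative" baseline that the abstract criticizes can be illustrated with a minimal sketch. This is not the loss proposed in the paper; the function name `assume_negative_bce` and the example probabilities are hypothetical, chosen only to show how a true but unannotated label gets penalized under BCE.

```python
import math

def assume_negative_bce(probs, positive_index):
    """Mean BCE where the single observed positive has target 1
    and every other (unknown) label is treated as target 0."""
    eps = 1e-12  # clamp to avoid log(0)
    total = 0.0
    for i, p in enumerate(probs):
        p = min(max(p, eps), 1.0 - eps)
        target = 1.0 if i == positive_index else 0.0
        total += -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))
    return total / len(probs)

# Hypothetical predictions for three labels. Only label 0 is annotated,
# but suppose label 1 is also truly present (a false negative).
probs = [0.9, 0.8, 0.1]
loss_observed = assume_negative_bce(probs, positive_index=0)

# Share of the summed loss caused by the confident prediction on label 1.
false_negative_term = -math.log(1.0 - probs[1])
share = false_negative_term / (loss_observed * len(probs))
```

In this toy case most of the loss comes from the unannotated label, which is the sense in which negative labels can dominate training.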
In this study, the event-triggered asymptotic tracking control problem is considered for a class of nonholonomic systems in chained form for the time-varying reference input. First, to eliminate the ripple phenomenon caused by the imprecise compensation of the time-varying reference input, a novel time-varying event-triggered piecewise continuous control law and a triggering mechanism with a time-varying triggering function are developed. Second, an explicit integral input-to-state stable Lyapunov function is constructed for the time-varying closed-loop system regarding the sampling error as the external input. The origin of the closed-loop system is shown to be uniformly globally asymptotically stable for any global exponential decaying threshold signals, which in turn rules out the Zeno behavior. Moreover, infinitely fast sampling can be avoided by appropriately tuning the exponential convergence rate of the threshold signal. A numerical simulation example is provided to illustrate the proposed control approach.
Stack Overflow provides a platform for developers to seek suitable solutions by asking questions and receiving answers on various ***, many questions are usually not answered quickly *** the questioners are eager to know the specific time interval at which a question can be answered, it becomes an important task for Stack Overflow to feed back the answer time to the *** address this issue, we propose a model for predicting the answer time of questions, named Predicting Answer Time (i.e., PAT model), which consists of two parts: a feature acquisition and fusion model, and a deep neural network *** framework uses a variety of features mined from questions in Stack Overflow, including the question description, question title, question tags, the creation time of the question, and other temporal *** features are fused and fed into the deep neural network to predict the answer time of the *** a case study, post data from Stack Overflow are used to assess the *** use traditional regression algorithms as the baselines, such as Linear Regression, K-Nearest Neighbors Regression, Support Vector Regression, Multilayer Perceptron Regression, and Random Forest *** results show that the PAT model can predict the answer time of questions more accurately than traditional regression algorithms, and shorten the error of the predicted answer time by nearly 10 hours.
Spatial databases store objects with their locations and certain types of attached items. A variety of modern applications have been developed by leveraging the utilization of locations and items in spatial objects, such as searching points of interest, hot topics, or users’ attitude in specified spatial *** many scenarios, the high and low-frequency items in a spatial region are worth noticing, considering they represent the majority’s interest or eccentric users’ ***, existing works have yet to identify such items in an interactive manner, despite the significance of the endeavor in decision-making *** study recognizes a novel type of analytical query, called top/bottom-k fraction query, to discover such items in spatial *** achieve fast query response, we propose a multilayered data summary that is spread out across the main memory and external memory. A memory-based estimation method for top/bottom-k fraction queries is *** maximize the use of the main memory space, we design a data summary tuning method to dynamically allocate memory space among different spatial *** proposed approach is evaluated with real-life datasets and synthetic datasets in terms of estimation *** results demonstrate the effectiveness of the proposed data summary and corresponding estimation and tuning algorithms.
The traditional malware research is mainly based on its recognition and detection as a breakthrough point, without focusing on its propagation trends or predicting the subsequently infected *** complexity of network structure, diversity of network nodes, and sparsity of data all pose difficulties in predicting *** paper proposes a malware propagation prediction model based on representation learning and Graph Convolutional Networks (GCN) to address the aforementioned ***, to solve the problem of the inaccuracy of infection intensity calculation caused by the sparsity of node interaction behavior data in the malware propagation network, a mechanism based on a tensor to mine the infection intensity among nodes is proposed to retain the network structure *** influence of the relationship between nodes on the infection intensity is also ***, given the diversity and complexity of the content and structure of infected and normal nodes in the network, considering the advantages of representation learning in data feature extraction, the corresponding representation learning method is adopted for the characteristics of infection intensity among *** can efficiently calculate the relationship between entities and relationships in low dimensional space to achieve the goal of low dimensional, dense, and real-valued representation learning for the characteristics of propagation spatial *** also design a new method, Tensor2vec, to learn the potential structural features of malware ***, considering the convolution ability of GCN for non-Euclidean data, we propose a dynamic prediction model of malware propagation based on representation learning and GCN to solve the time effectiveness problem of the malware propagation *** experimental results show that the proposed model can effectively predict the behaviors of the nodes in the network and discover the influence of different characteristics of nodes on the mal
Augmented reality superimposes digital information onto objects in the physical world and enables multi-user *** that previous proxemic interaction research has explored many applications of user-object distance and user-user distance in an augmented reality context, respectively, and combining both types of distance can improve the efficiency of users’ perception and interaction with task objects and collaborators by providing users with insight into spatial relations of user-task object and user-user, less is concerned about how the two types of distances can be simultaneously adopted to assist collaboration tasks *** fulfill the gap, we present UOUU, the user-object distance and user-user distance combined method for dynamically assigning tasks across *** conducted empirical studies to investigate how the method affected user collaboration tasks in terms of collaboration occurrence and overall task *** results show that the method significantly improves the speed and accuracy of the collaboration tasks as well as the frequencies of collaboration *** study also confirms the method’s effects on stimulating collaboration activities, as the UOUU method has effectively reduced the participants’ perceived workload and the overall moving distances during the *** for generalising the use of the method are discussed.
Wheat is the most widely grown crop in the world, and its yield is closely related to global food *** number of ears is important for wheat breeding and yield ***, automated wheat ear counting techniques are essential for breeding high-yield varieties and increasing grain ***, all existing methods require position-level annotation for training, implying that a large amount of labor is required for annotation, limiting the application and development of deep learning technology in the agricultural *** address this problem, we propose a count-supervised multiscale perceptive wheat counting network (CSNet, count-supervised network), which aims to achieve accurate counting of wheat ears using quantity *** particular, in the absence of location information, CSNet adopts MLP-Mixer to construct a multiscale perception module with a global receptive field that implements the learning of small target attention maps between wheat ear *** conduct comparative experiments on a publicly available global wheat head detection dataset, showing that the proposed count-supervised strategy outperforms existing position-supervised methods in terms of mean absolute error (MAE) and root mean square error (RMSE). This superior performance indicates that the proposed approach has a positive impact on improving ear counts and reducing labeling costs, demonstrating its great potential for agricultural counting *** code is available at .
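For readers unfamiliar with the two counting metrics named above, here is a minimal sketch of how MAE and RMSE are typically computed over per-image counts. The function names and the example counts are hypothetical and are not taken from the paper or its dataset.

```python
import math

def mean_absolute_error(predicted, actual):
    """Average of |predicted - actual| over all images."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def root_mean_square_error(predicted, actual):
    """Square root of the average squared counting error."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))

# Hypothetical ear counts for three images.
predicted_counts = [42, 57, 31]
actual_counts = [40, 60, 31]

mae = mean_absolute_error(predicted_counts, actual_counts)
rmse = root_mean_square_error(predicted_counts, actual_counts)
```

RMSE weights large miscounts more heavily than MAE, so reporting both gives a view of typical error and of outlier images.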
We present a novel attention-based mechanism to learn enhanced point features for point cloud processing tasks, e.g., classification and segmentation. Unlike prior studies, which were trained to optimize the weights of a pre-selected set of attention points, our approach learns to locate the best attention points to maximize the performance of a specific task, e.g., point cloud classification. Importantly, we advocate the use of a single attention point to facilitate semantic understanding in point feature learning. Specifically, we formulate a new and simple convolution, which combines convolutional features from an input point and its corresponding learned attention point (LAP). Our attention mechanism can be easily incorporated into state-of-the-art point cloud classification and segmentation networks. Extensive experiments on common benchmarks, such as ModelNet40, ShapeNet Part, and S3DIS, all demonstrate that our LAP-enabled networks consistently outperform the respective original networks, as well as other competitive alternatives, which employ multiple attention points, either pre-selected or learned under our LAP framework.