The visual models pretrained on large-scale benchmarks encode general knowledge and prove effective in building more powerful representations for downstream tasks. Most existing approaches follow the fine-tuning paradigm, either by initializing or regularizing the downstream model based on the pretrained one. The former fails to retain the knowledge in the successive fine-tuning phase and is thereby prone to over-fitting; the latter imposes strong constraints on the weights or feature maps of the downstream model without considering semantic drift, often incurring insufficient optimization. To deal with these issues, we propose a novel fine-tuning framework, namely distribution regularization with semantic calibration (DR-Tune). It employs distribution regularization by enforcing the downstream task head to decrease its classification error on the pretrained feature distribution, which prevents it from over-fitting while enabling sufficient training of downstream encoders. Furthermore, to alleviate the interference caused by semantic drift, we develop the semantic calibration (SC) module to align the global shape and class centers of the pretrained and downstream feature distributions. Extensive experiments on widely used image classification datasets show that DR-Tune consistently improves the performance when combined with various backbones under different pretraining strategies. Code is available at: https://***/weeknan/DR-Tune.
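The two ingredients of the abstract above can be sketched in a few lines: a shared task head that must classify both downstream and (center-aligned) pretrained features. This is a minimal toy sketch, not the DR-Tune implementation; the feature sizes, the single-head linear classifier, and the per-class mean shift used for calibration are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_entropy(logits, labels):
    """Mean cross-entropy of a linear head's logits against integer labels."""
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

# Toy setup: 2 classes, 4-dim features; W is the shared downstream task head.
W = rng.normal(size=(4, 2))
down_feats = rng.normal(size=(8, 4))   # features from the downstream encoder
pre_feats = rng.normal(size=(8, 4))    # features from the frozen pretrained encoder
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Semantic calibration (simplified here to a per-class translation): shift the
# pretrained features so their class centers coincide with the downstream ones.
calibrated = pre_feats.copy()
for c in (0, 1):
    mask = labels == c
    calibrated[mask] += down_feats[mask].mean(0) - pre_feats[mask].mean(0)

# Distribution regularization: the head must also classify the calibrated
# pretrained feature distribution, not only the downstream features.
task_loss = cross_entropy(down_feats @ W, labels)
reg_loss = cross_entropy(calibrated @ W, labels)
total_loss = task_loss + reg_loss
```

In the full method the regularization term is what keeps the head from over-fitting the small downstream set, while the calibration step prevents the semantic drift between the two encoders from corrupting that signal.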
Security and privacy issues have attracted the attention of researchers in the field of IoT as the information processing scale grows in sensor networks. Quantum computing is theoretically known as an absolutely secure way to store and transmit information, as well as a way to accelerate local or distributed classical algorithms that are hard to solve with polynomial complexity in computation or communication. In this paper, we focus on the phase estimation method that is crucial to the realization of a general multi-party computing model and can be accelerated by quantum algorithms. A novel multi-party phase estimation algorithm and the related quantum circuit are proposed by using a distributed Oracle operator with ***. The proved theoretical communication complexity of this algorithm shows that it can give the phase estimation before applying multi-party computing efficiently, without increasing any additional ***. A practical problem of multi-party dating is investigated, showing that the method can estimate the number of solutions in advance with zero communication complexity by utilizing its special statistical ***. The simulations present the correctness, validity and efficiency of the proposed estimation method.
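For orientation, the measurement statistics of the *textbook single-party* phase estimation circuit (the building block the paper's multi-party variant accelerates) can be simulated classically. This sketch assumes nothing from the paper itself; it just evaluates the standard closed-form outcome distribution of an n-qubit QPE register after the inverse QFT.

```python
import numpy as np

def qpe_distribution(phase, n_qubits):
    """Outcome distribution of textbook quantum phase estimation.

    For an n-qubit control register, outcome k occurs with probability
    |(1/2^n) * sum_j exp(2*pi*i * j * (phase - k/2^n))|^2,
    which peaks at the k closest to phase * 2^n.
    """
    N = 2 ** n_qubits
    j = np.arange(N)
    k = np.arange(N)
    # geometric sum over the control register, one amplitude per outcome k
    amps = np.exp(2j * np.pi * np.outer(phase - k / N, j)).sum(axis=1) / N
    return np.abs(amps) ** 2

probs = qpe_distribution(phase=0.375, n_qubits=4)  # 0.375 = 6/16, exactly representable
print(int(np.argmax(probs)))  # -> 6, i.e. the estimate 6/16 recovers the phase
```

When the phase is exactly representable in n bits, as here, the distribution collapses onto a single outcome; otherwise the probability mass concentrates on the two nearest grid points, which is the statistical property estimation methods exploit.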
The development of the Internet of Things (IoT) has allowed devices to collect massive amounts of data, and Artificial Intelligence (AI) provides the ability to analyze those data. Moreover, researchers adopt Distributed Machine Learning (DML) methods to train neural networks collaboratively using different users' data. However, DML suffers from privacy issues, and Federated Learning (FL) has been an effective solution. FL transfers the model instead of the data to protect privacy, but the trained models have low accuracy over local datasets due to statistical heterogeneity. Thus, personalized FL (pFL) algorithms have been proposed to handle such heterogeneous data distributions. However, the communication overhead of pFL algorithms is significant, as they require transmitting additional information. In this paper, we therefore propose Federated Learning with Combined Particle Swarm Optimization (FedCPSO). FedCPSO replaces the aggregation process of FL algorithms with PSO, and we design a velocity in PSO specifically for FL algorithms, using the best global model, the best client models, and the best neighbor models. In addition, we implement magnitude pruning to reduce the communication volume. The experimental results illustrate that FedCPSO can reduce communication volume by up to 50% with less than a 2% accuracy drop compared with the state-of-the-art (SOTA) pFL algorithm.
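The two mechanisms named in the abstract can be sketched as follows. This is a hypothetical, simplified version of the update rule: the exact velocity definition, coefficients, and pruning schedule of FedCPSO are not given here, so the coefficient names (`inertia`, `c1`-`c3`) and the `keep_ratio` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def pso_update(w, v, g_best, c_best, n_best, inertia=0.5, c1=1.0, c2=1.0, c3=1.0):
    """One PSO-style aggregation step for a client model w (illustrative rule).

    The velocity mixes attraction toward the best global model, the client's
    own best model, and the best neighbor model, as the abstract describes.
    """
    r1, r2, r3 = rng.random(3)
    v_new = (inertia * v
             + c1 * r1 * (g_best - w)
             + c2 * r2 * (c_best - w)
             + c3 * r3 * (n_best - w))
    return w + v_new, v_new

def magnitude_prune(w, keep_ratio=0.5):
    """Zero out the smallest-magnitude weights to cut communication volume."""
    k = int(len(w) * keep_ratio)
    threshold = np.sort(np.abs(w))[-k]
    return np.where(np.abs(w) >= threshold, w, 0.0)

w = rng.normal(size=10); v = np.zeros(10)
g = rng.normal(size=10); c = rng.normal(size=10); n = rng.normal(size=10)
w_next, v_next = pso_update(w, v, g, c, n)
sparse = magnitude_prune(w_next, keep_ratio=0.5)  # half the entries become zero
```

Only the non-zero entries of `sparse` (plus their indices) need to be transmitted, which is where the communication savings come from.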
A key challenge for LiDAR-based 3D object detection is to capture sufficient features from large-scale 3D scenes, especially for distant and/or occluded objects. Despite recent efforts by Transformers with their long-sequence modeling capability, they fail to properly balance accuracy and efficiency, suffering from inadequate receptive fields or coarse-grained holistic correlations. In this paper, we propose an Octree-based Transformer, named OcTr, to address this issue. It first constructs a dynamic octree on the hierarchical feature pyramid by conducting self-attention on the top level and then recursively propagating to the level below, restricted by the octants, which captures rich global context in a coarse-to-fine manner while keeping the computational complexity under control. Furthermore, for enhanced foreground perception, we propose a hybrid positional embedding, composed of a semantic-aware positional embedding and an attention mask, to fully exploit semantic and geometry clues. Extensive experiments are conducted on the Waymo Open Dataset and the KITTI Dataset, and OcTr achieves new state-of-the-art results.
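The coarse-to-fine selection idea can be illustrated with a toy loop: attend at the current level, keep only the highest-scoring octants, and expand just those into their eight children. This is a deliberately simplified sketch of the *principle* (the pooled-query scoring and fake child features are illustrative assumptions), not the OcTr architecture.

```python
import numpy as np

rng = np.random.default_rng(2)

def coarse_to_fine_attention(feats, levels, keep=2):
    """Toy coarse-to-fine octant selection in the spirit of an octree Transformer.

    At each level, only the `keep` octants with the highest attention scores are
    expanded into 8 children, so the active token count stays bounded instead
    of growing eightfold per level.
    """
    active = feats  # tokens at the current (coarsest) level
    for _ in range(levels):
        q = active.mean(axis=0)           # a single pooled query (simplification)
        scores = active @ q               # attention logits per octant
        top = np.argsort(scores)[-keep:]  # octants worth refining
        children = np.repeat(active[top], 8, axis=0)  # expand into 8 children each
        active = children + 0.1 * rng.normal(size=children.shape)
    return active

out = coarse_to_fine_attention(rng.normal(size=(8, 16)), levels=3, keep=2)
print(out.shape)  # -> (16, 16): keep * 8 tokens, independent of depth
```

The payoff is the same as in the abstract: global context is gathered at the coarse level, while fine-grained attention is spent only inside the octants that matter.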
Although deep learning (DL) has obtained great achievements in industry, the involvement of artificial intelligence (AI) experts in developing customized DL services raises costs and hinders its wide application in the business domain. In this research, a Web-based automatic DL service generation system is presented to address this problem. The system can generate customized DL services without involving AI experts. It adopts ontology technologies to organize DL domain knowledge and generates target services based on the user's requests posted from the front-end web page. In the empirical study, the whole workflow of the system is demonstrated, and its scalability is evaluated. The results show that our system can generate customized services correctly and has good scalability.
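The core resolution step, mapping a front-end request onto ontology-organized domain knowledge, can be sketched minimally. The concept names, properties, and request fields below are hypothetical; the paper's actual ontology schema is not specified in the abstract.

```python
# Hypothetical ontology fragment: task concept -> service properties.
ontology = {
    "image_classification": {"model": "CNN", "input": "image", "output": "label"},
    "text_sentiment": {"model": "LSTM", "input": "text", "output": "polarity"},
}

def generate_service(request: dict) -> dict:
    """Resolve a front-end request to a concrete DL service configuration."""
    task = request["task"]
    if task not in ontology:
        raise ValueError(f"unknown task concept: {task}")
    spec = ontology[task]
    return {"service": task, **spec, "dataset": request.get("dataset", "user-upload")}

svc = generate_service({"task": "image_classification", "dataset": "flowers"})
print(svc["model"])  # -> CNN
```

The point of the ontology is that the mapping from business-level requests to model choices is encoded once by domain experts, so end users never touch the DL internals.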
This paper presents ER-NeRF, a novel conditional Neural Radiance Fields (NeRF) based architecture for talking portrait synthesis that can concurrently achieve fast convergence, real-time rendering, and state-of-the-ar...
With the proliferation of cloud services, carrying out large-scale image retrieval on remote clouds has become a trend, relieving clients of the storage and computation burden. However, traditional retrieval...
Pre-trained point cloud models have found extensive applications in 3D understanding tasks like object classification and part segmentation. However, the prevailing strategy of full fine-tuning in downstream tasks leads to large per-task storage overhead for model parameters, which limits the efficiency when applying large-scale pre-trained models. Inspired by the recent success of visual prompt tuning (VPT), this paper explores prompt tuning on pre-trained point cloud models to pursue an elegant balance between performance and parameter efficiency. We find that while instance-agnostic static prompting, e.g. VPT, shows some efficacy in downstream transfer, it is vulnerable to the distribution diversity caused by various types of noise in real-world point cloud data. To conquer this limitation, we propose a novel Instance-aware Dynamic Prompt Tuning (IDPT) strategy for pre-trained point cloud models. The essence of IDPT is a dynamic prompt generation module that perceives semantic prior features of each point cloud instance and generates adaptive prompt tokens to enhance the model's robustness. Notably, extensive experiments demonstrate that IDPT outperforms full fine-tuning in most tasks with a mere 7% of the trainable parameters, providing a promising solution to parameter-efficient learning for pre-trained point cloud models. Code is available at https://***/zyh16143998882/ICCV23-IDPT.
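The contrast with static prompting can be made concrete: instead of a fixed learnable prompt shared across all inputs, a small network maps each instance's pooled features to its own prompt token. This is a toy sketch of the idea, assuming max-pooling and a two-layer generator; the actual IDPT module design differs.

```python
import numpy as np

rng = np.random.default_rng(3)

class DynamicPrompt:
    """Toy instance-aware prompt generator (sketch of the IDPT idea).

    A small network maps each point cloud's pooled token features to a
    per-instance prompt, so the prompt adapts to instance-level variation
    rather than being one static vector as in VPT.
    """
    def __init__(self, dim):
        self.w1 = rng.normal(size=(dim, dim)) * 0.1
        self.w2 = rng.normal(size=(dim, dim)) * 0.1

    def __call__(self, tokens):
        pooled = tokens.max(axis=0)                   # instance-level summary
        prompt = np.tanh(pooled @ self.w1) @ self.w2  # adaptive prompt token
        return np.vstack([prompt[None, :], tokens])   # prepend to the sequence

tokens = rng.normal(size=(64, 32))  # 64 point-patch tokens, 32-dim
gen = DynamicPrompt(dim=32)
out = gen(tokens)
print(out.shape)  # -> (65, 32): one extra prompt token
```

Only the generator's parameters are trained while the backbone stays frozen, which is where the large reduction in trainable parameters comes from.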