Dense optical flow estimation plays a key role in many robotic vision tasks. In the past few years, with the advent of deep learning, we have witnessed great progress in optical flow estimation. However, current networks often have a large number of parameters and require heavy computation, which largely hinders their application on low-power devices such as mobile phones. In this paper, we tackle this challenge and design a lightweight model for fast and accurate optical flow prediction. Our proposed FastFlowNet follows the widely used coarse-to-fine paradigm with the following innovations. First, a new head enhanced pooling pyramid (HEPP) feature extractor is employed to intensify high-resolution pyramid features while reducing parameters. Second, we introduce a new center dense dilated correlation (CDDC) layer for constructing a compact cost volume that keeps a large search radius with a reduced computation burden. Third, an efficient shuffle block decoder (SBD) is implanted into each pyramid level to accelerate flow estimation with marginal drops in accuracy. Experiments on both the synthetic Sintel and the real-world KITTI datasets demonstrate the effectiveness of the proposed approach, which needs only 1/10 of the computation of comparable networks to achieve on-par accuracy. In particular, FastFlowNet contains only 1.37M parameters and can execute at 90 FPS (with a single GTX 1080Ti) or 5.7 FPS (embedded Jetson TX2 GPU) on a pair of Sintel images of resolution 1024 × *** is available at: https://***/fastflow
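To make the idea of a compact cost volume concrete, the following is a minimal NumPy sketch of a correlation layer evaluated on a reduced set of displacements. It is not the CDDC layer from the paper: the sampling pattern in `sampled_offsets` (every offset near the center, every second offset farther out) and the names `sampled_offsets` and `correlation_volume` are assumptions chosen only to illustrate how a large search radius can be kept with fewer correlation channels.

```python
import numpy as np


def sampled_offsets(radius=4, dense_radius=2, stride=2):
    """Hypothetical sampling pattern (an assumption, not the paper's exact one):
    keep every offset near the center and every `stride`-th offset farther out."""
    offsets = []
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            near_center = max(abs(dy), abs(dx)) <= dense_radius
            on_coarse_grid = dy % stride == 0 and dx % stride == 0
            if near_center or on_coarse_grid:
                offsets.append((dy, dx))
    return offsets


def correlation_volume(f1, f2, offsets):
    """Correlate feature maps f1 and f2, both of shape (C, H, W).

    For each offset (dy, dx), channel k of the result holds the mean over
    feature channels of f1[:, y, x] * f2[:, y + dy, x + dx]; positions that
    fall outside f2 are treated as zero.
    """
    _, h, w = f1.shape
    r = max(max(abs(dy), abs(dx)) for dy, dx in offsets)
    padded = np.pad(f2, ((0, 0), (r, r), (r, r)))  # zero padding in H and W
    out = np.empty((len(offsets), h, w), dtype=f1.dtype)
    for k, (dy, dx) in enumerate(offsets):
        shifted = padded[:, r + dy : r + dy + h, r + dx : r + dx + w]
        out[k] = (f1 * shifted).mean(axis=0)
    return out
```

With the default arguments, the pattern keeps 41 of the 81 offsets in a radius-4 window, so the cost volume has roughly half as many channels while still covering the same search range.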
Endobronchial intervention is increasingly used as a minimally invasive means of treating pulmonary diseases. To reduce the difficulty of manipulation in complex airway networks, robust lumen detection is essential for intraoperative guidance. However, existing detection methods are sensitive to visual artifacts, which are inevitable during surgery. In this work, a cross domain feature interaction (CDFI) network is proposed to extract the structural features of lumens, as well as to provide artifact cues to characterize the visual features. To effectively extract the structural and artifact features, the Quadruple Feature Constraints (QFC) module is designed to constrain the intrinsic connections of samples of varying imaging quality. Furthermore, we design a Guided Feature Fusion (GFF) module to supervise the model for adaptive feature fusion based on different types of artifacts. Results show that the features extracted by the proposed method preserve the structural information of the lumen in the presence of large visual variations, bringing much-improved lumen detection accuracy.
In this paper, we propose a solution to the most important factor that deteriorates the performance of the clustering algorithms. We propose an efficient approach for discovering optimal initialization ...
Generative self-supervised learning demonstrates outstanding representation learning capabilities in both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs). However, there are currently no generative...
We use a large foundation language model, fine-tuned on debate corpora, to develop a robot debater application. To address the immense computational power required by large base language models, this study takes advantage of the low-rank characteristic prevalent in domain expert knowledge. By applying Low-Rank Adaptation (LoRA) and fine-tuning with a dedicated dataset, the computational load is reduced to just one-thousandth of what is needed for a large language model, greatly expanding the application scenarios of robot debaters using large language models. In view of the characteristics of debate competitions, this model can preset a variety of debate scenarios and supports personalized debate processes. It employs intelligent voice recognition combined with a multi-channel voice input method, allowing for precise localization of different human debaters and improving the accuracy of voice input recognition. The system can support multiple large-scale language generation models and use various voice broadcasting systems, including male and female voice styles, as well as a range of voice emotions. This model can be applied to debate competitions held in universities, high schools, and various industries. It supports human-machine debates as well as machine-to-machine debates.
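As an illustration of why low-rank adaptation lowers the fine-tuning cost, the following is a minimal NumPy sketch of a linear layer with a frozen weight and a trainable low-rank update. It is a generic sketch of the technique, not the system described above: the class name `LoRALinear`, the `alpha / rank` scaling, and the initialization constants are assumptions made for the example.

```python
import numpy as np


class LoRALinear:
    """Linear layer y = x W^T + scale * (x A^T) B^T with W kept frozen.

    Only A (rank x d_in) and B (d_out x rank) would be trained, so the
    number of trainable values is rank * (d_in + d_out) instead of
    d_in * d_out.
    """

    def __init__(self, weight, rank, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = weight.shape
        self.weight = weight                               # frozen base weight
        self.A = rng.standard_normal((rank, d_in)) * 0.01  # small random init
        self.B = np.zeros((d_out, rank))                   # zero init: update starts at 0
        self.scale = alpha / rank                          # assumed scaling convention

    def __call__(self, x):
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scale * update

    def trainable_params(self):
        return self.A.size + self.B.size
```

Because `B` starts at zero, the adapted layer initially reproduces the frozen layer exactly; the ratio `trainable_params() / weight.size` shows how the trainable share shrinks as the layer grows relative to the rank.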
In this paper, we present a high-performance deep neural network for weak target image segmentation, including medical image segmentation and infrared image segmentation. To this end, this work analyzes the existing d...
Data-free knowledge distillation aims to learn a compact student network from a pre-trained large teacher network without using the original training data of the teacher network. Existing collection-based and generati...
Rigid image alignment is a fundamental task in computer vision, while the traditional algorithms are either too sensitive to noise or time-consuming. Recent unsupervised image alignment methods developed based on spat...
Object detection is an important and fundamental task in computer vision. Recently, the emergence of deep neural networks has brought considerable progress in object detection. Deep neural network object detectors can be ...
A new method is presented to study the function projective lag synchronization (FPLS) of chaotic systems via adaptive-impulsive control. To achieve synchronization, suitable nonlinear adaptive-impulsive controllers are designed. Based on Lyapunov stability theory and impulsive control techniques, some effective sufficient conditions are derived to ensure that the drive system and the response system can be rapidly lag synchronized up to the given scaling function matrix. Numerical simulations are presented to verify the effectiveness and feasibility of the analytical results.
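For readers unfamiliar with the term, FPLS is commonly stated as follows. The notation here is an assumption for illustration and may differ from the paper's: $x(t)$ is the drive state, $y(t)$ the response state, $u(t)$ the controller, $\tau > 0$ the lag, and $\Lambda(t)$ the scaling function matrix.

```latex
\begin{aligned}
\dot{x}(t) &= f\bigl(x(t)\bigr), \qquad
\dot{y}(t) = g\bigl(y(t)\bigr) + u(t), \\
e(t) &= y(t) - \Lambda(t)\, x(t - \tau), \\
\lim_{t \to \infty} \lVert e(t) \rVert &= 0 .
\end{aligned}
```

Under this formulation, the sufficient conditions mentioned in the abstract would be conditions on the controller that guarantee the error $e(t)$ converges to zero.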