ISBN: 9781665487399 (print)
In this paper, we argue that modern pre-integration methods for inertial measurement units (IMUs) are accurate enough to ignore the drift for short time intervals. This allows us to consider a simplified camera model, which in turn admits further intrinsic calibration. We develop the first-ever solver to jointly solve the relative pose problem with unknown and equal focal length and radial distortion profile while utilizing the IMU data. Furthermore, we show significant speed-up compared to state-of-the-art algorithms, with small or negligible loss in accuracy for partially calibrated setups. The proposed algorithms are tested on both synthetic and real data, where the latter is focused on navigation using unmanned aerial vehicles (UAVs). We evaluate the proposed solvers on different commercially available low-cost UAVs, and demonstrate that the novel assumption on IMU drift is feasible in real-life applications. The extended intrinsic auto-calibration enables us to use distorted input images, making tedious calibration processes obsolete, compared to current state-of-the-art methods.
Currently, mixed reality head-mounted displays tracking the full body of users is an important human-computer interaction mode through the pose of the head and the hands. Unfortunately, users' virtual representati...
Out-of-distribution detection is crucial to the safe deployment of machine learning systems. Currently, unsupervised out-of-distribution detection is dominated by generative-based approaches that make use of estimates...
Recently, a number of CNN-based methods have made great progress in single image super-resolution. However, these architectures commonly stack a massive number of network layers, which brings high computational complexity and heavy memory consumption and makes them unsuitable for embedded terminals such as mobile platforms. To solve this problem, we propose a hybrid network of CNN and Transformer (HNCT) for lightweight image super-resolution. HNCT consists of four parts: a shallow feature extraction module, Hybrid Blocks of CNN and Transformer (HBCTs), a dense feature fusion module, and an up-sampling module. By combining CNN and Transformer, HBCT extracts deep features that benefit super-resolution reconstruction by considering both local and non-local priors, while remaining lightweight and flexible. Enhanced spatial attention is introduced in HBCT to further improve performance. Extensive experimental results show that HNCT is superior to state-of-the-art methods in terms of super-resolution performance and model complexity. Moreover, we achieved the second-best PSNR and the fewest activation operations in the NTIRE 2022 Efficient SR Challenge. Code is available at https://***/lhjthp/HNCT.
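Since the ranking above is reported in PSNR, a minimal sketch of the standard PSNR definition may help readers compare numbers. It assumes 8-bit images flattened into lists of pixel values; the function name `psnr` and the `max_value` parameter are illustrative, and the challenge's exact evaluation protocol (color space, border cropping) is not stated in the abstract.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized pixel lists.

    Assumption: pixels are given as flat sequences of numbers in
    [0, max_value]; this is the textbook definition, not the
    challenge's evaluation script.
    """
    if len(reference) != len(restored) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the two images.
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher value means the restored image is closer to the reference; identical images give an infinite PSNR.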
Recent advances in single image super-resolution (SISR) have achieved extraordinary performance, but the computational cost is too heavy for edge devices. To alleviate this problem, many novel and effective solutions have been proposed. Convolutional neural networks (CNNs) with attention mechanisms have attracted increasing attention due to their efficiency and effectiveness. However, there is still redundancy in the convolution operation. In this paper, we propose the Blueprint Separable Residual Network (BSRN), which contains two efficient designs. The first is blueprint separable convolution (BSConv), which replaces the redundant convolution operation. The second is to enhance the model's ability by introducing more effective attention modules. The experimental results show that BSRN achieves state-of-the-art performance among existing efficient SR methods. Moreover, a smaller variant of our model, BSRN-S, won first place in the model complexity track of the NTIRE 2022 Efficient SR Challenge. The code is available at https://***/xiaom233/BSRN.
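To make the redundancy argument concrete, the following sketch compares weight counts for a standard convolution and a blueprint separable one. It assumes the unconstrained BSConv form (a 1x1 pointwise convolution followed by a k x k depthwise convolution) and ignores biases; the abstract does not say which BSConv variant BSRN uses, and the function names are illustrative.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def bsconv_u_params(c_in, c_out, k):
    """Weights in an assumed BSConv-U layer (no bias).

    Pointwise 1x1 conv mixes channels (c_in * c_out weights), then a
    depthwise k x k conv filters each output channel (k * k * c_out).
    """
    pointwise = c_in * c_out
    depthwise = k * k * c_out
    return pointwise + depthwise


if __name__ == "__main__":
    # Example layer: 64 -> 64 channels with a 3x3 kernel.
    print(standard_conv_params(64, 64, 3))  # 36864
    print(bsconv_u_params(64, 64, 3))       # 4672
```

For this example layer the separable form needs roughly one eighth of the weights, which is the kind of saving the abstract attributes to replacing standard convolutions.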
Single image super-resolution (SISR) is a classical computer vision problem that has benefited from recent advances in deep learning, especially in convolutional neural networks (CNNs). Although state-of-the-art methods improve SISR performance on several datasets, applying these networks directly in practice remains difficult because of their heavy computational load. For this reason, researchers have recently focused on more efficient, high-performing network structures. The information multi-distilling network (IMDN) is one such highly efficient SISR network, combining high performance with low computational load. IMDN achieves this efficiency through several mechanisms: Intermediate Information Collection (IIC), which works in a global setting, and the Progressive Refinement Module (PRM) and Contrast Aware Channel Attention (CCA), which are employed in a local setting. These mechanisms, however, do not contribute equally to the efficiency and performance of IMDN. In this work, we propose the Global Progressive Refinement Module (GPRM) as a less parameter-demanding alternative to the IIC module for feature aggregation. To further decrease the number of parameters and floating-point operations (FLOPs), we also propose Grouped Information Distilling Blocks (GIDB). Using the proposed structures, we design an efficient SISR network called IMDeception. Experiments reveal that the proposed network performs on par with state-of-the-art models despite having a limited number of parameters and FLOPs. Furthermore, using grouped convolutions as a building block of GIDB leaves room for further optimization during deployment. To show its potential, we deployed the proposed model on an NVIDIA Jetson Xavier AGX and show that it runs in real time on this edge device.
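The effect of grouped convolutions on parameters and operations can be sketched with simple counting. The functions below are illustrative and assume stride 1 with "same" padding (so the output is `height` x `width`), no bias, and multiply-accumulate operations as the cost measure; the actual GIDB layer sizes are not given in the abstract.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weights in a k x k convolution split into `groups` groups (no bias)."""
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    # Each output channel only sees c_in / groups input channels.
    return k * k * (c_in // groups) * c_out


def conv_macs(c_in, c_out, k, height, width, groups=1):
    """Multiply-accumulates for one forward pass.

    Assumption: stride 1 and 'same' padding, so every weight is used
    once per output position.
    """
    return conv_params(c_in, c_out, k, groups) * height * width


if __name__ == "__main__":
    print(conv_params(64, 64, 3))                   # 36864
    print(conv_params(64, 64, 3, groups=4))         # 9216
    print(conv_macs(64, 64, 3, 32, 32, groups=4))   # 9437184
```

With `groups=4`, both the weight count and the operation count drop by a factor of four relative to the ungrouped layer.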
Innovations in computer vision algorithms for satellite image analysis can enable us to explore global challenges such as urbanization and land use change at the planetary level. However, domain shift is a common problem when the models that drive these analyses are applied to new areas, particularly in the developing world. If a model is trained with imagery and labels from one location, it usually will not generalize well to new locations where the imagery content and data distributions differ. In this work, we consider the setting in which we have a single large satellite imagery scene over which we want to solve an applied problem: building footprint segmentation. Here, we do not need to create a model that generalizes past the borders of our scene and can instead train a local model. We show that, with this setting in mind, surprisingly few labels are needed to solve the building segmentation problem with very high-resolution (0.5 m/px) satellite imagery. Our best model, trained with just 527 sparse polygon annotations (equivalent to 1500x1500 densely labeled pixels), has a recall of 0.87 over held-out footprints and an R² of 0.93 on the task of counting the number of buildings in 200x200 meter windows. We apply our models to high-resolution imagery of Amman, Jordan in a case study on urban change detection.
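The two reported metrics can be sketched as follows. This is a minimal illustration assuming recall is computed from counts of matched and missed footprints and that R² is the coefficient of determination between true and predicted per-window building counts; the abstract does not specify the matching rule or the exact R² variant, and the function names are hypothetical.

```python
def recall(true_positives, false_negatives):
    """Fraction of ground-truth footprints that were detected."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no ground-truth footprints")
    return true_positives / total


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot.

    Assumption: `observed` and `predicted` are per-window building
    counts in the same order.
    """
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(observed) / len(observed)
    ss_tot = sum((y - mean) ** 2 for y in observed)
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("observed counts have zero variance")
    return 1.0 - ss_res / ss_tot
```

An R² of 1 means the predicted counts match the true counts exactly, while 0 means they do no better than always predicting the mean count.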
Deep neural networks (DNNs) are vulnerable to adversarial examples generated by adding malicious noise that is imperceptible to humans. Adversarial examples successfully fool models in the white-box setting, but attack performance degrades significantly in the black-box setting, which is known as the low transferability problem. Various methods have been proposed to improve transferability, yet they are not effective against adversarially trained and defense models. In this paper, we introduce two new methods, the Lookahead Iterative Fast Gradient Sign Method (LI-FGSM) and Self-CutMix (SCM), to address these issues. LI-FGSM updates adversarial perturbations with the accumulated gradient obtained by looking ahead. At each iteration, an existing gradient-based attack is run for N look-ahead steps to explore the optimal direction. This allows the optimization process to escape sub-optimal regions and stabilizes the update directions. SCM uses a modified CutMix that copies a patch from the original image and pastes it back at a random position in the same image, which preserves the image's internal information. SCM makes it possible to generate more transferable adversarial examples while reducing overfitting to the surrogate model. Both methods are easily combined with existing iterative gradient-based attacks. Extensive experiments on ImageNet show that our approach achieves state-of-the-art attack success rates not only against normally trained models but also against adversarially trained and defense models.
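The copy-and-paste step described for Self-CutMix can be sketched on a toy image. This is a minimal illustration, not the authors' implementation: the image is a 2D list of pixel values, the patch size is passed in explicitly, and a single patch is moved. How the paper samples patch sizes, how many patches it uses, and how the step is wired into the attack loop are not stated in the abstract; the name `self_cutmix` and its parameters are assumptions.

```python
import random


def self_cutmix(image, patch_h, patch_w, rng=None):
    """Copy one patch of `image` and paste it elsewhere in the same image.

    `image` is a list of equal-length rows; a new image is returned and
    the input is left unchanged.
    """
    if rng is None:
        rng = random.Random()
    height, width = len(image), len(image[0])
    if not (0 < patch_h <= height and 0 < patch_w <= width):
        raise ValueError("patch must fit inside the image")
    # Random top-left corners for the source and destination patches.
    src_r = rng.randrange(height - patch_h + 1)
    src_c = rng.randrange(width - patch_w + 1)
    dst_r = rng.randrange(height - patch_h + 1)
    dst_c = rng.randrange(width - patch_w + 1)
    out = [row[:] for row in image]
    for dr in range(patch_h):
        # Read from the original so overlapping regions copy cleanly.
        out[dst_r + dr][dst_c:dst_c + patch_w] = (
            image[src_r + dr][src_c:src_c + patch_w]
        )
    return out
```

Because the pasted content comes from the same image, every pixel value in the output already occurs in the input, which is one way to read the abstract's remark about preserving internal information.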
The aim of this paper is to propose a large scale dataset for image restoration (LSDIR). Recent work in image restoration has been focused on the design of deep neural networks. The datasets used to train these networ...
Organ-level instance segmentation (e.g., of individual leaves) based on computer vision techniques is a key step in measuring plant phenotypes. Since plant organs, especially leaves, are self-occluded and emerged-occluded, single-view images fail to capture some useful information. In contrast, 3D global images contain much more plant morphological information than single-view images, which is of great significance for plant phenotype research. In this paper, we take lettuce as the research object, acquire its 3D point clouds, and perform instance segmentation with a deep learning method. Specifically, we construct a lettuce point cloud dataset consisting of 620 real and synthetic point clouds and fuse them to train a 3D instance segmentation network, PartNet, which takes 3D point clouds directly as input and outputs the instance segmentation of leaves. The results show that the 3D point cloud of each leaf is segmented and identified accurately: when tested on the 40 point clouds in the validation set, Average Precision reached 97.2% at an IoU threshold of 0.25 and 92.4% at an IoU threshold of 0.5, indicating that the constructed PartNet network has the potential to accurately segment leaf instances in 3D lettuce point clouds.
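The IoU thresholds above decide when a predicted leaf counts as a correct detection. The sketch below shows only that thresholding step, assuming each instance is represented as a set of point indices and that predictions are matched greedily one-to-one; a full Average Precision computation additionally ranks predictions by confidence, which is omitted here. The function names are illustrative and not taken from the paper.

```python
def instance_iou(pred_points, gt_points):
    """Intersection over union of two instances given as point-index sets."""
    pred, gt = set(pred_points), set(gt_points)
    union = pred | gt
    if not union:
        return 0.0
    return len(pred & gt) / len(union)


def count_matches(pred_instances, gt_instances, iou_threshold):
    """Number of predictions matched to a distinct ground-truth instance.

    Assumption: greedy matching in the given prediction order; each
    ground-truth leaf can be matched at most once.
    """
    unmatched = list(range(len(gt_instances)))
    matches = 0
    for pred in pred_instances:
        best_iou, best_idx = 0.0, None
        for idx in unmatched:
            iou = instance_iou(pred, gt_instances[idx])
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            unmatched.remove(best_idx)
            matches += 1
    return matches
```

A stricter threshold can only keep or reduce the number of matches, which is consistent with the lower score reported at 0.5 than at 0.25.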