The spatial and spectral information contained in hyperspectral images (HSIs) makes them useful in many fields. However, the rapid growth of HSI data places enormous pressure on data storage and real-time transmission. Research shows that hyperspectral compressive sensing (HCS) breaks through the bottleneck of the Nyquist sampling theorem and can relieve much of this pressure. Existing HCS methods try to design advanced compression sampling matrices or reconstruction algorithms, but cannot connect the two through a unified framework. To further improve reconstruction quality, a novel codec space-spectrum joint dense residual network (CDS2-DResN) is proposed. The CDS2-DResN is divided into a block compression sampling part and a reconstruction part. For block compression sampling, a coded convolutional layer (CCL) is leveraged to compressively sample the HSI. For measurement reconstruction, a deconvolution layer first produces an initial reconstruction, and a space-spectrum joint network then refines it. Moreover, the CCL and the reconstruction network are optimized in a unified framework, which simplifies the pre- and post-processing of HCS. Extensive experiments show that CDS2-DResN achieves excellent reconstruction quality at measurement rates of 0.25, 0.10, 0.04 and 0.01.
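As a rough illustration of block compressive sampling followed by a linear initial reconstruction, the NumPy sketch below compresses non-overlapping 8×8 blocks at a 0.25 measurement rate; the random sampling matrix and pseudo-inverse decoder are hypothetical stand-ins for the learned CCL and deconvolution layer, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
B, MR = 8, 0.25                      # block size and measurement rate
m = int(round(MR * B * B))           # measurements per block (16 for MR = 0.25)

# toy single-band "HSI" of 32x32 pixels, split into non-overlapping 8x8 blocks
img = rng.random((32, 32))
blocks = img.reshape(4, B, 4, B).transpose(0, 2, 1, 3).reshape(-1, B * B)

Phi = rng.standard_normal((m, B * B)) / np.sqrt(m)   # stand-in for the CCL weights
y = blocks @ Phi.T                                   # compressed measurements

# initial reconstruction: the pseudo-inverse plays the role of the deconvolution layer
x0 = y @ np.linalg.pinv(Phi).T
recon = x0.reshape(4, 4, B, B).transpose(0, 2, 1, 3).reshape(32, 32)
print(y.shape, recon.shape)
```

In the actual network the sampling operator is a learned convolution and a refinement network replaces the pseudo-inverse, but the measurement and reconstruction shapes follow the same pattern.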
Identifying travel modes from Global Navigation Satellite System (GNSS) trajectories is helpful for traffic management. In mode identification, motion features are extracted from trajectories to train classifiers. However, these features can be distorted by positioning noise when existing frameworks are migrated to poor-quality tracks. This study aims to answer how to eliminate the impact of positioning error on mode identification. Specifically, six widely used Trajectory Noise Reduction (TNR) methods were tested. Representative motion features were calculated and fed to several classical classifiers to evaluate the effect of TNR. Then, the extent to which TNR restores motion features was analysed using information gain. To verify the robustness of these methods, multiple noise scenarios were designed to simulate possible positioning noise. The results show that trajectory smoothing methods outperform outlier elimination methods regardless of the type and magnitude of noise. In particular, Gaussian kernel smoothing achieves the best results in almost all noise scenarios. For untested TNR methods that require a time-window radius parameter, a 30-s window is a good candidate. Moreover, visual verification alone cannot ensure the best TNR method for travel mode identification.
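For intuition, Gaussian kernel smoothing over a time window can be sketched as follows; the 1-Hz track, noise level, and kernel bandwidth are illustrative assumptions, with the window radius set to the 30-s value suggested above:

```python
import numpy as np

def gaussian_smooth(t, x, radius=30.0, sigma=10.0):
    """Smooth positions x sampled at times t with a Gaussian kernel,
    truncated to a +/- `radius`-second time window."""
    out = np.empty_like(x, dtype=float)
    for i, ti in enumerate(t):
        mask = np.abs(t - ti) <= radius
        w = np.exp(-0.5 * ((t[mask] - ti) / sigma) ** 2)
        out[i] = np.sum(w * x[mask]) / np.sum(w)
    return out

# noisy 1-Hz track: true easting of a 1.5 m/s walk plus Gaussian positioning noise
rng = np.random.default_rng(1)
t = np.arange(0.0, 120.0)
truth = 1.5 * t
noisy = truth + rng.normal(0, 5.0, t.size)
smooth = gaussian_smooth(t, noisy)

# smoothing should cut the RMS error versus the raw track
rmse_raw = np.sqrt(np.mean((noisy - truth) ** 2))
rmse_smooth = np.sqrt(np.mean((smooth - truth) ** 2))
print(rmse_raw, rmse_smooth)
```

The kernel bandwidth and window radius would need tuning per sampling rate; the study's finding is only that the smoothing family, not a specific parameterisation, is the robust choice.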
Given the rapid growth of commercial pig farms, automatic monitoring of pig behaviour is becoming more important for assisting farmers. Recent advances in convolutional neural networks may pave the way for new solutions. However, the primary task of individual pig detection under real-world conditions remains challenging. Previous studies used anchor-based frameworks that are unsuitable for such crowded scenarios with extreme overlapping. Furthermore, most applications focus on specific levels of brightness, farm facilities, or pig species without considering generalization. To tackle these problems, an anchor-free pig detection method based on pig centre localization is first proposed. Then, a novel negative training data augmentation technique is introduced that uses examples from outside the training distribution. Furthermore, test-time augmentation is employed to improve model performance. Experiments are conducted on two online pig detection datasets; the network surpasses state-of-the-art results on both. The proposed method also outperforms the latest anchor-free techniques commonly used in crowded scenarios, and can detect pigs individually even when their bounding boxes overlap strongly or occlude each other. Moreover, the real-time system achieves a 10% improvement in $F_{\text{measure}}$ when tested in unconstrained real-world conditions.
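Centre-based anchor-free detectors typically decode detections by keeping local maxima of a predicted centre heatmap, which is why overlapping boxes are not a problem as long as the centres are distinct. A minimal sketch of that decoding step (the heatmap values and threshold are invented for illustration):

```python
import numpy as np

def decode_centres(heatmap, thresh=0.5):
    """Keep cells that are 3x3 local maxima above `thresh` -- the usual
    decoding step for a centre-point heatmap."""
    H, W = heatmap.shape
    padded = np.pad(heatmap, 1, constant_values=-np.inf)
    # 3x3 max filter built from the nine shifted views (no scipy needed)
    neigh = np.max(
        [padded[dy:dy + H, dx:dx + W] for dy in range(3) for dx in range(3)],
        axis=0,
    )
    ys, xs = np.where((heatmap == neigh) & (heatmap >= thresh))
    return list(zip(ys.tolist(), xs.tolist()))

# toy heatmap with two nearby responses: only the true peaks survive
hm = np.zeros((8, 8))
hm[2, 2], hm[2, 3] = 0.9, 0.6   # adjacent response at (2, 3) is suppressed
hm[5, 6] = 0.8
print(decode_centres(hm))        # peaks at (2, 2) and (5, 6)
```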
Data augmentation is an important pre-processing step for object detection in 2D images and 3D point clouds. However, studies on multimodal data augmentation are extremely limited compared to single-modal work. Moreover, simultaneously ensuring consistency and rationality when pasting both image and point cloud samples is a major challenge in multimodal methods. In this study, a novel multimodal data augmentation method based on ground truth sampling (GT sampling) is proposed for generating content-rich synthetic scenes. A GT database and a scene ground database based on the raw training set are initially built, after which the context of the image and point cloud is used to guide the paste location and the filtering strategy for GT samples. The proposed method avoids the cluttered features caused by randomly pasting samples; the image context information helps the model learn the correlation between an object and its environment more comprehensively, and the point cloud context information reduces occlusion for long-distance objects. The effectiveness of the proposed strategy is demonstrated on the publicly available KITTI dataset. Utilizing the multimodal 3D detector MVXNet as an implementation tool, our experiments evaluate different superimposition strategies ranging from context-free sample pasting to context-guided new training scenes. In comparison with existing GT sampling methods, our method exhibits a relative performance improvement of 15% on benchmark datasets. In ablation studies, our sample pasting strategy achieves a +2.81% gain compared with previous work. In conclusion, considering the multimodal context of modelled objects is crucial for placing them in the correct environment.
Depth maps provide acquirable and irreplaceable geometric information that significantly enhances traditional color images. RGB and Depth (RGBD) images have been widely used in various image analysis applications, but their use is still limited by the challenges of combining different modalities and the misalignment between color and depth. In this paper, a Fully Aligned Fusion Network (FAFNet) for RGBD semantic segmentation is presented. To improve cross-modality fusion, a new RGBD fusion block is proposed: features from color images and depth maps are first fused by an attention cross-fusion module and then aligned by a semantic flow. A multi-layer structure is also designed to hierarchically apply the RGBD fusion block, which not only eases the low resolution and noise of depth maps but also reduces the loss of semantic features during upsampling. Quantitative and qualitative evaluations on both the NYU-Depth V2 and SUN RGB-D datasets demonstrate that FAFNet outperforms state-of-the-art RGBD semantic segmentation methods.
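The alignment step of a semantic-flow module can be sketched as bilinear warping of a feature map by a per-pixel offset field; this is a generic sketch of flow-based feature alignment, not the FAFNet code:

```python
import numpy as np

def flow_warp(feat, flow):
    """Bilinearly warp a C x H x W feature map by a 2 x H x W offset field
    (in pixels): out(p) = feat(p + flow(p)), with border clipping."""
    C, H, W = feat.shape
    ys, xs = np.mgrid[0:H, 0:W].astype(float)
    sx = np.clip(xs + flow[0], 0, W - 1)
    sy = np.clip(ys + flow[1], 0, H - 1)
    x0, y0 = np.floor(sx).astype(int), np.floor(sy).astype(int)
    x1, y1 = np.minimum(x0 + 1, W - 1), np.minimum(y0 + 1, H - 1)
    wx, wy = sx - x0, sy - y0
    return (feat[:, y0, x0] * (1 - wx) * (1 - wy)
            + feat[:, y0, x1] * wx * (1 - wy)
            + feat[:, y1, x0] * (1 - wx) * wy
            + feat[:, y1, x1] * wx * wy)

# a constant rightward flow samples each pixel's right neighbour
C, H, W = 1, 4, 5
ramp = np.tile(np.arange(W, dtype=float), (H, 1))[None]
flow = np.zeros((2, H, W)); flow[0] = 1.0
warped = flow_warp(ramp, flow)
print(warped[0, 0])   # [1. 2. 3. 4. 4.] -- last column clipped at the border
```

In a real network the flow field is predicted from the two feature maps being fused, so the warp learns to cancel their misalignment.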
Semantic segmentation is a classical problem in computer vision and is important in the field of autonomous driving. Although significant progress has been achieved in semantic segmentation, generalization to unknown domains is still challenging. To address this problem, a semantic segmentation method, ImDeeplabV3plus with instance selective whitening loss, is proposed in this paper. DeeplabV3plus is selected as the baseline. To enhance the representation of regions of interest, a coordinate attention (CA) mechanism is added. To better integrate multiple low-level features, adaptively spatial feature fusion (ASFF) is employed to learn the importance of features at different levels for each location. To better cope with domain changes, an instance selective whitening (ISW) loss is introduced in the early stages of the backbone. The model is trained on the Cityscapes dataset and then applied to the unseen RobotCar dataset. Compared with DeeplabV3plus, the authors' ImDeeplabV3plus model shows a 1.29% mIoU improvement; adding the ISW loss yields a further 2.08% mIoU improvement over ImDeeplabV3plus. Experimental results show that the proposed method is simple and improves domain generalization.
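The core idea behind a whitening loss can be sketched as penalizing the off-diagonal channel covariance of instance-standardized features, which encourages the network to drop style-specific correlations; the masking that makes ISW "selective" is omitted here, so this is only an illustrative simplification:

```python
import numpy as np

def whitening_loss(feat):
    """Toy instance-whitening penalty on a C x (H*W) feature map:
    instance-standardise each channel, then penalise the mean absolute
    off-diagonal entry of the channel covariance."""
    C, HW = feat.shape
    f = feat - feat.mean(axis=1, keepdims=True)
    f = f / (f.std(axis=1, keepdims=True) + 1e-5)
    cov = f @ f.T / HW
    off = cov - np.diag(np.diag(cov))
    return float(np.mean(np.abs(off)))

rng = np.random.default_rng(0)
decorrelated = rng.standard_normal((4, 1000))
base = rng.standard_normal((1, 1000))
correlated = base + 0.1 * rng.standard_normal((4, 1000))  # channels share one signal
print(whitening_loss(decorrelated), whitening_loss(correlated))
```

Features whose channels share a common (e.g. style-induced) signal incur a much larger penalty than decorrelated ones.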
Intelligent transportation and smart city applications are currently on the rise. In many applications, diverse and accurate sensor perception of vehicles is crucial. Relevant information could be conveniently acquired with traffic cameras, as there is an abundance of cameras in cities. However, cameras have to be calibrated in order to acquire position data of vehicles. This paper proposes a novel automated calibration approach for partially connected vehicle environments. The approach utilises Global Navigation Satellite System positioning information shared by connected vehicles. Corresponding vehicle Global Navigation Satellite System locations and image coordinates are used to fit a direct transformation between image and ground-plane coordinates. The proposed approach was validated with a research vehicle equipped with a Real-Time Kinematic-corrected Global Navigation Satellite System receiver driving past three different cameras. On average, the camera estimates contained errors of 1.5 to 2.0 m when compared with the Global Navigation Satellite System positions of the vehicle. Considering the considerable length of the monitored road sections, up to 140 m, the accuracy of camera-based localisation should be adequate for a number of intelligent transportation applications. In the future, the calibration approach should be evaluated with a fusion of stand-alone Global Navigation Satellite System positioning and inertial measurements, to validate the methodology with more common vehicle sensor equipment.
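Fitting a direct transformation between image and ground-plane coordinates from matched pixel/GNSS pairs can be sketched with the standard direct linear transform for a planar homography; the synthetic camera mapping and point coordinates below are invented for the self-check, and a real calibration would use many noisy correspondences with a least-squares or RANSAC fit:

```python
import numpy as np

def fit_homography(img_pts, gnd_pts):
    """Direct linear transform: fit the 3x3 image -> ground-plane homography
    from matched (pixel, easting/northing) pairs; >= 4 pairs needed."""
    A = []
    for (u, v), (x, y) in zip(img_pts, gnd_pts):
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    _, _, Vt = np.linalg.svd(np.asarray(A, float))
    return Vt[-1].reshape(3, 3)      # null-space vector, up to scale

def project(H, pt):
    p = H @ np.array([pt[0], pt[1], 1.0])
    return p[:2] / p[2]

# synthetic check: recover a known ground-plane mapping from 4 correspondences
H_true = np.array([[0.05, 0.01, 2.0], [0.0, 0.08, 5.0], [0.0, 0.0005, 1.0]])
img = [(100, 200), (400, 210), (120, 480), (420, 470)]
gnd = [project(H_true, p) for p in img]
H = fit_homography(img, gnd)
print(project(H, (250, 300)), project(H_true, (250, 300)))
```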
Coronavirus Disease 2019 (Covid-19) swept the world in early 2020, placing global health under threat. Automated lung infection detection using chest X-ray images has great potential for enhancing the traditional Covid-19 treatment strategy. However, detecting infected regions in chest X-ray images poses several challenges, including large variance among infected features with similar spatial characteristics, and multi-scale variation in the texture, shape and size of infected regions. Moreover, the high parameter counts of transfer-learning models also constrain the deployment of deep convolutional neural network (CNN) models in real-time environments. A novel lightweight Covid-19 CNN (LW-CovidNet) is proposed to automatically detect Covid-19-infected regions in chest X-ray images and address these challenges. The proposed hybrid method integrates standard and depth-wise separable convolutions to aggregate high-level features and to compensate for information loss by increasing the receptive field of the model. The boundaries of disease regions are then enhanced via an edge-attention method that applies heatmaps for accurate detection. Extensive experiments indicate that the proposed LW-CovidNet surpasses most cutting-edge detection methods and advances the state of the art. It is envisaged that, with reliable accuracy, this method can be introduced into clinical practice in the future.
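The parameter saving that makes depth-wise separable convolutions attractive for a lightweight model is easy to verify with a quick count; the 128-channel, 3×3 configuration below is an arbitrary example, not taken from LW-CovidNet:

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depth-wise k x k (one filter per input channel) + point-wise 1x1."""
    return c_in * k * k + c_in * c_out

# e.g. 128 -> 128 channels with 3x3 kernels
std = conv_params(128, 128, 3)          # 147456
dws = dw_separable_params(128, 128, 3)  # 1152 + 16384 = 17536
print(std, dws, round(std / dws, 1))    # roughly an 8x parameter reduction
```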
Cross-modality person re-identification (Re-ID) aims to retrieve a query identity from red, green, blue (RGB) images or infrared (IR) images. Many approaches have been proposed to reduce the distribution gap between the RGB and IR modalities, but they ignore the valuable collaborative relationship between them. Hybrid Mutual Learning (HML) for cross-modality person Re-ID is proposed, which builds this collaborative relationship through mutual learning over local features and triplet relations. Specifically, HML contains local-mean mutual learning and triplet mutual learning, which transfer local representational knowledge and structural geometry knowledge, respectively, so as to reduce the gap between the RGB and IR modalities. Furthermore, Hierarchical Attention Aggregation is proposed to fuse local feature maps and local feature vectors, enriching the information fed to the classifier. Extensive experiments on two commonly used data sets, SYSU-MM01 and RegDB, verify the effectiveness of the proposed method.
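Mutual learning between two branches is commonly driven by a symmetric KL divergence between their class posteriors, pulling the RGB and IR predictions toward each other. A generic sketch (the logits are invented, and HML's actual local-mean and triplet terms are richer than this):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def mutual_kl(logits_rgb, logits_ir):
    """Symmetric KL between the two branches' class posteriors --
    the usual mutual-learning signal."""
    p, q = softmax(logits_rgb), softmax(logits_ir)
    kl_pq = np.sum(p * (np.log(p) - np.log(q)), axis=-1)
    kl_qp = np.sum(q * (np.log(q) - np.log(p)), axis=-1)
    return float(np.mean(kl_pq + kl_qp))

rgb = np.array([[2.0, 0.5, -1.0]])
ir_near = np.array([[1.8, 0.6, -0.9]])   # branch that roughly agrees with RGB
ir_far = np.array([[-1.0, 2.0, 0.5]])    # branch that disagrees
print(mutual_kl(rgb, ir_near), mutual_kl(rgb, ir_far))
```

Minimising this term during training makes each branch a soft teacher for the other, which is the collaborative relationship the abstract refers to.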
Obtaining accurate segmentation of central serous chorioretinopathy in spectral-domain optical coherence tomography (SD-OCT) is critical for determining disease severity. Although existing methods achieve considerable segmentation results, they depend heavily on large-scale data with high-quality annotations. In addition, the lesions show large shape variation across patients, which is often difficult to encode. To address these problems, we propose a fine-to-coarse-to-fine weakly supervised framework. Specifically, a global alternate max-avg pooling (GTP) network is employed to accurately locate lesion regions using only image-level annotations. A network module based on the GTP network and a semantic transfer module are proposed to iteratively guide the network to discover and expand the target lesion regions. Then, 3D grey-level distribution histograms are used to generate pseudo-volumetric labels. Finally, a novel 3D level-set loss function is proposed to perform coarse-to-fine volumetric segmentation. Experiments on a challenging dataset demonstrate that the performance of the proposed method approaches that of models trained with pixel-level supervision.