This study presents a deep-learning (DL) methodology using 3-D convolutional neural networks (CNNs) to detect defects in carbon fiber-reinforced polymer (CFRP) composites through volumetric ultrasonic testing (UT) data. Acquiring large amounts of ultrasonic training data experimentally is expensive and time-consuming. To address this issue, a synthetic data generation method was extended to incorporate volumetric data. By preserving the complete volumetric data, complex preprocessing is reduced, and the model can utilize spatial and temporal information that is lost during imaging. This enables the model to exploit important features that might otherwise be overlooked. The performance of three architectures was compared. The first architecture is prevalent in the literature for the classification of volumetric datasets. The second demonstrated a hand-designed approach, modifying the first architecture to address the challenges of this specific task. A key modification was the use of cuboidal kernels to account for the large aspect ratios seen in ultrasonic data. The third architecture was discovered through neural architecture search (NAS) over a modified 3-D residual neural network (ResNet) search space. In addition, domain-specific augmentation methods were incorporated during training, resulting in significant improvements in model performance, with a mean accuracy improvement of 22.4% on the discovered architecture. The discovered architecture demonstrated the best performance, with a mean accuracy increase of 7.9% over the second-best model. It consistently detected all defects while maintaining a model size smaller than most 2-D ResNets. Each model had an inference time of less than 0.5 s, making them efficient for interpreting large amounts of data.
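To make the cuboidal-kernel idea concrete, here is a minimal PyTorch sketch of a 3-D convolution block with a kernel elongated along the A-scan (time) axis. The kernel sizes, channel counts, and input shape are illustrative assumptions, not the paper's actual configuration:

```python
# Minimal sketch (not the paper's exact architecture): a 3-D conv block
# with a cuboidal kernel, long along the time/depth axis to match the
# large aspect ratio of volumetric ultrasonic data.
import torch
import torch.nn as nn

class CuboidalConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # (depth, height, width) kernel: long along the A-scan axis,
        # short in the scan plane; padding chosen to preserve shape.
        self.conv = nn.Conv3d(in_ch, out_ch, kernel_size=(7, 3, 3),
                              padding=(3, 1, 1))
        self.bn = nn.BatchNorm3d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))

# One hypothetical UT volume: batch=1, channel=1, 256 time samples, 64x64 scan grid.
x = torch.randn(1, 1, 256, 64, 64)
y = CuboidalConvBlock(1, 16)(x)
print(y.shape)  # torch.Size([1, 16, 256, 64, 64])
```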
This paper addresses the problem of tracking objects that undergo rapid and significant appearance changes. We propose a novel coupled-layer visual model that combines the target's global and local appearance by interlacing two layers. The local layer in this model is a set of local patches that geometrically constrain the changes in the target's appearance. This layer probabilistically adapts to the target's geometric deformation, while its structure is updated by removing and adding local patches. The addition of these patches is constrained by the global layer, which probabilistically models the target's global visual properties, such as color, shape, and apparent local motion. The global visual properties are updated during tracking using the stable patches from the local layer. Through this coupled constraint paradigm between the adaptation of the global and local layers, we achieve more robust tracking through significant appearance changes. We experimentally compare our tracker to 11 state-of-the-art trackers. The experimental results on challenging sequences confirm that our tracker outperforms the related trackers in many cases by having a smaller failure rate as well as better accuracy. Furthermore, the parameter analysis shows that our tracker is stable over a range of parameter values.
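A highly simplified sketch of the coupled update described above, assuming patches are represented as feature rows and appearance as color histograms; `patch_hist` and all thresholds are hypothetical stand-ins for the paper's probabilistic machinery:

```python
# Illustrative coupled-layer update: prune unstable local patches, admit
# candidates only where the global color model agrees, and let stable
# patches refresh the global model.
import numpy as np

def update_layers(patches, weights, candidates, global_hist, patch_hist,
                  drop_thresh=0.2, add_thresh=0.5, lr=0.05):
    # Prune local patches whose tracking weight has decayed.
    keep = weights > drop_thresh
    patches, weights = patches[keep], weights[keep]
    # Admit a candidate patch only if the global appearance model
    # (histogram intersection here) supports it.
    for cand in candidates:
        sim = np.minimum(global_hist, patch_hist(cand)).sum()
        if sim > add_thresh:
            patches = np.vstack([patches, cand[None]])
            weights = np.append(weights, 1.0)
    # Stable patches update the global model with exponential forgetting.
    stable = patches[weights > 0.8]
    if len(stable):
        mean_hist = np.mean([patch_hist(p) for p in stable], axis=0)
        global_hist = (1 - lr) * global_hist + lr * mean_hist
    return patches, weights, global_hist
```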
This paper addresses the problem of estimating the motion of a camera as it observes the outline (or apparent contour) of a solid bounded by a smooth surface in successive image frames. In this context, the surface points that project onto the outline of an object depend on the viewpoint, and the only true correspondences between two outlines of the same object are the projections of frontier points, where the viewing rays intersect in the tangent plane of the surface. In turn, the epipolar geometry is easily estimated once these correspondences have been identified. Given the apparent contours detected in an image sequence, a robust procedure based on RANSAC and a voting strategy is proposed to simultaneously estimate the camera configurations and a consistent set of frontier point projections by enforcing the redundancy of multiview epipolar geometry. The proposed approach is, in principle, applicable to orthographic, weak-perspective, and affine projection models. Experiments with nine real image sequences are presented for the orthographic projection case, including a quantitative comparison with ground-truth data for the six data sets for which the latter information is available. Sample visual hulls have been computed from all image sequences for qualitative evaluation.
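A skeleton of the RANSAC-plus-voting strategy, in the spirit of the abstract. The `fit_model` and `residual` callables are assumptions standing in for the orthographic epipolar fit and the frontier-point consistency check:

```python
# RANSAC + voting skeleton: sample candidate frontier-point
# correspondences, fit an epipolar model, and let every consistent
# correspondence accumulate a vote across iterations.
import numpy as np

def ransac_vote(correspondences, fit_model, residual, n_sample=4,
                n_iters=2000, tol=1.0, rng=None):
    rng = rng or np.random.default_rng(0)
    votes = np.zeros(len(correspondences))
    best_model, best_inliers = None, 0
    for _ in range(n_iters):
        idx = rng.choice(len(correspondences), n_sample, replace=False)
        model = fit_model([correspondences[i] for i in idx])
        if model is None:
            continue
        errs = np.array([residual(model, c) for c in correspondences])
        inliers = errs < tol
        votes += inliers  # each consistent pair earns a vote
        if inliers.sum() > best_inliers:
            best_model, best_inliers = model, inliers.sum()
    # The most-voted correspondences approximate the frontier points.
    return best_model, np.argsort(votes)[::-1]
```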
With the advent of smartphones and tablets, video traffic on the Internet has increased enormously. With this in mind, in 2013 the High Efficiency Video Coding (HEVC) standard was released with the aim of reducing the bit rate (at the same quality) by 50% with respect to its predecessor. However, new content with greater resolutions and requirements appears every day, making it necessary to further reduce the bit rate. Perceptual video coding has recently been recognized as a promising approach to achieving high-performance video compression, and eye-tracking data can be used to create and verify these models. In this paper, we present a new algorithm for the bit rate reduction of screen-recorded sequences based on the visual perception of videos. An eye-tracking system is used during the recording to locate the fixation point of the viewer. Then, the area around that point is encoded with the base quantization parameter (QP) value, which increases when moving away from it. The results show that up to 31.3% of the bit rate may be saved when compared with the original HEVC-encoded sequence, without a significant impact on the perceived quality.
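A minimal sketch of the QP assignment this abstract describes: the block containing the fixation point keeps the base QP, and QP grows with distance from it. The step size, clamp, and grid dimensions are assumptions, not the paper's tuning:

```python
# Perceptual QP map: base QP at the gaze location, coarser quantization
# (higher QP) with distance from it, clamped to HEVC's valid range.
import numpy as np

def qp_map(width_blocks, height_blocks, fix_x, fix_y,
           base_qp=27, step=0.15, max_qp=51):
    ys, xs = np.mgrid[0:height_blocks, 0:width_blocks]
    dist = np.hypot(xs - fix_x, ys - fix_y)      # distance in block units
    qp = base_qp + step * dist                   # grows away from the gaze
    return np.clip(np.rint(qp), base_qp, max_qp).astype(int)

# Hypothetical 30x17 CTU grid (1920x1088 at 64-px CTUs), gaze near center.
print(qp_map(30, 17, fix_x=15, fix_y=8)[:3, :6])
```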
Underwater images suffer from color distortion and low contrast, because light is attenuated while it propagates through water. Attenuation under water varies with wavelength, unlike terrestrial images where attenuation is assumed to be spectrally uniform. The attenuation depends both on the water body and the 3D structure of the scene, making color restoration difficult. Unlike existing single-image underwater enhancement techniques, our method takes into account multiple spectral profiles of different water types. By estimating just two additional global parameters, the attenuation ratios of the blue-red and blue-green color channels, the problem is reduced to single-image dehazing, where all color channels have the same attenuation coefficients. Since the water type is unknown, we evaluate different parameters out of an existing library of water types. Each type leads to a different restored image, and the best result is automatically chosen based on color distribution. We also contribute a dataset of 57 images taken in different locations. To obtain ground truth, we placed multiple color charts in the scenes and calculated the scenes' 3D structure using stereo imaging. This dataset enables a rigorous quantitative evaluation of restoration algorithms on natural images for the first time.
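A sketch of the key reduction described above, assuming the blue channel's transmission map and the veiling light come from a single-image dehazing step; the exponent relation follows from the standard attenuation model, but the function and names are illustrative:

```python
# Given the blue transmission map and the two attenuation ratios, derive
# red and green transmissions by exponentiation, then invert the standard
# image-formation model I = J*t + A*(1 - t).
import numpy as np

def restore(img, t_blue, A, ratio_br, ratio_bg, t_min=0.1):
    # t_c = t_blue ** (beta_c / beta_blue); the given ratios are
    # beta_B/beta_R and beta_B/beta_G, so the exponents are reciprocals.
    t = np.stack([t_blue ** (1.0 / ratio_br),   # red
                  t_blue ** (1.0 / ratio_bg),   # green
                  t_blue], axis=-1)             # blue
    t = np.clip(t, t_min, 1.0)
    return (img - A) / t + A                    # recovered scene radiance J
```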
This paper focuses on using feature salience to evaluate the quality of a partition when dealing with hard clustering. It is based on the hypothesis that a good partition is an easy-to-label partition, i.e., a partition in which each cluster is made of salient features. This approach is mostly compared to usual approaches relying on distances between data, but also to more recent approaches based on entropy or stability. We show that our feature-based indexes outperform the compared indexes for optimal model selection: they are more efficient from the low- to the high-dimensional range and more robust to noise. To show the efficiency of our indexes in a real-life application, we consider the task of diachronic analysis on a textual dataset. We demonstrate that our approach yields interesting and relevant results in that context, while other approaches mostly lead to unusable results.
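A toy reading of the idea, not the paper's exact index: a feature counts as salient for a cluster when its in-cluster mean dominates its overall mean, and the partition score averages each cluster's share of salient features. The margin and the definition of salience here are assumptions:

```python
# Feature-salience quality score for a hard partition: higher means each
# cluster is characterized by more clearly dominant features, i.e. the
# partition is "easier to label".
import numpy as np

def salience_index(X, labels, margin=1.5):
    overall = X.mean(axis=0) + 1e-12
    scores = []
    for k in np.unique(labels):
        ratio = X[labels == k].mean(axis=0) / overall
        scores.append(np.mean(ratio > margin))  # fraction of salient features
    return float(np.mean(scores))
```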
This paper presents a comprehensive review of approaches across all components of chart image detection and classification to date. A set of 89 scientific papers is collected, analyzed, and sorted into four categories: chart-type classification, chart text processing, chart data extraction, and chart description generation. Detailed information about the problem formulation and research field is provided, along with an overview of the methods used in each category. Each paper's contribution is noted, including the essential information for researchers in this field. Finally, a comparison is made between the reported results. The state-of-the-art methods in each category are described, and a research direction is given. We have also analyzed the open challenges that still exist and require further attention.
Convolutional neural networks (CNNs) are widely used in machine learning (ML) applications such as image processing. CNNs require heavy computation to provide significant accuracy for many ML tasks. Therefore, implementing CNNs efficiently to improve performance with limited resources and without accuracy reduction is a challenge for ML systems. One architecture for the efficient execution of CNNs is the array-based accelerator, which consists of an array of similar processing elements (PEs). Array accelerators are popular high-performance architectures because they exploit parallel computing and data reuse. These accelerators are optimized for a set of CNN layers, not for individual layers. Using the same accelerator dimensions to compute all CNN layers, which vary in shape and size, leads to resource underutilization. We propose a flexible and scalable architecture for array-based accelerators that increases resource utilization by resizing PEs to better match the different shapes of CNN layers. The low-cost partial reconfiguration improves resource utilization and performance, resulting in a 23.2% reduction in the computation time of GoogLeNet compared to state-of-the-art accelerators. The proposed architecture also decreases the on-chip memory access rate by 26.5% with no accuracy loss.
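A back-of-the-envelope illustration of the underutilization problem a reconfigurable array addresses: a fixed PE array wastes PEs on layers whose output shapes do not tile it evenly. The layer shapes and array size below are hypothetical, not measured from GoogLeNet:

```python
# PE utilization of a fixed HxW array vs. one resized per layer.
import math

def utilization(out_rows, out_cols, pe_h, pe_w):
    tiles = math.ceil(out_rows / pe_h) * math.ceil(out_cols / pe_w)
    return (out_rows * out_cols) / (tiles * pe_h * pe_w)

layers = [(112, 112), (56, 56), (28, 28), (7, 7)]  # illustrative output maps
for r, c in layers:
    fixed = utilization(r, c, 16, 16)                    # one fixed 16x16 array
    resized = utilization(r, c, min(16, r), min(16, c))  # array resized to fit
    print(f"{r}x{c}: fixed {fixed:.0%}, resized {resized:.0%}")
# e.g. a 7x7 map uses only ~19% of a fixed 16x16 array, but 100% when resized.
```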
Deep neural networks have been tremendously successful at segmenting objects in images. However, it has been shown that they still have limitations on challenging problems such as the segmentation of medical images. The main reason for this lower success lies in the reduced size of the object in the image. In this paper we overcome this limitation through a cyclic collaborative framework, CyCoSeg. The proposed framework is based on a deep active shape model (D-ASM), which provides prior information about the shape of the object, and a semantic segmentation network (SSN). These two models collaborate to reach the desired segmentation by influencing each other: the SSN helps the D-ASM identify relevant keypoints in the image through an Expectation-Maximization formulation, while the D-ASM provides a segmentation proposal that guides the SSN. This cycle is repeated until both models converge. Extensive experimental evaluation shows that CyCoSeg boosts the performance of the baseline models, including several popular SSNs, while avoiding major architectural modifications. The effectiveness of our method is demonstrated on left ventricle segmentation on two benchmark datasets, where our approach achieves some of the most competitive results in segmentation accuracy. Furthermore, its generalization is demonstrated for lung and kidney segmentation in CT scans.
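A schematic of the collaboration loop described above. `ssn_predict` and `asm_fit` are placeholder callables (assumptions) standing in for the two models, and convergence is checked with a Dice overlap between successive shape proposals:

```python
# Cyclic collaboration loop: the SSN proposes a mask, the shape model
# fits its prior to that proposal, and the two alternate until the
# proposals stop changing.
import numpy as np

def dice(a, b):
    inter = np.logical_and(a, b).sum()
    return 2.0 * inter / (a.sum() + b.sum() + 1e-12)

def cycoseg_loop(image, ssn_predict, asm_fit, max_iters=10, tol=0.99):
    shape_prior = None
    for _ in range(max_iters):
        mask = ssn_predict(image, shape_prior)   # SSN guided by the ASM proposal
        proposal = asm_fit(image, mask)          # ASM keypoints fit to the mask
        if shape_prior is not None and dice(proposal, shape_prior) > tol:
            break                                # both models have converged
        shape_prior = proposal
    return proposal
```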
When considering sparse motion capture marker data, one typically struggles to balance overfitting by a high-dimensional blendshape system against underfitting caused by smoothness constraints. With the current trend towards using more and more data, our aim is not to fit the motion capture markers with a parameterized (blendshape) model or to smoothly interpolate a surface through the marker positions, but rather to find an instance in the high-resolution dataset that contains local geometry to fit each marker. Just as is true for typical machine learning applications, this approach benefits from a plethora of data, and thus we also consider augmenting the dataset via specially designed physical simulations that target the high-resolution dataset so that the simulation output lies on the same so-called manifold as the targeted data.
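A sketch of the data-driven lookup this abstract suggests: for each sparse marker, find the high-resolution mesh instance whose corresponding vertex best matches it, via a per-marker KD-tree over the dataset. The marker-to-vertex correspondence and all shapes are illustrative assumptions:

```python
# Nearest-instance lookup: for each marker, query a KD-tree built over
# the corresponding vertex's positions across all dataset meshes.
import numpy as np
from scipy.spatial import cKDTree

def best_instances(markers, dataset, marker_to_vertex):
    """markers: (M, 3) positions; dataset: (N, V, 3) high-res meshes;
    marker_to_vertex: length-M vertex indices paired with the markers."""
    choice = np.empty(len(markers), dtype=int)
    for m, v in enumerate(marker_to_vertex):
        tree = cKDTree(dataset[:, v, :])       # vertex v across all N meshes
        _, choice[m] = tree.query(markers[m])  # closest instance for marker m
    # Local geometry around vertex v is then taken from dataset[choice[m]].
    return choice
```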