ISBN: 9781665475938 (print)
This paper applies Time-Frequency Analysis (TFA) techniques from signal processing to computer vision tasks. Our main idea is to build a simple network architecture that does not need two or more convolutional neural networks (CNNs): hidden features are analyzed with the Discrete Wavelet Transform (DWT) and sent into filters as weights through convolutions, transformers, or other methods. This idea does not require a network with two or more stages; instead, we apply TFA directly to a CNN to build a one-stage network. Networks built this way not only keep their strong performance but also consume fewer computing resources. In this paper, we mainly apply DWT to a CNN to solve image inpainting problems. The results show that our model works stably in the frequency domain and achieves free-form image inpainting.
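To make the DWT step mentioned in this abstract concrete, the following is a minimal sketch of a one-level 2D Haar transform on a small grid of values. It is not the paper's network: the function name `haar_dwt2`, the choice of the Haar wavelet, and the averaging normalization are assumptions made purely for illustration.

```python
def haar_dwt2(image):
    """One-level 2D Haar DWT of a 2D list with even height and width.

    Returns four sub-bands, each half the input size in both directions:
    low (local average), col_diff (left minus right), row_diff (top minus
    bottom) and diag (diagonal difference). Averaging normalization is
    used for readability; it is an illustrative choice.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must both be even")
    low, col_diff, row_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        bands = ([], [], [], [])
        for c in range(0, cols, 2):
            # The four pixels of the current 2x2 block.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            bands[0].append((a + b + d + e) / 4)  # average of the block
            bands[1].append((a - b + d - e) / 4)  # left column minus right column
            bands[2].append((a + b - d - e) / 4)  # top row minus bottom row
            bands[3].append((a - b - d + e) / 4)  # diagonal difference
        low.append(bands[0])
        col_diff.append(bands[1])
        row_diff.append(bands[2])
        diag.append(bands[3])
    return low, col_diff, row_diff, diag


# A 2x2 block whose left column is brighter than its right column.
low, col_diff, row_diff, diag = haar_dwt2([[4, 2], [4, 2]])
# low == [[3.0]], col_diff == [[1.0]], row_diff == [[0.0]], diag == [[0.0]]
```

In a network such as the one described, sub-bands like these would presumably be computed on hidden feature maps rather than on raw pixels; how they are then turned into filter weights is not specified in the abstract.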
High-resolution (HR) spinal endoscopic images are essential to enhance the surgeon’s visual presence for the guidance of surgical procedures. However, available image super-resolution methods, especially deep learnin...
ISBN: 9781728180687 (print)
Perceptual organization is the process of assigning each part of a scene to a group of associated features that belong to the same organization. In the twentieth century, Gestalt psychologists formalized how image features tend to be grouped by giving a set of organizing principles. In this paper, we propose an approach for detecting perceptual groups in an image. We are mainly interested in features grouped by the Gestalt law of proximity. We design an object-based model within a stochastic framework using a marked point process (MPP), and we use a Bayesian learning method to extract perceptual groups in a scene. Tests on synthetic images show that the proposed model efficiently detects perceptual groups in noisy images.
The fundamental task in high-speed and high-accuracy CCTV video surveillance system is attention-based image visual point detection and matching. The image viewpoints are different at different instants of time, so th...
ISBN: 9781665475938 (print)
Spatial frequency analysis and transforms serve a central role in most engineered image and video lossy codecs, but are rarely employed in neural network (NN)-based approaches. We propose a novel NN-based image coding framework that utilizes forward wavelet transforms to decompose the input signal by spatial frequency. Our encoder generates separate bitstreams for each latent representation of low and high frequencies. This enables our decoder to selectively decode bitstreams in a quality-scalable manner. Hence, the decoder can produce an enhanced image by using an enhancement bitstream in addition to the base bitstream. Furthermore, our method is able to enhance only a specific region of interest (ROI) by using a corresponding part of the enhancement latent representation. Our experiments demonstrate that the proposed method shows competitive rate-distortion performance compared to several non-scalable image codecs. We also showcase the effectiveness of our two-level quality scalability, as well as its practicality in ROI quality enhancement.
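The quality-scalable decoding described in this abstract can be illustrated with a toy 1D analogy: a low-frequency "base" layer alone gives a coarse reconstruction, and adding the high-frequency "enhancement" layer, optionally only inside a region of interest, restores detail. This is only a sketch under strong assumptions: it uses a plain Haar transform with no neural network, quantization, or entropy coding, and the names `haar_encode`, `scalable_decode`, and `roi` are hypothetical.

```python
def haar_encode(signal):
    """Split an even-length signal into a base (pairwise averages) and an
    enhancement (pairwise half-differences) layer."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = range(0, len(signal), 2)
    base = [(signal[i] + signal[i + 1]) / 2 for i in pairs]
    enhancement = [(signal[i] - signal[i + 1]) / 2 for i in pairs]
    return base, enhancement


def scalable_decode(base, enhancement=None, roi=None):
    """Reconstruct the signal from the base layer, optionally refined.

    enhancement=None decodes the base layer only. When roi is a set of
    coefficient indices, the enhancement is applied only at those indices.
    """
    out = []
    for i, low in enumerate(base):
        use_detail = enhancement is not None and (roi is None or i in roi)
        high = enhancement[i] if use_detail else 0.0
        out.extend([low + high, low - high])
    return out


base, enhancement = haar_encode([8, 4, 6, 2])
coarse = scalable_decode(base)                         # [6.0, 6.0, 4.0, 4.0]
full = scalable_decode(base, enhancement)              # [8.0, 4.0, 6.0, 2.0]
partial = scalable_decode(base, enhancement, roi={1})  # [6.0, 6.0, 6.0, 2.0]
```

The `partial` result shows the ROI idea in miniature: only the second pair of samples is restored exactly, while the first pair stays at base quality.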
ISBN: 9781665475938 (print)
Applying encryption technology to image retrieval can ensure the security and privacy of personal images. Related research in this field has focused on combining encryption algorithms with hand-crafted feature extraction. Many existing encrypted image retrieval schemes either cannot prevent feature leakage and file size increase or cannot achieve satisfactory retrieval performance. In this paper, a new end-to-end encrypted image retrieval scheme is presented. First, images are encrypted using block rotation, new orthogonal transforms, and block permutation during the JPEG compression process. Second, we combine the triplet loss and the cross-entropy loss to train a network model, which contains gMLP modules, by end-to-end learning to extract features from cipher-images. Compared with manual feature extraction, such as extracting color histograms, the end-to-end mechanism reduces manual effort. Experimental results show that our scheme achieves good retrieval performance while remaining compression-friendly and leaking no features.
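As a rough illustration of combining the two losses named in this abstract, the sketch below adds a triplet loss on embeddings to a cross-entropy loss on class logits. The abstract does not say how the two terms are weighted or which distance is used, so the Euclidean distance, the `margin` default, and the `weight` parameter are all assumptions, as are the function names.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss using Euclidean distance (assumed here)."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


def cross_entropy(logits, label):
    """Cross-entropy of softmax(logits) against an integer class label."""
    peak = max(logits)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[label]


def combined_loss(anchor, positive, negative, logits, label,
                  margin=1.0, weight=1.0):
    """Triplet loss plus a weighted cross-entropy term (weighting assumed)."""
    return (triplet_loss(anchor, positive, negative, margin)
            + weight * cross_entropy(logits, label))


# The negative is already far from the anchor, so only the
# classification term contributes to the total.
loss = combined_loss([0.0, 0.0], [0.0, 0.0], [3.0, 4.0],
                     logits=[0.0, 0.0], label=0)
# loss equals math.log(2), about 0.693
```

In the scheme described, the embeddings and logits would come from the gMLP-based network applied to cipher-images; here they are plain lists of numbers.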
This paper proposes a novel methodology that utilizes a newly developed visual Internet of Things (IoT) system for resilient natural disaster mitigation. This system enables the detection of disasters through remote control functions integrated with visual IoT sensors and artificial intelligence (AI)-based image processing of images captured by these sensors. The system is designed using commercial off-the-shelf (COTS) components to reduce installation costs, making it feasible for deployment worldwide, including in developing countries. In 2023, the proposal for this system was presented to the ITU-D with the aim of widespread implementation in developing nations. The paper introduces innovative methodologies for visual IoT in international disaster mitigation, accompanied by use cases and detailed technological insights.
ISBN: 9783031113451 (print)
The proceedings contain 97 papers. The special focus in this conference is on Computer Vision and Image Processing. The topics include: Automatic Classification of Sedimentary Rocks Towards Oil Reservoirs Detection; Signature2Vec - An Algorithm for Reference Frame Agnostic Vectorization of Handwritten Signatures; Leaf Segmentation and Counting for Phenotyping of Rosette Plants Using Xception-style U-Net and Watershed Algorithm; Fast and Secure Video Encryption Using Divide-and-Conquer and Logistic Tent Infinite Collapse Chaotic Map; Visual Localization Using Capsule Networks; Detection of Cataract from Fundus Images Using Deep Transfer Learning; Brain Tumour Segmentation Using Convolution Neural Network; Signature Based Authentication: A Multi-label Classification Approach to Detect the Language and Forged Sample in Signature; A Data-Set and a Real-Time Method for Detection of Pointing Gesture from Depth Images; Deep-TDRS: An Integrated System for Handwritten Text Detection-Recognition and Conversion to Speech Using Deep Learning; VISION HELPER: CNN Based Real Time Navigator for the Visually Impaired; Structure-Texture Decomposition-Based Enhancement Framework for Weakly Illuminated Images; Low Cost Embedded Vision System for Location and Tracking of a Color Object; Towards Label-Free Few-Shot Learning: How Far Can We Go?; AB-net: Adult-Baby Net; Polarimetric SAR Classification: Fast Learning with k-Maximum Likelihood Estimator; Leveraging Discriminative Cues for Masked Face Recognition in Post COVID World; Pretreatment Identification of Oral Leukoplakia and Oral Erythroplakia Metastasis Using Deep Learning Neural Networks; Soft Biometric Based Person Retrieval for Burglary Investigation; A Deep Learning Framework for the Classification of Lung Diseases Using Chest X-Ray Images; Computer Aided Diagnosis of Autism Spectrum Disorder Based on Thermal Imaging; Scene Graph Generation with Geometric Context; Deep Color Spaces for Fingerphoto Presentation Attack Detection in Mobile Devices; Can
Appearance, or external features, is one of the important aspects; it impacts the consumer’s market value, desires, and choice but also, to a certain degree, its internal consistency. Color, texture, size, shape, an...
ISBN: 9781665475938 (print)
Recently, various compressed video quality enhancement technologies have been proposed to overcome visual artifacts. Most existing methods rely on optical flow or deformable alignment to explore the spatiotemporal information across frames. However, inaccurate motion estimation and the training instability of deformable convolution are detrimental to reconstruction performance. In this paper, we design a bi-directional recurrent network equipped with enhanced deformable alignment and attention-guided aggregation to promote information flow among frames. For the alignment, a pair of scale and shift parameters is learned to modulate optical flows into new offsets for deformable convolution. Furthermore, a preference-oriented attention aggregation strategy is designed for temporal information fusion. The strategy synthesizes global information from the inputs to modulate features for effective fusion. Extensive experiments show that the proposed method performs well both quantitatively and qualitatively.
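One plausible reading of the scale-and-shift modulation in this abstract is an elementwise affine map from optical flow vectors to deformable-convolution offsets. The sketch below assumes exactly that form, offset = scale * flow + shift, and uses fixed numbers where the paper learns the parameters; the function name `modulate_flow` is hypothetical.

```python
def modulate_flow(flow, scale, shift):
    """Map per-pixel flow vectors (dx, dy) to offsets with an affine transform.

    flow, scale and shift are equal-length lists of (x, y) pairs. In the
    described method, scale and shift would be predicted by the network;
    here they are plain inputs for illustration.
    """
    if not (len(flow) == len(scale) == len(shift)):
        raise ValueError("flow, scale and shift must have the same length")
    return [
        (s[0] * f[0] + b[0], s[1] * f[1] + b[1])
        for f, s, b in zip(flow, scale, shift)
    ]


# One pixel: halve and shift the horizontal motion, double the vertical motion.
offsets = modulate_flow([(2.0, -1.0)], [(0.5, 2.0)], [(1.0, 0.0)])
# offsets == [(2.0, -2.0)]
```

With scale fixed at 1 and shift at 0 the offsets reduce to the raw optical flow, which may be one reason such a parameterization helps stabilize training, although the abstract does not state this.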