Most existing optimization methods for neural architecture search (NAS), including evolutionary algorithms, reinforcement learning and gradient-based approaches, have not employed memory strategies explicitly, which m...
To perform advanced surveillance, Unmanned Aerial Vehicles (UAVs) require the execution of edge-assisted computer vision (CV) tasks. In multi-hop UAV networks, the successful transmission of these tasks to the edge is...
In applications such as digital signal processing and machine learning, the accuracy of internal operations is not so strict due to the limitation of human perception. Approximate computing has been focused as an effe...
The vision system is more perfect than any artificial digital image signal processing system. Bionics is a good way to improve the technology of digital image processing. Sparse coding is an artificial neural network ...
In matrix recovery, an unknown matrix is reconstructed from a small number of noisy measurements. Deep learning-based methods, such as deep generative models, provide stronger priors that reduce the number of measurements needed during image recovery. However, such methods require the recovered data to lie within the range of the generator; otherwise the recovery error is large. To circumvent this problem, this paper proposes Low-Rank-Gen, a framework for matrix recovery from limited measurements that uses a low-rank approximation to characterize the deviation from the generator. Theoretically, we propose the Matrix Set-Restricted Eigenvalue Condition (M-S-REC) and prove the existence of decoders, together with an upper bound on the reconstruction error for a given number of measurements corresponding to such a decoder. Empirically, we observe consistent improvements in reconstruction accuracy and PSNR over competing approaches.
ISBN: 9798350330649 (digital)
ISBN: 9798350330656 (print)
The use of data analytics to gain insights into the game of basketball has surged in the past decade. Leagues such as the National Basketball Association are continuously exploring innovative methods to analyze game data, an approach that has significantly influenced the dynamics of the game. These analyses, however, require a growing amount of data, which is traditionally annotated by humans. This work proposes a three-stage system that automatically acquires relevant basketball game data from broadcast video. The first stage combines an object detector with a tracking algorithm to extract the main elements present in a basketball game video. Next, the players' visual information is analyzed to identify the players based on pixel color analysis and number recognition. Finally, a statistics generation algorithm assigns the game events to the corresponding player and team, so that the system can be used as an aid for box score annotation in major leagues, low-cost annotation in amateur games, or in-depth game video analysis.
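The pixel color analysis mentioned in the second stage could be sketched as follows. This is only an illustration under stated assumptions, not the authors' method: the function names, the reference jersey colors in `TEAM_COLOURS`, and the use of a mean RGB value with a squared-distance comparison are all hypothetical choices made for the example.

```python
# Minimal sketch: assign a detected player crop to a team by comparing the
# crop's mean RGB colour with a reference jersey colour per team.
# All names and colour values here are illustrative assumptions.

def mean_colour(pixels):
    """Return the mean (R, G, B) of a non-empty list of RGB tuples."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def assign_team(pixels, team_colours):
    """Return the team whose reference colour is closest to the crop's mean colour."""
    mean = mean_colour(pixels)

    def squared_distance(colour):
        return sum((m - c) ** 2 for m, c in zip(mean, colour))

    return min(team_colours, key=lambda team: squared_distance(team_colours[team]))


# Hypothetical reference colours (light home jersey, red away jersey).
TEAM_COLOURS = {"home": (240, 240, 240), "away": (200, 30, 40)}

# A tiny stand-in for the pixels inside one player's bounding box.
red_crop = [(210, 40, 50), (190, 25, 35), (205, 35, 45)]
team = assign_team(red_crop, TEAM_COLOURS)
```

A real system would likely restrict the crop to the jersey region and work in a color space that is less sensitive to lighting; the sketch only shows the nearest-reference-color idea.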
ISBN: 9798350368741 (digital)
ISBN: 9798350368758 (print)
Determining the origin of a digital image or video, namely device source identification, is widely used in courtroom evidence and copyright protection. Currently, device source identification primarily focuses on images captured using a single camera with default settings. However, with the advancement of imaging technology, many smartphones are now equipped with multiple cameras and various shooting modes, which may pose a significant challenge to device source identification. Therefore, to assess the performance of image source identification algorithms on modern smartphones and to promote further research, it is crucial to build a dataset of images and videos captured by modern smartphones. In this paper, we present ForensiCam-215K, a large-scale image and video dataset for forensic analysis. The dataset includes over 215K media items captured by 130 modern smartphones from 10 major brands. We used the latest equipment to capture images from the main, wide-angle, and telephoto cameras in six different shooting modes, and the media were collected under a strictly controlled procedure to reduce the bias caused by differences in the acquisition process between devices. Additionally, we used the Photo Response Non-Uniformity (PRNU) method to perform device source identification tests on the dataset. The results indicate that device source identification is a challenging task, especially for images and videos captured by smartphones with multiple cameras and various shooting modes. The dataset will be released as open source and made freely available to the multimedia forensics research community at https://***/dswdsw21072/ForensiCam-215K.
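The general idea behind PRNU-based source identification can be illustrated with a toy example. This is a heavily simplified sketch, not the procedure used in the paper: the 3x3 mean filter is a crude stand-in for a proper denoiser, the fingerprint is a plain average of noise residuals, the test data are synthetic, and every function name is an assumption made for illustration.

```python
import random

# Toy PRNU-style sketch: estimate a sensor fingerprint from noise residuals
# and compare a query image's residual against it by normalized correlation.

def box_blur(img):
    """3x3 mean filter with clamped borders (stand-in for a real denoiser)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def residual(img):
    """Noise residual: the image minus its denoised version."""
    blurred = box_blur(img)
    return [[p - b for p, b in zip(row, brow)] for row, brow in zip(img, blurred)]


def fingerprint(images):
    """Average the residuals of several images from the same device."""
    res = [residual(im) for im in images]
    h, w = len(res[0]), len(res[0][0])
    return [[sum(r[y][x] for r in res) / len(res) for x in range(w)] for y in range(h)]


def correlation(a, b):
    """Normalized (Pearson) correlation between two equally sized 2-D arrays."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = (sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb)) ** 0.5
    return num / den


def synthetic_shot(pattern, rng, level=100.0, noise=1.0):
    """Flat synthetic image: multiplicative sensor pattern plus Gaussian noise."""
    return [[level * (1.0 + k) + rng.gauss(0.0, noise) for k in row] for row in pattern]


rng = random.Random(0)
size = 16
pattern_a = [[rng.uniform(-0.05, 0.05) for _ in range(size)] for _ in range(size)]
pattern_b = [[rng.uniform(-0.05, 0.05) for _ in range(size)] for _ in range(size)]

fp_a = fingerprint([synthetic_shot(pattern_a, rng) for _ in range(10)])
score_same = correlation(residual(synthetic_shot(pattern_a, rng)), fp_a)
score_other = correlation(residual(synthetic_shot(pattern_b, rng)), fp_a)
```

In this toy setting the residual of an image from device A should correlate much more strongly with A's fingerprint than the residual of an image from device B. Real photographs contain scene content, compression, and in-camera processing, which is part of why the task is harder on modern multi-camera phones.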
In this paper we propose an optimized version of the PCA algorithm that uses genetic algorithms and the KNN technique, with applications to the classification of image classes. The algorithm is developed by optimizing the image projection stage, which relies on the eigenvectors that ensure the extraction of essential features from the data. In that stage, a genetic algorithm determines the optimal size of the projection space of the initial images, as well as the key parameter of the improved KNN algorithm, namely the number of nearest neighbors used in the testing stage. For the validation and testing of the developed algorithms, the classification of face image classes corresponding to several people is considered. The proposed algorithm comes in several versions, analyzed in the experimental study, that differ in the similarity measure used to determine the nearest neighbors in the KNN algorithm (Euclidean distance, Manhattan distance, Mahalanobis distance, Minkowski distance, or cosine) and in the crossover recombination method applied at the level of binary genes (a single crossover point or multiple crossover points). The experimental study attests good performance in terms of classification accuracy, supporting applications in recognizing people from facial images.
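The role of the similarity measure in the KNN step can be sketched as below. This is a minimal illustration, not the paper's implementation: it covers only the Euclidean, Manhattan, Minkowski, and cosine measures (Mahalanobis is omitted because it needs a covariance estimate), it assumes feature vectors have already been projected, and the toy data and function names are assumptions.

```python
import math
from collections import Counter

# Minimal KNN sketch with a pluggable distance function.

def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    return minkowski(a, b, 2)


def manhattan(a, b):
    return minkowski(a, b, 1)


def cosine_distance(a, b):
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train, query, k, distance):
    """Majority vote among the k training samples closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Toy 2-D features standing in for PCA-projected face images.
train = [
    ((1.0, 1.0), "A"), ((1.2, 0.9), "A"), ((0.9, 1.1), "A"),
    ((4.0, 4.2), "B"), ((4.1, 3.9), "B"), ((3.8, 4.0), "B"),
]
label_euclidean = knn_predict(train, (1.1, 1.0), 3, euclidean)
label_manhattan = knn_predict(train, (4.0, 4.0), 3, manhattan)
```

In the setting described above, the value of `k` and the projection dimension would be the quantities searched by the genetic algorithm, while the distance function selects which version of the classifier is evaluated.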
Maritime recycling plays a significant role in reusing components from spacecraft. However, current methods for detecting and recovering sunken spacecraft wrecks suffer from inaccurate detection and low efficiency. To improve the detection of rocket wrecks, this paper proposes a method based on underwater optical images that leverages their high-resolution features. To address the low accuracy and efficiency of rocket wreck detection, a lightweight deep neural network detection model is introduced that enhances accuracy while reducing the computational load. Finally, quantitative and qualitative analyses show that the proposed method achieves a detection accuracy of 96.0% with low computational power, which lays a solid foundation for spacecraft recycling.
ISBN: 9798331541460 (digital)
ISBN: 9798331541477 (print)
JPEG2000 is a popular still image compression standard with excellent compression performance. At present, the MQ arithmetic encoder is widely regarded as the most critical speed bottleneck of the JPEG2000 algorithm because of its serial execution mode. MQ encoding is serial in nature because of the strong dependence between adjacent Cx-D pairs, which severely limits its processing speed. To solve this problem, this paper proposes a multi-context MQ coding hardware architecture based on a partially parallel coding strategy. This architecture can process up to three Cx-D pairs in one clock cycle, which greatly improves the processing speed of the MQ arithmetic encoder compared with the existing architecture. In terms of overall performance, the encoder designed in this paper can process 2.64 Cx-D pairs per cycle on average, and 743.16 Cx-D pairs per second under TSMC 90 nm technology. The architecture occupies an area of only 208399.66 units while achieving high throughput. The proposed MQ encoder provides a useful reference for hardware implementation of high performance algorithmic encoders in JPEG2000 and other image compression standards. In addition, we comprehensively analyze the possible output scenarios of multi-byte code streams caused by multi-context processing, and propose a processing structure based on an additional pipeline, which greatly reduces the module fan-out and the number of FIFOs.