Neuronal reconstruction, a process that transforms image volumes into 3D geometries and skeletons of cells, bottlenecks the study of brain function, connectomics and pathology. Domain scientists need exact a...
ISBN (print): 9781728176055
This paper introduces a new distributed Adapt-then-Combine (ATC) diffusion algorithm for cooperative tracking of an unknown state vector that evolves on the unit hypersphere. The adapt step is implemented for a general nonlinear observation model and a dynamic state model defined on the hypersphere using a marginal particle filter (PF). The combine step in turn uses parallel transport to build Gaussian parametric approximations on a common tangent space to the spherical manifold. Performance results are compared to those of competing linear diffusion Extended Kalman Filters and non-cooperative PFs.
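The combine step described above relies on moving quantities between tangent spaces of the unit hypersphere. The following is a minimal sketch of the standard sphere geometry involved (logarithm map, exponential map, and parallel transport along a geodesic), not the authors' implementation. The function names and the simple weighted tangent-space average in `combine` are assumptions made for illustration; the paper's Gaussian parametric approximations and particle filter are not reproduced here.

```python
import numpy as np

def log_map(p, q):
    """Tangent vector at p that points toward q (p, q are unit vectors)."""
    c = np.clip(np.dot(p, q), -1.0, 1.0)
    theta = np.arccos(c)
    if theta < 1e-12:
        return np.zeros_like(p)
    v = q - c * p                      # component of q orthogonal to p
    return theta * v / np.linalg.norm(v)

def exp_map(p, v):
    """Point reached by following the geodesic from p in direction v."""
    n = np.linalg.norm(v)
    if n < 1e-12:
        return p.copy()
    return np.cos(n) * p + np.sin(n) * v / n

def parallel_transport(p, q, w):
    """Transport tangent vector w from T_p to T_q (undefined for q = -p)."""
    return w - (np.dot(q, w) / (1.0 + np.dot(p, q))) * (p + q)

def combine(ref, means, weights):
    """Hypothetical combine rule: weighted average in the tangent space at ref."""
    tangent = sum(w * log_map(ref, m) for w, m in zip(weights, means))
    return exp_map(ref, tangent)
```

Parallel transport preserves the norm of the transported vector and keeps it tangent at the destination point, which is what allows estimates from different nodes to be expressed on a common tangent space.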
ISBN (digital): 9798331533663
ISBN (print): 9798331533670
Extracting text from images is essential in image processing and computer vision, with applications in document digitization and automated text recognition. This paper reviews various text extraction techniques, categorized into thresholding, rough set and fuzzy set methods, clustering, edge detection, and machine learning. Thresholding techniques such as Gaussian, Otsu, adaptive, and double-edge methods are explored. Rough set and fuzzy set methods, which handle uncertainty in image data and improve text segmentation, are reviewed. Clustering techniques, such as K-means and density-based methods, are studied for their effectiveness in grouping pixels for text isolation. Edge detection techniques, including Sobel, Roberts, Canny, Morphological Component Analysis (MCA), and Laplacian, are examined for their role in enhancing text boundary identification. Machine learning approaches, such as Support Vector Machines (SVM), Contrastive Language-Image Pre-training (CLIP), Hidden Markov Models (HMM), and hybrid BiLSTM-CNN models, are analyzed for their ability to improve accuracy in noisy environments. This review compares these techniques, highlighting their strengths, challenges, and applications for text extraction.
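As an illustration of the thresholding family mentioned above, the following is a minimal NumPy sketch of Otsu's method, which selects the gray level that maximizes the between-class variance of the histogram. The function name and the assumption of an 8-bit grayscale input are choices made here for illustration and are not taken from the review.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t in [0, 255] maximizing between-class variance.

    Assumes `gray` is a uint8 array; pixels with value > t form one class.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # probability of class {0..t}
    mu = np.cumsum(prob * np.arange(256))      # cumulative first moment
    mu_total = mu[-1]
    denom = omega * (1.0 - omega)
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * omega - mu) ** 2 / denom
    sigma_b[denom == 0] = 0.0                  # one class is empty: no separation
    return int(np.argmax(sigma_b))
```

A text mask can then be obtained with `gray > otsu_threshold(gray)` (or `<=` for dark text on a light background).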
ISBN (digital): 9798331532598
ISBN (print): 9798331532604
With the rapid development of autonomous driving technology, LiDAR (Light Detection and Ranging) has gradually become a mainstream tool for vehicle positioning and navigation. LiDAR odometry relies on the processing and feature extraction of 3D point cloud data to achieve high-precision environmental perception and path planning. Traditional computational methods face significant bottlenecks when handling these data, particularly in terms of real-time processing and hardware acceleration. To address this, this paper proposes a hardware implementation scheme for 3D LiDAR odometry accelerated by FPGA, with a focus on spherical projection and distance image-based feature extraction ***, we designed and implemented a multi-core parallel FPGA hardware architecture, which includes two main modules: the multi-core parallel spherical projection hardware module and the normal vector feature extraction module. By optimizing the hardware architecture and leveraging the FPGA's parallel processing capabilities, we accelerated the computation process, significantly improving the data processing *** results show that, compared to traditional CPU processing methods, the FPGA-based acceleration scheme demonstrates significant acceleration in both spherical projection and feature extraction, with processing time greatly reduced. By comparing the processing times across different platforms, we validated the potential of FPGA in efficiently processing LiDAR data, providing an effective solution for hardware acceleration in 3D LiDAR data processing for autonomous driving *** findings of this paper offer a reference path for future FPGA-based hardware acceleration implementations in autonomous driving systems and open new directions for hardware optimization and applications in related fields.
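The spherical projection mentioned above maps each 3D point to a pixel of a range (distance) image using its azimuth and elevation angles. The following is a minimal CPU-side NumPy sketch of that common projection, not the paper's FPGA design. The image size and vertical field of view are assumed values chosen for illustration, and the function name is hypothetical.

```python
import numpy as np

def spherical_projection(points, height=64, width=1024,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud to a (height, width) range image.

    Empty pixels are 0; when several points fall in one pixel the nearest wins.
    All default parameters are illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    r = np.linalg.norm(points, axis=1)
    keep = r > 0                                   # drop points at the origin
    points, r = points[keep], r[keep]
    yaw = np.arctan2(points[:, 1], points[:, 0])   # azimuth in [-pi, pi]
    pitch = np.arcsin(points[:, 2] / r)            # elevation
    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    col = 0.5 * (1.0 - yaw / np.pi) * width
    row = (1.0 - (pitch - fov_down) / (fov_up - fov_down)) * height
    col = np.clip(np.floor(col), 0, width - 1).astype(int)
    row = np.clip(np.floor(row), 0, height - 1).astype(int)
    image = np.full((height, width), np.inf)
    np.minimum.at(image, (row, col), r)            # keep the closest return
    image[np.isinf(image)] = 0.0
    return image
```

Because each point is processed independently, the per-point angle and index computation is the part that lends itself to parallel hardware.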
ISBN (digital): 9798331529192
ISBN (print): 9798331529208
Deep learning-based multi-exposure image fusion (MEF) methods have demonstrated robust performance. However, these methods require considerable computational resources and energy, which greatly limits their practical deployment. To address this issue, we propose a lightweight, dual-stage MEF method, termed LDMEF. Deployed on a field-programmable gate array (FPGA), the method gains a wider range of applications and greater flexibility. Specifically, in the initial stage, LDMEF preprocesses the input sequences by leveraging the parallel processing capabilities of the FPGA to compute a preliminary image through pixel-wise addition and averaging, ensuring both simple and rapid execution. Subsequently, in the second stage, our proposed method applies depthwise separable convolution to the preliminary image, yielding a lightweight network that is both straightforward to deploy and simple in design. This network fine-tunes the preliminary image at the pixel level, achieving high-quality fusion results. Extensive evaluations on publicly available datasets confirm that LDMEF not only achieves remarkable results but also outperforms many GPU-based learning MEF methods.
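The two stages described above can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the LDMEF network: `preliminary_image` shows pixel-wise averaging of an exposure stack, and `depthwise_separable_conv` shows a single depthwise separable layer (stride 1, no padding, cross-correlation as in common deep learning frameworks). Function names, shapes, and weights are hypothetical.

```python
import numpy as np

def preliminary_image(exposures):
    """Stage-1 sketch: pixel-wise sum and average of same-sized exposures."""
    stack = np.stack([np.asarray(e, dtype=np.float32) for e in exposures], axis=0)
    return stack.mean(axis=0)

def depthwise_separable_conv(x, depthwise, pointwise):
    """One depthwise separable layer.

    x:         (C, H, W) input feature map
    depthwise: (C, k, k) one spatial filter per input channel
    pointwise: (C_out, C) 1x1 convolution mixing channels
    """
    c, h, w = x.shape
    k = depthwise.shape[1]
    out_h, out_w = h - k + 1, w - k + 1
    dw = np.zeros((c, out_h, out_w), dtype=np.float32)
    for i in range(k):
        for j in range(k):
            # Each channel is filtered only by its own k x k kernel.
            dw += depthwise[:, i, j][:, None, None] * x[:, i:i + out_h, j:j + out_w]
    # The 1x1 convolution is a per-pixel matrix product over channels.
    return np.tensordot(pointwise, dw, axes=([1], [0]))

def parameter_counts(c_in, c_out, k):
    """Weights in a standard conv layer vs. a depthwise separable one."""
    return k * k * c_in * c_out, k * k * c_in + c_in * c_out
```

For example, with 32 input channels, 64 output channels, and 3x3 kernels, `parameter_counts(32, 64, 3)` gives 18432 weights for a standard convolution against 2336 for the separable form, which is the source of the lightweight design.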
ISBN (digital): 9798350351927
ISBN (print): 9798350351934
In this manuscript, we investigate the dynamics of Memristor Cellular Nonlinear Networks, focusing on the complex behaviors of 2-terminal locally active volatile threshold switches, known as volatile memristors, using a circuit-theoretic approach. We show that a cell within the array equipped with an NbO2-based volatile threshold switch can exhibit both oscillatory and static dynamics depending on the parameters of the Memristor Cellular Nonlinear Network. We utilize the latter to design an M-CNN for image processing that performs edge detection on binary input images.
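To give a sense of how a cellular nonlinear network can perform edge detection on a binary image, here is a minimal sketch of a classical (non-memristive) Chua-Yang style network integrated with forward Euler. It does not model the NbO2 threshold switch or the M-CNN from the paper. The template values (`A_CENTER`, `B`, `Z`), the white boundary condition, the zero initial state, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

# Assumed templates: self-feedback, 3x3 input (control) template, and bias.
A_CENTER = 1.0
B = np.array([[-1.0, -1.0, -1.0],
              [-1.0,  8.0, -1.0],
              [-1.0, -1.0, -1.0]])
Z = -1.0

def cnn_edge(u, steps=200, dt=0.05):
    """Edge map of a binary image u with +1 = black and -1 = white.

    Each cell follows dx/dt = -x + A_CENTER * y + (B * u) + Z with
    y = clip(x, -1, 1); the output is the sign of the settled state.
    """
    u = np.asarray(u, dtype=float)
    h, w = u.shape
    padded = np.pad(u, 1, constant_values=-1.0)    # assume white outside the image
    drive = np.full((h, w), Z)
    for i in range(3):
        for j in range(3):
            drive += B[i, j] * padded[i:i + h, j:j + w]
    x = np.zeros((h, w))                           # assumed initial state
    for _ in range(steps):
        y = np.clip(x, -1.0, 1.0)                  # piecewise-linear output
        x += dt * (-x + A_CENTER * y + drive)
    return np.where(np.clip(x, -1.0, 1.0) > 0, 1, -1)
```

With these assumed templates, a black pixel settles to black only if at least one of its eight neighbors is white, so the interior of a filled region is erased and only its boundary remains.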
With serverless computing offering more efficient and cost-effective application deployment, the diversity of serverless platforms presents challenges to users, including platform lock-in and costly migration. Moreove...
ISBN (digital): 9789819947492
ISBN (print): 9789819947485; 9789819947492
Three-dimensional (3D) reconstruction in cryo-electron tomography (cryo-ET) plays an important role in studying in situ biological macromolecular structures at the nanometer level. Owing to the limited tilt angle, 3D reconstruction in cryo-ET always suffers from a "missing wedge" problem, which causes severe accuracy degradation. Multi-tilt reconstruction is an effective method to reduce artifacts and suppress the effect of the missing wedge. As the number of tilt series increases, the large data size causes high computation cost and huge memory overhead. Limited by memory, multi-tilt reconstruction cannot be performed in parallel on GPUs, especially when the image size reaches 1 K, 2 K, or even larger. To optimize large-scale multi-tilt reconstruction in cryo-ET, we propose a new GPU-based large-scale multi-tilt tomographic reconstruction algorithm (GM-SIRT). Furthermore, we design a two-level data partition strategy in GM-SIRT to greatly reduce the memory required during the whole reconstruction process. Experimental results show that the performance of the GM-SIRT algorithm is significantly improved compared with DM-SIRT, the distributed multi-tilt reconstruction algorithm on a CPU cluster. The acceleration ratio is over 300%, and the memory requirement decreases to only one-third of that of DM-SIRT when the image size reaches 2 K.
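The abstract does not spell out the two-level partition strategy, so the following is only a generic sketch of the underlying idea: a SIRT update can be accumulated block by block (for example, one block per tilt series), so that only one block of the projection system needs to be processed at a time. The function name, the dense toy matrices, and the assumption of non-negative weights with positive row and column sums are all illustrative and are not taken from GM-SIRT.

```python
import numpy as np

def sirt_blocked(blocks, n_unknowns, iterations=200):
    """SIRT over a list of (A_k, b_k) blocks, e.g. one block per tilt series.

    Equivalent to SIRT on the vertically stacked system, but each iteration
    touches one block at a time. Assumes non-negative A_k with positive
    row sums and positive overall column sums.
    """
    col_sums = np.zeros(n_unknowns)
    for A_k, _ in blocks:
        col_sums += A_k.sum(axis=0)
    x = np.zeros(n_unknowns)
    for _ in range(iterations):
        update = np.zeros(n_unknowns)
        for A_k, b_k in blocks:
            residual = (b_k - A_k @ x) / A_k.sum(axis=1)   # row-normalized residual
            update += A_k.T @ residual                      # back-projection
        x += update / col_sums                              # column normalization
    return x
```

In a realistic setting the blocks would be sparse projection operators evaluated on the fly rather than stored matrices; the dense form here only keeps the example self-contained.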
ISBN (print): 9781713871088
Inverse halftoning is a technique used to recover realistic images from ancient prints (e.g., photographs, newspapers, books). The rise of deep learning has led to the gradual incorporation of neural network designs into inverse halftoning methods. Most existing inverse halftoning approaches adopt the U-net architecture, which uses an encoder to encode halftone prints, followed by a decoder for image reconstruction. However, the mainstream supervised learning paradigm with element-wise regression commonly adopted in U-net based methods generalizes poorly in practical applications. Specifically, when there is a large gap between the dithering patterns of the training and testing halftones, the reconstructed continuous-tone images have obvious artifacts. This is an important issue in practice, since the algorithms for generating halftones are ever-evolving. Even for the same algorithm, different parameter choices will result in different halftone dithering patterns. In this paper, we propose the first generative halftoning method in the literature, which regards the black pixels in halftones as physically moving particles and makes the randomly distributed particles move under certain guidance through a reverse diffusion process, so as to obtain the desired halftone patterns. In particular, we propose a Conditional Diffusion model for image Halftoning (CDH), which consists of a halftone dithering process and an inverse halftoning process. By changing the initial state of the diffusion model, our method can generate visually plausible halftones with different dithering patterns under the condition of image gray level and Laplacian prior. To avoid introducing redundant patterns and undesired artifacts, we propose a meta-halftone guided network to incorporate blue noise guidance in the diffusion process. In this way, halftone images subject to more diverse distributions are fed into the inverse halftoning model, which helps the model to lear
ISBN (print): 9783031198052; 9783031198069
This paper addresses the very challenging problem of online task-free continual learning, in which a sequence of new tasks is learned from non-stationary data using each sample only once for training and without knowledge of task boundaries. We propose an efficient semi-distributed associative memory algorithm called Dynamic Sparse distributed Memory (DSDM), in which learning and evaluation can be carried out at any point in time. DSDM evolves dynamically, continually modeling the distribution of any non-stationary data stream. DSDM relies on locally distributed but only partially overlapping clusters of representations to effectively eliminate catastrophic forgetting while maintaining the generalization capacities of distributed networks. In addition, a local density-based pruning technique is used to control the network's memory footprint. DSDM significantly outperforms state-of-the-art continual learning methods on different image classification baselines, even in a low data regime. Code is publicly available: https://***/Julien-pour/Dynamic-Sparse-distributed-Memory.