Short-lag spatial coherence (SLSC) beamforming has the potential to improve the diagnostic power of a multitude of ultrasound imaging techniques. One challenge for advanced real-time implementation is the repeated correlation calculations it requires. To address this challenge, this paper introduces CohereNet, a novel deep neural network architecture that estimates the coherence function in an effort to bypass the repeated correlation calculations required for SLSC imaging. The network was trained and evaluated using in vivo breast data, demonstrating similar contrast, CNR, SNR, and GCNR, with an average correlation of 0.93 between the original image and the DNN image, and improved computational speed (a factor of 3.4) compared to the offline implementations. In addition, the model generalizes across multiple tissue types, probe geometries, and ultrasound systems. These results are promising for the use of deep learning architectures as a replacement for correlation estimation in multiple areas of coherence-based ultrasound imaging.
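To make concrete the "repeated correlation calculations" that the abstract refers to, the following is a minimal sketch of the conventional SLSC value for a single pixel: the normalized correlation between element signals separated by lag m is averaged over all element pairs, and the result is summed over the short lags 1..M. It is not the CohereNet network, and the function name `slsc_pixel`, the array layout, and the zero-energy handling are assumptions made for illustration.

```python
import numpy as np


def slsc_pixel(signals, max_lag):
    """Conventional SLSC value for one pixel (illustrative sketch).

    signals: array of shape (n_elements, n_samples) holding the delayed
             channel data inside the axial kernel around the pixel.
    max_lag: M, the largest element lag included in the sum.
    """
    signals = np.asarray(signals, dtype=float)
    energy = np.sum(signals ** 2, axis=1)  # per-element energy over the kernel
    total = 0.0
    for m in range(1, max_lag + 1):
        a, b = signals[:-m], signals[m:]  # all element pairs separated by m
        num = np.sum(a * b, axis=1)
        den = np.sqrt(energy[:-m] * energy[m:])
        # Assumption: a pair with a zero-energy channel contributes zero coherence.
        coherence = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
        total += coherence.mean()  # average over the N - m pairs at this lag
    return total


if __name__ == "__main__":
    # Perfectly coherent data: every element sees the same waveform.
    wave = np.sin(np.linspace(0.0, 1.0, 16) + 0.1)
    coherent = np.tile(wave, (8, 1))
    print(slsc_pixel(coherent, max_lag=3))
```

The loop over lags and element pairs has to be repeated for every pixel, which is the cost a learned coherence estimator would aim to avoid.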
ISBN (digital): 9783319757865
ISBN (print): 9783319757865; 9783319757858
Human detection is an important task for several practical applications that require high-speed processing with good detection accuracy. This paper proposes a high-speed implementation of Informed-Filters that shows excellent accuracy in human detection. Our implementation reduces memory access during feature calculation and realizes efficient computation on an NVIDIA GPU, where one thread is allocated to each detection sub-window. Experimental results using top-view images, simulating surveillance from UAVs, showed that the processing speed was about 100 fps for 2560x1352 images on an NVIDIA 980Ti GPU, whereas it was 5.4 fps on an Intel Xeon 2.30GHz CPU.
ISBN (print): 9781450358675
Visual Question Answering (VQA) is a task that connects the fields of Computer Vision and Natural Language Processing. Taking as input an image I and a natural language question Q about I, a VQA model must be able to produce a coherent answer R (also in natural language) to Q. A particular type of visual question is the binary question (i.e., a question whose answer belongs to the set {yes, no}). Currently, deep neural networks are the state-of-the-art technique for training VQA models. Despite their success, applying neural networks to the VQA task requires a very large amount of data in order to produce models with adequate precision. Datasets currently used for training VQA models are the result of laborious manual labeling by humans. This context motivates the study of approaches to augment these datasets in order to train more accurate prediction models. This paper describes a crowdsourcing tool that can be used collaboratively to augment an existing VQA dataset for binary questions. Our tool actively integrates candidate items from an external data source in order to optimize the selection of queries to be presented to curators.
ISBN (digital): 9781728142685
ISBN (print): 9781728142692
Multiprocessor systems-on-chip (MPSoCs) have been successfully used to speed up the processing of parallel, computing-intensive applications. Many domains could benefit from such processing power, including robotics, in which several tasks, from image processing to odometry, may compete for resources at the same time as they cooperate to achieve the system's goals. This paper presents preliminary results on the integration of an MPSoC into a robotics environment through co-simulation. We developed wrappers that permit two simulators to communicate via UDP networks; while one simulator carries out the environmental physics, the other simulates the underlying MPSoC platform. We provide a proof of concept based on a synthetic random-walk application, along with a performance analysis of the environment. With this environment, we move towards the evaluation of non-functional requirements of robotics applications, such as energy consumption.
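As a rough illustration of how two simulators might exchange data over UDP, the sketch below runs one request/reply step between two loopback sockets in a single process. The abstract does not describe the wrappers' actual message format or protocol, so the JSON payload, the field names (`step`, `sensor`, `velocity`), and all function names here are hypothetical.

```python
import json
import socket


def make_endpoint():
    """Create a UDP socket bound to an ephemeral port on the loopback interface."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(("127.0.0.1", 0))
    sock.settimeout(2.0)  # fail fast instead of blocking forever on a lost datagram
    return sock


def send_message(sock, address, payload):
    """Serialize a dict as JSON and send it as a single datagram."""
    sock.sendto(json.dumps(payload).encode("utf-8"), address)


def receive_message(sock):
    """Receive one datagram and return (decoded dict, sender address)."""
    data, sender = sock.recvfrom(4096)
    return json.loads(data.decode("utf-8")), sender


def exchange_one_step():
    """One co-simulation step: physics side sends sensors, platform side replies."""
    physics = make_endpoint()   # stands in for the environment/physics simulator
    platform = make_endpoint()  # stands in for the MPSoC platform simulator
    try:
        send_message(physics, platform.getsockname(),
                     {"step": 1, "sensor": [0.5, 0.25]})
        reading, sender = receive_message(platform)
        # Hypothetical control rule, used only to produce a reply.
        command = {"step": reading["step"], "velocity": sum(reading["sensor"])}
        send_message(platform, sender, command)
        reply, _ = receive_message(physics)
        return reply
    finally:
        physics.close()
        platform.close()


if __name__ == "__main__":
    print(exchange_one_step())
```

UDP gives no delivery or ordering guarantees, so a real co-simulation wrapper would likely need step counters or acknowledgements to keep the two simulators synchronized; the `step` field above only hints at that.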
ISBN (digital): 9781728142685
ISBN (print): 9781728142692
Recent advances in deep learning point towards the use of computer vision systems based on Deep Neural Networks (DNNs). However, these network architectures are optimized to be executed on specialized hardware, such as computers with Graphics Processing Units (GPUs). Such hardware is rarely available in embedded computers, for instance those used by mobile robots, so alternatives must be studied to ensure that mobile systems can still benefit from applications of deep learning. In this work, we investigate the performance of a vision system for ball detection, based on different configurations of the MobileNet Convolutional Neural Network architecture, under a constrained hardware scenario. By gradually reducing the input size and the number of parameters of the neural network, and comparing the resulting inference times on an Intel NUC Core i7 mini-PC embedded in a humanoid soccer robot, we have found acceptable values for the width and resolution multipliers to be used in our soccer ball detection system during a robot-soccer match.
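The effect of the two multipliers can be sketched with the standard MobileNet cost model for one depthwise separable layer: the width multiplier α scales the input and output channel counts, and the resolution multiplier ρ scales the feature-map side length. The layer dimensions below are illustrative values chosen for this sketch, not configurations reported in the paper, and the function name is hypothetical.

```python
def separable_layer_mult_adds(kernel, in_channels, out_channels, feature_size,
                              alpha=1.0, rho=1.0):
    """Multiply-adds of one depthwise separable layer under MobileNet's cost model.

    alpha: width multiplier (scales the channel counts).
    rho:   resolution multiplier (scales the feature-map side length).
    """
    m = int(round(alpha * in_channels))
    n = int(round(alpha * out_channels))
    f = int(round(rho * feature_size))
    depthwise = kernel * kernel * m * f * f  # one k x k filter per input channel
    pointwise = m * n * f * f                # 1 x 1 convolution mixing channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Illustrative layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
    full = separable_layer_mult_adds(3, 512, 512, 14)
    reduced = separable_layer_mult_adds(3, 512, 512, 14, alpha=0.5, rho=0.5)
    print(full, reduced, full / reduced)
```

Because the pointwise term dominates and scales with α²ρ², halving both multipliers cuts the cost of this layer by roughly a factor of 16, which is why small changes to either multiplier have a large effect on inference time on CPU-only hardware.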
In this paper, we describe two general parametric, non-symmetric 3×3 gradient models. Equations for calculating the coefficients of the gradient matrices are presented. These models for generating gradients in x-d...
ISBN (print): 9781538647721
In discrete and digital geometry, rotations composed with translations have been measured and examined carefully on the square and hexagonal grids. Translation has never been considered on its own because it obviously leads to an isometric translation on these grids. However, the triangular grid is not a point lattice, so it is worth considering translation by itself. In this article, translations on the triangular grid are investigated, and the vectors of bijective and non-bijective translations are specified.
Coherent rendering in augmented reality deals with synthesizing virtual content that seamlessly blends in with the real content. Unfortunately, capturing or modeling every real aspect in the virtual rendering process is often unfeasible or too expensive. We present a post-processing method that improves the look of rendered overlays in a dental virtual try-on application. We combine the original frame and the default rendered frame in an autoencoder neural network in order to obtain a more natural output, inspired by artistic style transfer research. Specifically, we apply the original frame as style on the rendered frame as content, repeating the process with each new pair of frames. Our method requires only a single forward pass, our shallow architecture ensures fast execution, and our internal feedback loop inherently enforces temporal consistency.
ISBN (print): 9781538677698
Recent years have seen a surge in the popularity of Field-Programmable Gate Arrays (FPGAs). Programmers can use them to develop high-performance systems that are efficient not only in time but also in energy. Yet programming FPGAs remains a difficult task. Even though OpenCL interfaces exist today to synthesize such hardware, higher-level programming languages, such as Java, C#, or Python, remain distant from them. In this paper, we describe a compiler, and its supporting runtime environment, that reduces this distance by translating functional code written in Java to the Intel HARP platform. We thus make two contributions. The first is the insight that a functional-style library is a good starting point to bridge the gap between high-level programming idioms and FPGAs. The second is the implementation of the system itself, including the compiler, its intermediate representation, and all the runtime support necessary to shield developers from the task of transferring data back and forth between the host CPU and the accelerator. To demonstrate the effectiveness of our system, we have used it to implement different benchmarks used in image processing and data mining. For large inputs, we observe consistent 20x speedups over the Java Virtual Machine across all our benchmarks. Depending on the target function that we compile, this speedup can reach 280x.