ISBN: (print) 9781538675250
This paper proposes a framework for real-time tracking of objects in real scenes, using computer vision techniques implemented in a mobile application for news reporting. The automated reporting system is intended to complement news reporters covering real-time events, especially in dangerous environments such as disasters, fires, explosions or war zones. The system combines several functions and algorithms that convert scene images into paragraphs of text and voice output describing the scenario in real time. Functions such as object motion detection, scene recognition, emotion recognition and distance detection are included so that the generated text and voice are more accurate and closer in meaning to real-time human reporting. Because the proposed system can support news reporting in turbulent situations, it can reduce the burden on reporters under such dangerous circumstances. This work briefly reviews the basic concepts of image processing and computer vision that serve as the components of the proposed real-time automated reporting system and describes how these applications are coupled and work together. It also outlines the design choices made for the effectiveness and efficiency of such reporting systems.
Advances in modern medical imaging technologies such as X-ray, Computed Tomography (CT), Ultrasound (US) imaging, Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET) and Single Photon Emission Comput...
ISBN: (print) 9789897583063
Many modern mobile applications, such as Snapchat, beauty filters and camera auto-focusing systems, incorporate face detection and landmarking, implementing regression-based machine learning algorithms for accurate face landmark detection that allows facial appearance to be manipulated. Mobile applications that incorporate machine learning have to overcome issues such as lighting, occlusion, camera quality and false detections. A solution could be provided by the resurgence of deep learning with neural networks, which are showing significant improvements in accuracy and reliability compared with state-of-the-art machine learning. Here, we demonstrate the process of using trained networks on mobile devices and review its effectiveness. We also compare the effects of employing max-pooling layers as an efficient method of reducing the required processing power. We compared networks with three different numbers of max-pooling layers and ported one to the mobile device; the other two could not be ported due to memory restrictions. We will be releasing all code to build, train and use the model in a mobile application. The results show that, despite the limited processing capability of mobile devices, neural networks can be used for difficult challenges while still working in real time. We show a network running on a mobile device on a live data stream and give a recommendation on the structure of the network.
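To illustrate why max-pooling lowers the processing cost mentioned in this abstract, the following is a minimal NumPy sketch of 2x2 max-pooling, not the authors' network. The function names, the 2x2 window and the 64x64 input size are assumptions chosen for illustration; the abstract does not state the layer configuration.

```python
import numpy as np


def max_pool_2x2(feature_map: np.ndarray) -> np.ndarray:
    """Downsample a 2-D feature map with non-overlapping 2x2 max-pooling."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Drop a trailing row/column if the size is odd so the map tiles evenly.
    trimmed = feature_map[: h2 * 2, : w2 * 2]
    # Give each 2x2 window its own pair of axes, then reduce over them.
    windows = trimmed.reshape(h2, 2, w2, 2)
    return windows.max(axis=(1, 3))


def activations_after_pooling(height: int, width: int, num_pool_layers: int) -> int:
    """Count the activations left per channel after repeated 2x2 pooling."""
    for _ in range(num_pool_layers):
        height //= 2
        width //= 2
    return height * width


if __name__ == "__main__":
    demo = np.arange(16).reshape(4, 4)
    print(max_pool_2x2(demo))
    # Hypothetical 64x64 input: each extra pooling layer quarters the map size.
    for layers in range(4):
        print(layers, activations_after_pooling(64, 64, layers))
```

Each pooling layer quarters the number of activations that later layers must process, which is the mechanism by which pooling trades spatial resolution for lower compute.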
This paper describes a solution to the problem of visualizing changes in the graph model of the internal representation of programs, in order to reflect the processes that happen during calculation of programs or processing w...
In this paper, the problem of moving target indication (MTI) using synthetic aperture radar (SAR) is considered. The focus of the article is the tangential component of velocity. Two tangential-velocity MTI algorithms are considered. The first uses two apertures with different radar image synthesis times (AVST algorithm), and the second uses two apertures displaced along the trajectory (ADAT algorithm). The structure of an MTI system based on the analysis of phase and amplitude radar images is considered. For S-band and X-band SAR, the phase change in the trajectory signal of a moving target and the effects of shift and bifurcation of target responses on the radar image are analyzed in detail. It was found that the AVST algorithm has a small working range of unambiguous velocity estimation (up to ±10 m/s). It is shown that the ADAT algorithm performs better over a wide velocity range and can effectively suppress the signals of stationary objects by 20 to 30 dB. The obtained characteristics make it possible to formulate requirements for the parameters of space-borne Earth remote sensing systems and processing systems.
ISBN: (print) 9781538632277
The monitoring of patients within a natural, home environment is important in order to close knowledge gaps in the treatment and care of neurodegenerative diseases, such as quantifying the daily fluctuation of Parkinson's patients' symptoms. The combination of machine learning algorithms and wearable sensors for gait analysis is becoming capable of achieving this. However, these algorithms require large, labelled, realistic datasets for training. Most systems used as a ground truth for labelling are restricted to the laboratory environment, as well as being large and expensive. We propose a study design for a realistic activity monitoring dataset, collected with inertial measurement units, pressure insoles and cameras. It is not restricted by a fixed location or capture volume and still enables the labelling of gait phases or, where non-gait movements such as jumping occur, of on-the-ground and off-the-ground phases. Additionally, this paper proposes a smart annotation tool which reduces annotation cost by more than 80%. This smart annotation is based on edge detection within the pressure sensor signal. The tool also enables annotators to perform assisted correction of these labels in a post-processing step. This system enables the collection and labelling of large, fairly realistic datasets where 93% of the automatically generated labels are correct and only an additional 10% need to be inserted manually. Our tool and protocol, as a whole, will be useful for efficiently collecting the large datasets needed for training and validation of algorithms capable of cyclic human motion analysis in natural environments.
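The abstract states only that the smart annotation is based on edge detection within the pressure sensor signal. As a hedged illustration of that idea, the sketch below thresholds a one-dimensional pressure trace and marks rising and falling edges as touch-down and lift-off events. The function name `label_ground_contact`, the fixed threshold and the sample values are assumptions, not the authors' tool.

```python
import numpy as np


def label_ground_contact(pressure, threshold):
    """Label on-the-ground samples and locate contact edges.

    Returns (on_ground, touch_down, lift_off), where on_ground is a boolean
    mask and the other two are sample indices of rising and falling edges.
    """
    pressure = np.asarray(pressure, dtype=float)
    # Assumption: the foot is on the ground whenever pressure exceeds a threshold.
    on_ground = pressure > threshold
    # The difference of the 0/1 mask is +1 at a rising edge, -1 at a falling edge.
    edges = np.diff(on_ground.astype(int))
    touch_down = np.flatnonzero(edges == 1) + 1  # first on-ground sample
    lift_off = np.flatnonzero(edges == -1) + 1   # first off-ground sample
    return on_ground, touch_down, lift_off


if __name__ == "__main__":
    # Hypothetical insole readings in arbitrary units.
    trace = [0, 0, 5, 6, 5, 0, 0, 4, 5, 0]
    mask, down, up = label_ground_contact(trace, threshold=1.0)
    print(mask.astype(int), down, up)
```

Real insole data is noisy, so a practical version would likely need smoothing or hysteresis before thresholding. The abstract's assisted manual correction step suggests automatic labels of this kind are not expected to be perfect.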
In Bayesian theory, the maximum a posteriori estimator uses prior information to estimate the noise in a machine learning model by adding a regularization term. The regularization terms L1 and L2 correspond to the Laplacian prior and the Gaussian prior, respectively. In existing deep learning models, in order to use the gradient descent optimization algorithm and achieve good results, most models take L2 regularization as the regularization term of the network model to fit complex Gaussian noise. In practice, however, Laplace noise and Gaussian noise are both considered data noise. For multi-layer perceptrons, the difficulty caused by adding both L1 and L2 terms to the optimization function of the network is addressed by proposing an ensemble model for error modeling that adopts a divide-and-conquer strategy. First, several base learners are trained to fit different noise distributions of the data; the final results are then obtained by taking the outputs of each base learner as new data to train a meta learner. A coordinate regression method is used to solve the L1 loss, while the pseudo-inverse learning algorithm is employed to solve the L2 loss. Both methods are non-gradient optimization algorithms. Comparison results on several data sets show that the proposed ensemble model achieves better performance.
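The correspondence between the regularization terms and the priors can be made explicit with the standard maximum a posteriori derivation. The symbols below (parameters $w$, data $D$, Gaussian variance $\sigma^2$, Laplacian scale $b$) are notation assumed here for illustration and do not appear in the abstract.

```latex
\begin{align}
\hat{w}_{\mathrm{MAP}} &= \arg\max_{w}\; p(w \mid D)
  = \arg\min_{w}\; \bigl[ -\log p(D \mid w) - \log p(w) \bigr], \\
p(w) \propto \exp\!\Bigl(-\tfrac{\lVert w \rVert_2^2}{2\sigma^2}\Bigr)
  &\;\Rightarrow\; -\log p(w) = \tfrac{1}{2\sigma^2}\lVert w \rVert_2^2 + \text{const}
  \quad (\text{L2 term}), \\
p(w) \propto \exp\!\Bigl(-\tfrac{\lVert w \rVert_1}{b}\Bigr)
  &\;\Rightarrow\; -\log p(w) = \tfrac{1}{b}\lVert w \rVert_1 + \text{const}
  \quad (\text{L1 term}).
\end{align}
```

The same argument applied to the likelihood instead of the prior gives the loss functions: Gaussian noise on the targets yields a squared (L2) loss, and Laplace noise yields an absolute (L1) loss.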
ISBN: (print) 9781538684948
The use of a sparse crystal setting would reduce the cost of the PET scanner and has advantages such as less RF shielding in PET/MR. It also allows a longer axial field of view (FOV) using the same crystal volume. In this paper, the sensitivities of the coincidence events of PET systems with the sparse crystal configuration, the thin crystal setting, and the conventional design using a fixed total crystal volume were analytically estimated. The sinograms of a sparse system (with 50% of crystals removed and a fixed axial FOV) were simulated using patient data. Reconstruction algorithms were developed by modeling the effects of reduced crystals in the system matrix. A convolutional neural network (CNN) based noise reduction approach was used for post-processing. Data from a total of 14 patients were included and were truncated to 3-minute scans for consistency. Leave-one-out cross-validation was used for evaluation. Patch-based data input/output was used for model training to increase the number of training samples. Images reconstructed using OSEM followed by Gaussian denoising were also used for comparison. The percentage summed square difference (SSD) between images of the sparse crystal configuration and of non-sparse systems was used for quantitative evaluation. When using the same total volume of crystals, the difference in sensitivity at the center of the FOV was within 10% among the three settings, with the rank from highest to lowest being the thin detector, sparse detector, and conventional detector. When using the same axial FOV, reconstructed images of the sparse crystal configuration showed increased noise due to reduced sensitivity. The percentage SSD for images processed with the Gaussian filter was 30% on average and was reduced to 16% on average with the CNN. The results show that, with the same amount of crystal, the use of the sparse crystal configuration provides a slightly larger sensitivity and a much larger axial FOV. CNN-processed images were able to partially recover los
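The abstract does not give the exact formula for the percentage summed square difference. One plausible reading, assumed here, normalizes the summed squared voxel differences by the summed squared intensity of the non-sparse reference image. The function name and the toy values are hypothetical.

```python
import numpy as np


def percentage_ssd(test_image, reference_image):
    """Percentage summed square difference between two images.

    Assumed definition: 100 * sum((test - ref)^2) / sum(ref^2).
    """
    test = np.asarray(test_image, dtype=float)
    ref = np.asarray(reference_image, dtype=float)
    if test.shape != ref.shape:
        raise ValueError("images must have the same shape")
    denom = np.sum(ref ** 2)
    if denom == 0:
        raise ValueError("reference image is all zeros")
    return 100.0 * np.sum((test - ref) ** 2) / denom


if __name__ == "__main__":
    reference = np.array([[3.0, 4.0]])  # stands in for the non-sparse image
    sparse = np.array([[3.0, 1.0]])     # stands in for the sparse-system image
    print(percentage_ssd(sparse, reference))
```

Under this definition a value of 0 means the images are identical, and lower values indicate closer agreement with the reference, which matches the direction of the reported improvement from 30% to 16%.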
Road accidents are currently increasing for a variety of reasons and cause a large number of human deaths. There are different reasons which lead to road accidents, but driver fatigue or distraction is ...
Breast cancer accounts for 16% of all cancers among females. Current early detection methods are expensive or computationally complex and thus unsuitable for developing countries. For this reason, a real-time fully au...