ISBN (digital): 9789819916450
ISBN (print): 9789819916443; 9789819916450
The human capability to feel the musical downbeat and perceive the beats within a unit of time is intuitive and insensitive to the varying tempo of music audio signals. Yet this mechanism is not straightforward for automated systems and requires further scientific description. As automatic music analysis is a crucial step for music structure discovery and music recommendation, downbeat tracking and varying tempo induction have been persistent challenges in the music information retrieval field. This paper introduces an architecture based on bidirectional long short-term memory (LSTM) recurrent neural networks to distinguish downbeat instants, supported by a dynamic Bayesian network that jointly infers the tempo and corrects the estimated downbeat locations according to the optimal solution. The proposed system outperforms existing algorithms, achieving an F1 score of 88.0%.
Document image classification has gained extensive attention due to the rising number and types of scanned documents. Multi-modal architectures, processing image and text simultaneously, leverage the strengths of each...
In a great number of applications, the goal is to infer an unknown image from a small number of noisy measurements collected from a known and possibly nonlinear forward model describing certain sensing or imaging moda...
Biometric authentication technologies are now being actively deployed; users can usually authenticate themselves by their face, voice, fingerprint, finger vascular pattern, palm, or iris. Using images...
ISBN (print): 9798350373981; 9798350373974
In recent years, Deep Neural Network (DNN) approaches have outperformed traditional techniques for several computer vision problems. This has been made possible by the increase in computational resources, represented by Graphics Processing Units (GPUs) that allow training on large datasets, and by the availability of deep learning accelerators for inference. At the same time, the attitude determination accuracy requirements for spacecraft are increasing. The most accurate attitude determination sensor for spacecraft is the so-called star sensor or star tracker. With the increase in low-cost satellite platforms such as CubeSats, research into improving star sensor accuracy for low-power and low-cost sensor architectures remains a relevant subject. In this context, we examine several methods for noise reduction and star detection to improve centroiding performance. More specifically, an efficient and robust denoising method for star images using an Auto-Encoder (AE) is proposed. This method enhances the image quality for systems sensitive to noise. Furthermore, an accurate and lightweight algorithm based on an existing YOLO (You Only Look Once) architecture is proposed to detect the location of stars in the image. In this work, the YOLO bounding boxes are used to describe the region of space around each star. Subsequently, the star centroid within the bounding box is computed using the COG (Center Of Gravity) method. This removes the need for centroiding algorithms that slide over the entire image area. An extensive comparison of the proposed denoising technique with other traditional filters confirms that the proposed method is robust to all noise models and reconstructs the corrupted images well. Experiments show that the proposed YOLO-based star detector achieves high accuracy with a lightweight architecture without any extra latency.
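The centroiding step described above can be illustrated with a minimal sketch. It assumes the detector returns a bounding box as half-open pixel bounds `(x0, y0, x1, y1)` and that the image is a 2-D NumPy array of non-negative intensities; the function name `cog_centroid` and the box convention are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def cog_centroid(image, box):
    """Intensity-weighted centroid (x, y) of the pixels inside a bounding box.

    box = (x0, y0, x1, y1) in pixel coordinates, half-open (assumed convention).
    """
    x0, y0, x1, y1 = box
    patch = image[y0:y1, x0:x1].astype(np.float64)
    total = patch.sum()
    if total <= 0:
        raise ValueError("bounding box contains no signal")
    # Coordinate grids in full-image coordinates, same shape as the patch.
    ys, xs = np.mgrid[y0:y1, x0:x1]
    cx = (xs * patch).sum() / total
    cy = (ys * patch).sum() / total
    return cx, cy
```

Because only the pixels inside each detected box are visited, the cost scales with the box area rather than with the full image, which is the saving the abstract attributes to avoiding a sliding centroiding window.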
Application-specific image and video processing techniques for security and surveillance are a set of algorithms and techniques used to process and analyze images and videos captured by security and sur...
Object recognition in photographs has only become possible with the advent of modern deep learning (DL) techniques. Following their success in other fields, DL techniques are increasingly being applied to a broad variety of ...
In the high-speed mobile environment supported by the fifth-generation mobile communication technology, higher vehicle speeds, more frequent switching and wider bandwidth make the design of high-speed mobile communica...
A very common sleep disorder is apnea, a dangerous interruption of breathing during snoring. In this study, we explore how medical image processing can be a useful and accurate method for sleep di...
ISBN (print): 9781665468282
This article considers a new data transmission protocol between embedded computers and the fog computing environment for image processing. The possibilities of using a fog environment for intelligent video processing are examined, and the problems of existing transmission protocols are identified. A comparison of existing video processing systems is given. The place of embedded computers in intelligent video surveillance systems and ways of upgrading these systems are considered. Data from the devices is collected using embedded computers and visualized using IoT technologies. The developed data transmission protocol allows images to be processed in the fog and/or cloud for intelligent video surveillance systems. It assumes packet transmission of series of frames with additional information, which are processed using machine learning algorithms and neural networks. The effectiveness of the new systems is shown. As an example, the problem of processing video from cameras used in the subway to detect damage to the escalator tape and steps is considered. Image processing systems that implement the proposed protocol can be used not only in the subway, but also in many other areas where Internet of Things technologies are supported.
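The abstract does not specify the packet layout, so the following is only a hypothetical sketch of what "packet transmission of a series of frames with additional information" might look like. Every field here (the `FOGV` magic tag, camera id, timestamp, frame count, and per-frame length prefix) is an assumption made for illustration and is not the authors' protocol.

```python
import struct

# Hypothetical header: magic tag, camera id, first-frame timestamp (ms), frame count.
# "!" selects network byte order with no padding, so the header is 16 bytes.
HEADER = struct.Struct("!4sHQH")
MAGIC = b"FOGV"

def pack_frames(camera_id, timestamp_ms, frames):
    """Serialize a series of already-encoded frames (bytes) into one packet."""
    parts = [HEADER.pack(MAGIC, camera_id, timestamp_ms, len(frames))]
    for frame in frames:
        parts.append(struct.pack("!I", len(frame)))  # 4-byte length prefix
        parts.append(frame)
    return b"".join(parts)

def unpack_frames(packet):
    """Inverse of pack_frames: returns (camera_id, timestamp_ms, frames)."""
    magic, camera_id, timestamp_ms, count = HEADER.unpack_from(packet, 0)
    if magic != MAGIC:
        raise ValueError("not a frame packet")
    offset = HEADER.size
    frames = []
    for _ in range(count):
        (length,) = struct.unpack_from("!I", packet, offset)
        offset += 4
        frames.append(packet[offset:offset + length])
        offset += length
    return camera_id, timestamp_ms, frames
```

Grouping several frames under one header lets a fog node receive the metadata once per series rather than once per frame; whether the actual protocol makes the same trade-off is not stated in the abstract.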