Although AI in medicine is an extremely rich research field, it has been much slower to reach real-world clinical settings than other applications of AI such as natural language processing (NLP) and image processing/generation. Often the stakes of failure are more dire, access to private and proprietary data is more costly, and the burden of proof required by expert clinicians is much higher. Beyond these barriers, the typical data-driven approach to validation is interrupted by the need for expertise to analyze results. Whereas the results of a trained ImageNet or machine translation model are easily verified by a computational researcher, analysis in medicine can be far more demanding of multidisciplinary expertise. AI in medicine is motivated by a great demand for progress in health care, but carries an even greater responsibility for high accuracy, model transparency, and expert validation. This thesis develops machine and deep learning techniques for medical image enhancement, patient outcome prognosis, and minimally invasive robotic surgery awareness and augmentation. Each of the works presented was undertaken in direct collaboration with medical domain experts, and the efforts could not have been completed without them. In pursuing medical image enhancement, we worked with radiologists, neuroscientists, and a neurosurgeon. In patient outcome prognosis, we worked with clinical neuropsychologists and a cardiovascular surgeon. For robotic surgery, we worked with surgical residents and a surgeon expert in minimally invasive surgery. Each of these collaborations guided priorities for problem and model design, analysis, and long-term objectives that ground this thesis as a concerted effort towards clinically actionable medical AI. The contributions of this thesis focus on three specific medical domains. (1) Deep learning for medical brain scans: developed processing pipelines and deep learning models for image annotation, registration, segmentation and diagnosis in both tr
Parts assembly clearance measurement is trending towards high-precision, noncontact methods. This work aims to measure clearance by image processing based on machine vision. The machine vision system is to highlight ...
ISBN: 9784901122207 (print)
3D pose estimation based on a monocular camera can be applied to various fields such as human-computer interaction and human action recognition. As a two-stage 3D pose estimator, VideoPose3D achieves state-of-the-art accuracy. However, because of the limitation of two-stage processing, image information is partially lost when 2D poses are mapped to 3D space, which limits the final accuracy. This paper proposes an image-assisting pose estimation model and a back-projection based offset generating module. The image-assisting pose estimation model consists of a 2D pose processing branch and an image processing branch. Image information is processed to generate an offset that refines the intermediate 3D pose produced by the 2D pose processing network. The back-projection based offset generating module projects the intermediate 3D poses to 2D space and calculates the error between the projection and the input 2D pose. Combining this error with the extracted image features, the neural network generates an offset to decrease the error. In evaluation, the accuracy on each action of the Human3.6M dataset improves by an average of 0.9 mm over the VideoPose3D baseline.
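The back-projection step described in this abstract can be illustrated with a minimal sketch. It assumes a simple pinhole camera without lens distortion; the function names, the intrinsics (`focal`, `center`), and the toy pose are hypothetical and not taken from the paper, whose actual projection model may differ.

```python
import numpy as np

def project_to_2d(pose_3d, focal, center):
    """Pinhole projection of (J, 3) camera-space joints to (J, 2) pixel coordinates."""
    xy = pose_3d[:, :2] / pose_3d[:, 2:3]  # perspective divide by depth
    return xy * focal + center

def back_projection_error(pose_3d, pose_2d, focal, center):
    """Per-joint 2D residual between the input 2D pose and the reprojected 3D pose."""
    return pose_2d - project_to_2d(pose_3d, focal, center)

# Hypothetical intrinsics and a toy 3-joint pose (metres, camera space).
focal = np.array([1000.0, 1000.0])
center = np.array([500.0, 500.0])
pose_3d = np.array([[0.0, 0.0, 4.0], [0.2, -0.4, 4.0], [-0.2, 0.4, 5.0]])

# Simulate an input 2D pose that is shifted by (2, -1) pixels from the projection.
pose_2d = project_to_2d(pose_3d, focal, center) + np.array([2.0, -1.0])

error = back_projection_error(pose_3d, pose_2d, focal, center)
print(error)
```

In the model described above, a residual of this kind would be combined with image features and fed to a network that predicts the 3D offset; that learned part is not shown here.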
Nowadays, laser vision systems have allowed the development of different applications such as reverse engineering, manufacturing, navigation systems, and structural health monitoring (SHM). However, most machine vision systems for structural behavior analysis have a restricted field of view, consume high levels of computational resources for image processing, and require special illumination conditions to achieve lower error rates. Therefore, the purpose of this paper is to present a technical vision system (TVS) for structural behavior analysis using dynamic laser triangulation and the k-Nearest Neighbor (k-NN) machine learning regression algorithm. The proposed vision system was tested to demonstrate its practicality: different deformations and displacements were analyzed on real structures under controlled laboratory conditions to ensure the reproducibility of the experiments. The TVS prototype proved to be a reliable option for SHM tasks, balancing precision and operating range without the aforementioned issues.
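As a rough illustration of k-NN regression, the sketch below predicts a value as the mean target of the k nearest training samples. The abstract does not say which features or targets the TVS uses, so the calibration-style data here (raw reading mapped to a reference value) and the name `knn_regress` are assumptions made only for the example.

```python
import numpy as np

def knn_regress(train_x, train_y, query_x, k=3):
    """Predict each query target as the mean target of its k nearest training points."""
    train_x = np.asarray(train_x, dtype=float)
    train_y = np.asarray(train_y, dtype=float)
    query_x = np.asarray(query_x, dtype=float)
    # Pairwise Euclidean distances, shape (n_query, n_train).
    dists = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]  # indices of the k closest samples
    return train_y[nearest].mean(axis=1)

# Hypothetical calibration data: a 1-D raw reading mapped to a reference value.
raw = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
reference = np.array([0.0, 1.1, 1.9, 3.2, 4.0])

pred = knn_regress(raw, reference, np.array([[2.1]]), k=3)
print(pred)
```

The choice of k trades noise suppression against locality: a larger k averages over more neighbors and smooths the prediction, while a smaller k follows the nearest samples more closely.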
This master's thesis focuses on the cutting-edge application of AI in developing intrusion detection systems (IDS) for unmanned aerial vehicles (UAVs) in smart cities. The objective is to address the escalating problem of UAV intrusions, which pose a significant risk to the safety and security of citizens and critical infrastructure. The thesis explores the current state of the art and provides a comprehensive understanding of recent advancements in the field, encompassing both physical and network attacks. The literature review examines various techniques and approaches employed in the development of AI-based IDS. This includes the utilization of machine learning algorithms, computer vision technologies, and edge computing. A proposed solution leveraging computer vision technologies is presented to detect and identify intruding UAVs in the sky effectively. The system employs machine learning algorithms to analyze video feeds from city-installed cameras, enabling real-time identification of potential intrusions. The proposed approach encompasses the detection of unauthorized drones, dangerous UAVs, and UAVs carrying suspicious payloads. Moreover, the thesis introduces a CycleGAN network for image denoising that can translate noisy images to clean images without the need for paired training data. This approach employs two generators and two discriminators, incorporating a cycle consistency loss that ensures the generated images align with their corresponding input images. Furthermore, a distributed architecture is proposed for processing collected images using an edge-offloading approach within the UAV network. This architecture allows flying and ground cameras to leverage the computational capabilities of their IoT peers to process captured images. A hybrid neural network is developed to predict, based on input tasks, the potential edge computers capable of real-time processing. The edge-offloading approach reduces the computational burden on the centralized system a
ISBN: 9781665492577 (print)
In recent years, there has been a sharp increase in the transmission of images to remote servers specifically for the purpose of computer vision. In many applications, such as surveillance, images are mostly transmitted for automated analysis and rarely seen by humans. Using traditional compression for this scenario has been shown to be inefficient in terms of bit rate, likely due to the focus on human-based distortion metrics. Thus, it is important to create specific image coding methods for joint use by humans and machines. One way to create the machine side of such a codec is to perform feature matching of some intermediate layer in a deep neural network performing the machine task. In this work, we explore the effects of the layer choice used in training a learnable codec for humans and machines. We prove, using the data processing inequality, that matching features from deeper layers is preferable in the sense of rate-distortion. Next, we confirm our findings empirically by re-training an existing model for scalable human-machine coding. In our experiments we show the trade-off between the human and machine sides of such a scalable model, and discuss the benefit of using deeper layers for training in that regard.
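One way to read the data processing inequality argument is sketched below. The notation is an assumption introduced for illustration (input image $X$, shallower-layer features $F_i$, deeper-layer features $F_j$ with $j > i$, and a deterministic network mapping $h$ between them); the paper's own statement and proof may be formulated differently.

```latex
% Assumed notation: F_i = g(X) and F_j = h(F_i), so X -> F_i -> F_j is a Markov chain.
\[
  X \;\longrightarrow\; F_i \;\longrightarrow\; F_j
  \quad\Longrightarrow\quad
  I(X; F_j) \;\le\; I(X; F_i).
\]
% Consequence for coding: any bitstream from which F_i can be recovered also
% yields F_j = h(F_i), so the rate needed to convey F_j is at most the rate
% needed to convey F_i.
```

Under these assumptions, a deeper layer retains no more information about the input than a shallower one, which is consistent with the claim that matching deeper features is preferable in the rate-distortion sense.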
Face presentation attacks, also known as spoofing attacks, pose a substantial threat to biometric systems that rely on facial recognition systems, such as access control systems, mobile payments, and identity verifica...
At present, many adaptations of object detection have been developed. Object detection means identifying an object's name and its other characteristics in an image or a video. This field is known to ...
ISBN: 9783030706012; 9783030706005 (print)
This paper describes the development of low-cost software, called Rat Steps, which yields quantitative data (total distance traveled and average speed) as well as the graphical trajectory of an animal in the open field test. This behavioral test is widely used in neuroscience to visualize locomotor impairment following acute brain injury, including stroke, as well as the effect of experimental therapies for these neural disorders. The main tools used for the software development were digital image processing techniques, Python programming, the OpenCV library, and machine learning algorithms, including the Mean Shift method. The software was successfully developed and effectively obtains quantitative parameters from the open field test, which allows several applications in neuroscience research.
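The two quantitative outputs named in this abstract can be computed from a tracked trajectory as in the minimal sketch below. It assumes the tracker has already produced one centroid per frame in calibrated units; the function name `trajectory_metrics`, the frame rate, and the sample coordinates are hypothetical and do not come from Rat Steps itself.

```python
import numpy as np

def trajectory_metrics(positions, fps):
    """Return (total distance, average speed) for an (N, 2) array of per-frame positions."""
    positions = np.asarray(positions, dtype=float)
    # Length of each frame-to-frame displacement.
    steps = np.linalg.norm(np.diff(positions, axis=0), axis=1)
    total_distance = steps.sum()
    duration = (len(positions) - 1) / fps  # elapsed time in seconds
    return total_distance, total_distance / duration

# Hypothetical centroids in centimetres, sampled at 2 frames per second.
centroids = [(0, 0), (3, 4), (3, 10), (11, 16)]
distance, speed = trajectory_metrics(centroids, fps=2.0)
print(distance, speed)
```

In practice the per-frame centroids would come from the tracking stage (for example, a Mean Shift tracker), and pixel coordinates would need to be scaled to physical units before these metrics are meaningful.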
When using traditional phase-shift profilometry for 3D measurement, it is necessary to keep the measured object static during the shooting process. When the measured object is moving, errors will occur if the projecti...