Automatically extracting buildings from remotely sensed imagery has always been a challenging task, given the spectral similarity of buildings to non-building features and the complex structural diversity within the image. Traditional machine learning (ML) based methods rely heavily on a huge number of samples and are best suited to medium-resolution images. Unmanned aerial vehicle (UAV) imagery offers the distinct advantage of very high spatial resolution, which helps improve building extraction by characterizing patterns and structures. However, as the detail becomes finer, the number of images in a UAV dataset also increases manyfold, which requires robust processing algorithms. Deep learning algorithms, specifically Fully Convolutional Networks (FCNs), have greatly improved the results of building extraction from such high-resolution remotely sensed imagery compared with traditional methods. This study proposes a deep learning-based segmentation approach to extract buildings by transferring the learning of a deep Residual Network (ResNet) to the segmentation-based FCN U-Net. This combined dense architecture of ResNet and U-Net (Res-U-Net) is trained and tested for building extraction on the open-source Inria Aerial Image Labelling (IAIL) dataset. This dataset contains 360 orthorectified images with a tile size of 1500 m² each, at 30 cm spatial resolution with red, green and blue bands, covering a total area of 805 km² in select US and Austrian cities. Quantitative assessments show that the proposed methodology outperforms the current deep learning-based building extraction methods. When compared with a single U-Net model for building extraction on the IAIL dataset, the proposed Res-U-Net model improves the overall accuracy from 92.85% to 96.5%, the mean F1-score from 0.83 to 0.88 and the mean IoU metric from 0.71 to 0.80. Results show that such a combination of two deep learning architectures greatly improves the building extract
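The three metrics quoted above (overall accuracy, F1-score and IoU) can be illustrated with a minimal sketch. It assumes binary building masks stored as nested lists of 0/1; the function name `segmentation_metrics` and the toy masks are hypothetical and are not taken from the paper, which reports means over the IAIL test tiles.

```python
def segmentation_metrics(pred, truth):
    """Overall accuracy, F1 and IoU for two binary masks (lists of rows of 0/1).

    Hypothetical helper for illustration; 1 marks a building pixel.
    """
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # building predicted, building present
            elif p:
                fp += 1          # building predicted, background present
            elif t:
                fn += 1          # building missed
            else:
                tn += 1          # background correctly rejected
    total = tp + fp + fn + tn
    f1_denominator = 2 * tp + fp + fn
    iou_denominator = tp + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "f1": 2 * tp / f1_denominator if f1_denominator else 0.0,
        "iou": tp / iou_denominator if iou_denominator else 0.0,
    }


# Toy 2x3 masks, invented for the example.
predicted = [[1, 1, 0],
             [0, 1, 0]]
reference = [[1, 0, 0],
             [0, 1, 1]]
metrics = segmentation_metrics(predicted, reference)
print(metrics)
```

For a single pair of masks the two overlap metrics are tied together by IoU = F1 / (2 - F1), so they always move in the same direction; means taken over many tiles need not satisfy this identity exactly.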
ISBN (digital): 9798350368741
ISBN (print): 9798350368758
In real-world datasets, leveraging low-rank and sparsity properties enables the development of efficient algorithms across a diverse array of data-related tasks, including compression, compressed sensing, and matrix completion. Notably, these two properties often coexist in certain real-world datasets, especially in Boolean datasets and quantized real-valued datasets. To harness the advantages of low rank and sparsity simultaneously, we adopt a technique inspired by compressed sensing and Boolean matrix completion. Our approach entails compressing a low-rank sparse Boolean matrix by performing inner product operations with a randomly generated Boolean matrix. We then propose decoding algorithms based on message-passing techniques to recover the original matrix. Our experiments demonstrate superior recovery performance of our proposed algorithms compared to Boolean matrix completion, with equal measurement requirements.
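The compression step described above can be sketched as follows. This is only an illustration under stated assumptions: the inner product is taken over the Boolean semiring (AND for multiplication, OR for addition), the matrix sizes and densities are invented, and the helper names `boolean_matmul` and `random_boolean_matrix` are hypothetical. The abstract does not specify the arithmetic or the message-passing decoder, so no decoder is shown.

```python
import random


def boolean_matmul(a, b):
    """Boolean matrix product: entry (i, j) is OR over k of (a[i][k] AND b[k][j])."""
    inner = len(b)
    cols = len(b[0])
    return [
        [int(any(row[k] and b[k][j] for k in range(inner))) for j in range(cols)]
        for row in a
    ]


def random_boolean_matrix(rows, cols, density, rng):
    """Matrix of independent 0/1 entries, each equal to 1 with probability `density`."""
    return [[int(rng.random() < density) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Assumed setup: an 8x6 Boolean matrix of Boolean rank at most 2,
# built from sparse factors so that the product is sparse as well.
left_factor = random_boolean_matrix(8, 2, 0.3, rng)
right_factor = random_boolean_matrix(2, 6, 0.3, rng)
original = boolean_matmul(left_factor, right_factor)

# Compression: 4 Boolean inner-product measurements per column instead of 8 entries.
measurement_matrix = random_boolean_matrix(4, 8, 0.25, rng)
compressed = boolean_matmul(measurement_matrix, original)

print(len(original), "x", len(original[0]), "->", len(compressed), "x", len(compressed[0]))
```

Because OR saturates at 1, each measurement only records whether the selected rows contain any 1 in a given column, which is why recovery needs a dedicated decoder rather than ordinary linear algebra.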
In the task of infrared weak and small target recognition, in order to improve the image quality and solve the problem of poor learning ability of convolutional neural network (CNN) due to the imbalance of positive an...
ISBN (print): 9789492859280
Self-driving cars (a.k.a. Autonomous Vehicles) have many challenges to tackle before they can be fully deployed on our roads and in our cities. A critical one, which has been somewhat neglected until recently, is to consider the driver in the system-user loop of vehicle performance. The purpose here is to tackle some of the pending challenges involved in scaling up the level of autonomy of these systems. We have designed two user-vehicle experiences at two different sites with a common methodology that serves as an umbrella to collect all the features required to model the driver-user. These two sites allow us to contrast and fine-tune this modelling issue. The work follows a Learning Apprentice approach, where both the user behaviour and the system behaviour are learned and improved in a symbiotic ecosystem. This paper focuses on discussing the advantages of this approach and the main issues that require further research.
ISBN (digital): 9798350368413
ISBN (print): 9798350368420
The “Interactive Sign Language Learning System” is an application designed to support sign language learners. The system encompasses several key features, including sign language alphabet and word recognition, text-to-action conversion for learners, multi-language support, and integrated voice output. The system uses image processing and machine learning techniques to interpret hand gestures and movements. For sign language alphabet and word recognition, a combination of computer vision algorithms, possibly including convolutional neural networks (CNNs) or recurrent neural networks (RNNs), may be employed to analyze input images or video streams and classify them into corresponding sign language symbols or words. Text-to-action conversion involves mapping textual input to corresponding sign language actions or gestures, possibly using natural language processing (NLP) algorithms to understand the semantics of the text and generate appropriate sign language representations. The system’s accuracy, measured in terms of correctly recognized sign language symbols or words, would depend on the effectiveness of the algorithms employed and the quality of the training data. With rigorous development and training, the system aims to achieve high accuracy in sign language recognition and text-to-action conversion tasks.
ISBN (digital): 9798350360660
ISBN (print): 9798350360677
Visual servo image processing for industrial robots requires highly autonomous and intelligent robotic manipulators, with the goal of performing manipulation tasks independently, without human intervention. However, existing approaches have limited efficiency at large scale and are sensitive to noise in the input data, which affects classification accuracy. This paper proposes a Multi-Direction Strategy with Honey Badger Algorithm (MHBA) combined with a Convolutional Neural Network (CNN) for classification; MHBA explores the hyperparameter space of the CNN effectively, which helps improve classification accuracy. The ability of MHBA to adapt and explore multiple directions in the parameter space keeps the CNN efficient under variation in the input data. Applied to a large-scale CNN model, MHBA efficiently narrows down the search space while balancing exploration and exploitation. Pre-processing with linear transformations such as translation or rotation helps adjust the image dimensions while maintaining the essential content. The proposed MHBA-CNN technique for industrial robots achieves a Mean Absolute Error (MAE) of 2.03, a Mean Absolute Scaled Error (MASE) of 3.45, and an R² score of 2.45 on the raw dataset. Existing techniques such as Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) are evaluated for comparison with the proposed method.
The problem of visualization, recognition, classification of images of micro-objects, in particular, pollen grains, unicellular organisms, fingerprints based on the definition of their variety, belonging to a class, t...
ISBN (digital): 9798331518752
ISBN (print): 9798331518769
The correlation between the wavelet shape and the characteristic ECG section has been confirmed, and wavelets for the identification of P, R, and T waves have been experimentally established. An algorithm for identifying R waves in an ECG has been developed and its operability has been verified. Using this algorithm, the dynamics of the heartbeat period are plotted as a function of time (period number). To identify the P and T waves, the wavelet image of the real ECG is constructed, and the shape of this image is found to be distorted by the influence of the QRS complex and neighboring waves. The QRS complexes are then removed (filtered) from the ECG by computer, and the wavelet image of the remaining part is calculated. It is established that the positions of the P- and T-wave maxima can be determined from the wavelet image of the filtered ECG. The theoretical and practical value of this study lies in a new direction of wavelet analysis of cardiac signals: several wavelets of a special form are applied in succession to identify characteristic sections of the ECG in order to improve diagnostics. The proposed method also assumes intermediate filtering of the identified sections when moving from wavelet to wavelet. The development was carried out within the framework of computer generation of arrays of digital functions of the state of the heart, suitable for training artificial intelligence systems and for diagnostics using such systems. The practical results obtained can be applied in the development of a digital expert diagnostic system or a specific technical device.
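The R-wave step and the heartbeat-period series can be sketched roughly as below. The abstract does not give the "special form" wavelets, so this sketch swaps in a standard Mexican-hat (Ricker) wavelet at a single scale; the synthetic signal, the threshold ratio, the minimum peak spacing and all function names are assumptions made for illustration only.

```python
import math


def ricker(half_width, scale):
    """Mexican-hat wavelet sampled at integer offsets -half_width..half_width."""
    return [
        (1 - (t / scale) ** 2) * math.exp(-((t / scale) ** 2) / 2)
        for t in range(-half_width, half_width + 1)
    ]


def wavelet_response(signal, scale):
    """Correlate the signal with one Ricker wavelet (zero padding at the edges)."""
    half_width = 4 * scale
    kernel = ricker(half_width, scale)
    n = len(signal)
    response = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = i + k - half_width
            if 0 <= j < n:
                acc += weight * signal[j]
        response.append(acc)
    return response


def detect_peaks(response, threshold_ratio=0.5, min_distance=10):
    """Indices of local maxima above a fraction of the global maximum."""
    threshold = threshold_ratio * max(response)
    peaks = []
    for i in range(1, len(response) - 1):
        is_local_max = response[i] >= response[i - 1] and response[i] > response[i + 1]
        far_enough = not peaks or i - peaks[-1] >= min_distance
        if is_local_max and response[i] >= threshold and far_enough:
            peaks.append(i)
    return peaks


# Synthetic stand-in for an ECG: narrow bumps every 50 samples (not real data).
true_positions = [30, 80, 130, 180]
ecg = [
    sum(math.exp(-((i - c) ** 2) / (2 * 2.0 ** 2)) for c in true_positions)
    for i in range(200)
]

peaks = detect_peaks(wavelet_response(ecg, scale=2))
rr_intervals = [b - a for a, b in zip(peaks, peaks[1:])]  # heartbeat period per beat
print(peaks, rr_intervals)
```

Plotting `rr_intervals` against its index would correspond to the graph of heartbeat period versus period number mentioned above. On a real recording the scale and threshold would have to be tuned, and P- and T-wave detection would need the QRS removal step that this sketch omits.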
ISBN (digital): 9798331585600
ISBN (print): 9798331585617
The smart elevator system adapts elevator stops to actual occupancy in real time. In the proposed system, computer vision and object detection techniques determine whether people are standing at a floor before a stop. A camera covering the elevator lobby captures an image, in which persons are recognized using machine learning algorithms to make the required decision about the elevator's halt. If there are people outside the elevator, the door opens to admit them; otherwise the elevator skips the floor, making movement faster and reducing travel time. The decision draws on image recognition models such as YOLO (You Only Look Once) or SSD (Single Shot MultiBox Detector). By using a camera to monitor the outside area, the system makes a binary decision either to pull the emergency brake and open the door or to proceed to the next desired floor. This project sets a foundation for future improvements in smart elevator systems, leveraging IoT integration, advanced machine learning models, and cloud-based real-time data processing to further enhance the intelligence and responsiveness of the system.
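The binary stop-or-skip decision described above can be sketched independently of the detector. The sketch assumes the detector (YOLO, SSD, or any other) has already produced a list of labelled detections per floor; the dictionary layout, the 0.5 confidence threshold and the function names are hypothetical and are not specified in the abstract.

```python
def should_stop(detections, min_confidence=0.5):
    """True if at least one sufficiently confident 'person' detection is present.

    `detections` is an assumed format: a list of dicts with 'label' and 'confidence'.
    """
    return any(
        d["label"] == "person" and d["confidence"] >= min_confidence
        for d in detections
    )


def plan_stops(detections_by_floor, min_confidence=0.5):
    """Floors (in the given order) where the elevator should open its door."""
    return [
        floor
        for floor, detections in detections_by_floor
        if should_stop(detections, min_confidence)
    ]


# Invented lobby snapshots for three floors.
lobby_snapshots = [
    (2, [{"label": "person", "confidence": 0.91}]),
    (3, [{"label": "plant", "confidence": 0.88}]),        # nobody waiting
    (4, [{"label": "person", "confidence": 0.32}]),       # too uncertain to count
]
print(plan_stops(lobby_snapshots))
```

Keeping the decision rule separate from the detector makes the confidence threshold easy to tune: a lower value skips fewer floors at the cost of more unnecessary stops.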
Glaucoma is the leading cause of visual impairment after cataract, and the only way to combat it is to detect it early. To address this fundamental issue, it is imperative to develop a system that works effectively without a lot of equipment or qualified medical personnel and that takes less time. A Computer-Aided Diagnosis (CAD) system, which employs different algorithms for medical image processing and analysis, can assist in achieving this. One way to diagnose glaucoma is to calculate the Optic Cup to Optic Disc ratio (CDR), which can be done with the help of CAD algorithms. In medical image processing, the primary focus is on image segmentation and classification in order to obtain a result. This paper explores the best-known CNN model, U-Net, for segmenting the Optic Disc and Optic Cup from a fundus image, and Logistic Regression, a classification model, to determine a relationship between these two terms rather than relying on previously used CDR formulas.
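For context, the conventional CDR calculation that the paper contrasts with its Logistic Regression step can be sketched from two segmentation masks. The sketch assumes the commonly used vertical definition (vertical cup diameter divided by vertical disc diameter) and binary masks stored as nested lists; the function names and toy masks are hypothetical, and this is not the paper's classifier.

```python
def vertical_extent(mask):
    """Number of rows spanned by the mask, from its first to its last non-empty row."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return rows[-1] - rows[0] + 1 if rows else 0


def vertical_cdr(cup_mask, disc_mask):
    """Vertical cup-to-disc ratio from binary masks (1 marks cup or disc pixels)."""
    disc_height = vertical_extent(disc_mask)
    if disc_height == 0:
        raise ValueError("disc mask is empty; the ratio is undefined")
    return vertical_extent(cup_mask) / disc_height


# Toy 6x5 masks standing in for U-Net outputs (invented for the example).
disc = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
cup = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
print(vertical_cdr(cup, disc))
```

A fixed cut-off on this ratio is the kind of hand-set formula the abstract refers to; a learned classifier can instead take the cup and disc measurements as input features.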