In the dynamic field of machine learning, foundation models have recently gained prominence, particularly for their application in natural language processing and computer vision. The foundational Segment Anything Mod...
Unlike conventional frame-based cameras, which form images by sampling all pixels within the duration of a global or rolling shutter, a pixel in an event camera can be triggered independently when the change in log intensity of the scene luminance at that pixel exceeds a threshold. This unique feature provides several advantages over conventional sensors, including high dynamic range (HDR) (≈120 dB), high temporal rate (≈10,000 Hz), low latency (<1 ms), and low power requirements (≈10 mW). These properties make event cameras excellent candidates for applications such as high-speed photography, HDR image reconstruction, object tracking, depth estimation, simultaneous localization and mapping, and surveillance and monitoring. Despite their potential, the asynchronous and spatially sparse nature of events poses challenges to event processing and interpretation. This is because most advanced image processing and computer vision algorithms are designed to work with conventional image formats, not with temporally dense streams of asynchronous pixel events (i.e., the event stream). Although emerging techniques in supervised machine learning show promise, continued and rapid progress relies on the availability of labeled event datasets, which are scarce and difficult to produce. Moreover, generating reliable events for training models is challenging because event generation is scene dependent, which is further complicated by varying illumination and relative motion. In this thesis, we attempt to address these limitations with a novel imaging paradigm involving the capture of frames from a conventional frame-based camera that has been spatially aligned and temporally synchronized with an event sensor. Our active illumination source allows us to generate events more consistently even under challenging illumination and motion in the scene. We demonstrate the feasibility of such a setup for a mobile eye-tracking system and acquire subpixel and microsecond accurate spatiotemporal
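The per-pixel triggering rule described in this abstract can be illustrated with a small simulation. The following is only a sketch of the commonly used idealized event model, not the thesis's own pipeline: the function name `generate_events`, the contrast threshold of 0.2, and the small `eps` offset are all assumptions made for illustration.

```python
import numpy as np

def generate_events(ref_log, frame, threshold=0.2, eps=1e-6):
    """Idealized event model: a pixel fires when its log intensity has
    changed by at least `threshold` since that pixel's last event.

    ref_log   : per-pixel log intensity stored at each pixel's last event
    frame     : current linear intensity image (non-negative)
    threshold : hypothetical contrast threshold
    Returns (polarity, new_ref_log); polarity is +1, -1, or 0 per pixel.
    """
    log_i = np.log(frame.astype(np.float64) + eps)  # eps avoids log(0)
    diff = log_i - ref_log

    polarity = np.zeros(frame.shape, dtype=np.int8)
    polarity[diff >= threshold] = 1    # brightness increased: ON event
    polarity[diff <= -threshold] = -1  # brightness decreased: OFF event

    # Only pixels that fired update their reference level.
    new_ref_log = np.where(polarity != 0, log_i, ref_log)
    return polarity, new_ref_log

# Usage: start from a uniform frame, then present a changed frame.
first = np.ones((2, 2))
ref = np.log(first + 1e-6)
second = np.array([[1.0, 2.0], [0.5, 1.1]])
polarity, ref = generate_events(ref, second)
```

Because each pixel keeps its own reference level, pixels fire independently of one another, which is what makes the resulting stream sparse in space and asynchronous in time.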
This research study focuses on developing an advanced machine learning system for the accurate classification of brain tumors using MRI scans. Traditional manual diagnosis consumes more time in deciding the type of br...
Satellite image classification is crucial in various applications such as urban planning, environmental monitoring, and land use *** this study, the authors present a comparative analysis of different supervised and unsupervised learning methods for satellite image classification, focusing on a case study in Casablanca using Landsat 8 *** research aims to identify the most effective machine-learning approach for accurately classifying land cover in an urban *** methodology used consists of the pre-processing of Landsat imagery data from Casablanca city, the authors extract relevant features and partition them into training and test sets, and then use random forest (RF), support vector machine (SVM), classification and regression tree (CART), gradient tree boost (GTB), decision tree (DT), and minimum distance (MD) *** a series of experiments, the authors evaluate the performance of each machine learning method in terms of accuracy and Kappa *** work shows that random forest is the best-performing algorithm, with an accuracy of 95.42% and 0.94 Kappa *** authors discuss the factors of their performance, including data characteristics, accurate selection, and model influencing.
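For readers unfamiliar with the two scores reported in this abstract, overall accuracy and Cohen's Kappa can both be derived from a confusion matrix. The sketch below is a generic illustration, not the authors' code: the function name `accuracy_and_kappa` and the 2×2 example matrix are made up, and the convention of rows as reference classes and columns as predicted classes is an assumption.

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's Kappa from a square
    confusion matrix (rows: reference class, columns: predicted class)."""
    cm = np.asarray(confusion, dtype=np.float64)
    total = cm.sum()

    # Observed agreement: fraction of samples on the diagonal.
    observed = np.trace(cm) / total

    # Agreement expected by chance, from the row and column marginals.
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total**2

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Usage with a made-up two-class confusion matrix.
example = [[45, 5],
           [10, 40]]
acc, kappa = accuracy_and_kappa(example)
```

Kappa discounts the agreement a classifier would reach by chance given the class frequencies, which is why it is reported alongside raw accuracy in land-cover studies.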
ISBN: (print) 9798350372113; 9798350372106
Machine vision has extensive applications in agriculture, including developing efficient land management, precise fruit ripeness grading, and plant disease detection. Palm leaves are distinct in their botanical characteristics and have diverse users. However, they are susceptible to diseases, making early detection crucial for maintaining their health and productivity. This study includes preparing balanced data with classes of palm leaf diseases through data augmentation and implementing convolutional neural networks (CNN) in multi-image classification using the processed dataset. Aside from CNN, transfer learning was applied using ResNet152-v2, VGG19, DenseNet201, MobileNet-v2, and InceptionResNetv2 layers to perform image classification. The CNN and ImageNet pre-trained functional-layer models require an average execution time of 1492 s, with average final model losses below 0.22 and average final model accuracies above 95%. The average precision, recall, and F1-score in predicting the brown spots, healthy, and white scale classes are more than 90% for all applied functional layers.
ISBN: (print) 9798350336672
Nutrition is an important aspect of public health, and in recent years, there has been increasing interest in the nutritional information of food. However, processing this information can be a challenging task due to the large amounts of data involved. Machine learning (ML) has emerged as a useful tool to address this challenge. In this paper, we present a data resource that uses the FoodData Central (FDC) nutrient database to explore the combination of food images, nutritional information, and text with ML. We begin by providing an overview of machine learning and its applications in nutrition research, including the use of ML algorithms to identify food intake patterns, predict nutrient intakes, and evaluate dietary guidelines. We then describe the features and applications of Inception-v3, Inception-v4, and MobileNetv2 in ML, highlighting how these models can be used to extract nutritional information from food images. To further explore the potential of ML in nutrition research, we developed a quick search app that integrates images, text, and nutritional information. This app uses image recognition algorithms to identify food items in pictures, and text processing techniques to extract food information from text data. Users can simply take a picture of a food item and the app will provide the details of its nutritional content. This app can be used to facilitate the study of food and nutrition information and help promote healthier eating habits. In conclusion, the development of data resources and apps that use ML algorithms can be particularly helpful in processing large amounts of nutrition data and making it more accessible to the public. By harnessing the power of ML, we can advance our understanding of the relationship between diet and health, and ultimately work towards improving public health outcomes.
Background: Recent advances in signal processing technology and computational power have increased the attention towards computer vision-based techniques in diverse applications such as agriculture, food processing, biomedical, and military. Especially in agriculture and food processing, computer vision can replace most of the manual methods for screening seed, grain and food quality. Scope and approach: The objective of the present study is to review the recent advancements in computer vision techniques for predicting the quality of various raw materials and food products. This review paper is focused on the quality determination of grains, vegetables, fruits, beverages, meat, seafood and edible oils using Digital Image Processing (DIP). Several studies have reported the successful application of DIP techniques for feature extraction, classification and quality prediction of foods. DIP algorithms are used to extract the significant features from images, which are then used as input for machine learning (ML) algorithms to classify them based on different criteria. These feature extraction methods have been improved by Deep Learning (DL) algorithms. Features can be automatically extracted by DL algorithms, resulting in higher accuracy. DL algorithms require huge data management and computational resources, which can be a major limitation. Key findings and conclusion: A significant body of literature is available on quality estimation of food products using computer vision algorithms, but these methods lack commercial exploitation. Android-based applications have not yet been developed for this specific purpose. User-friendly, low-cost and portable devices equipped for quality estimation would be helpful for rapid quality measurement of food products in real time.
ISBN: (print) 9798350395334; 9798350395327
Facial recognition technology has gained widespread use in various applications, raising concerns about the vulnerability of these systems to spoofing attacks. This study presents an implementation of face spoofing detection using machine learning techniques. The work uses a comprehensive framework that encompasses data collection, preprocessing, feature extraction, and model training. A diverse dataset comprising genuine and spoofed facial images, representing various spoofing techniques, is utilized. Feature extraction leverages Convolutional Neural Networks (CNNs) to capture discriminative facial features. The selected machine learning model is trained and fine-tuned, with a focus on achieving robustness against evolving spoofing methods. The evaluation of the implemented system involves rigorous testing on a separate dataset, using metrics such as accuracy, precision, recall, and F1-score. The study investigates post-processing techniques and considerations for real-time deployment, ensuring the practical applicability of the CNN-based method. Cross-validation is performed to evaluate the model's generalization capabilities, and the deployment phase explores integration into real-world scenarios. Ethical considerations, user feedback, and compliance with data privacy regulations are integral components of the study.
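Precision, recall, and the F1-score are standard measures for a binary genuine-versus-spoof evaluation of the kind this abstract describes. The sketch below shows how they follow from label pairs; it is a generic illustration rather than the study's code, and the function name `precision_recall_f1`, the label encoding (1 for spoof, 0 for genuine), and the example labels are all assumptions.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Compute precision, recall, and F1 for one positive class.

    y_true, y_pred : equal-length sequences of class labels
    positive       : label treated as the positive class (assumed: 1 = spoof)
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Usage with made-up labels.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In a spoofing context, low recall means spoofed faces slip through, while low precision means genuine users are wrongly rejected, so the two are usually reported together.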
ISBN: (print) 9783031648809; 9783031648816
Face recognition plays a crucial role in various applications, ranging from security to personal convenience. Recent advancements have emphasized the importance of recognizing individuals based on age-related facial features within this domain. This paper presents a comprehensive evaluation of two deep learning architectures for age-based face recognition: Siamese Convolutional Networks (SCNs) and Vision Transformers (ViTs). Convolutional Neural Networks (CNNs), which are critical in modern face recognition, serve as the backbone for SCNs. SCNs are specifically designed to detect similarities between input pairs by emphasising local features crucial for age-related distinctions. In contrast, ViTs, which adapt the Transformer architecture initially developed for natural language processing, have demonstrated promising performance in image recognition, showcasing their aptitude for capturing global image context. This work investigates the performance of these distinct architectures in discerning age-related variations within facial data features. Performance comparisons were conducted on three established SCN models and two ViT architectures. The results revealed that the optimal SCNs primarily focused on the mouth, nose, and eye regions, indicating their reliance on local features for age estimation. Interestingly, the ViT models achieved superior performance despite lacking explicit feature localization. This suggests that a holistic understanding of the facial context may be more effective than focusing solely on isolated features for age-based recognition.
The CNN is inspired by neurons in the primary visual cortex (V1). It is a typical deep learning technique and can help teach machines how to see and identify objects. Over the most recent decade, deep learning has developed rapidly and has been widely used in fields such as computer vision and natural language processing. As the representative algorithm of deep learning, the Convolutional Neural Network (CNN) has been regarded as a breakthrough of historic significance in image processing and visual recognition tasks since the astonishing results achieved in the ImageNet Large Scale Visual Recognition Competition (ILSVRC). Unlike methods based on handcrafted features, CNN models can build high-level features from low-level ones in a data-driven fashion and have displayed great potential in medical image analysis, in tasks such as segmentation and identification of histological images, lesion detection, tissue classification, etc. This paper provides a review of CNNs from the perspectives of their basic mechanism, structure, typical architectures and main applications in medical image analysis, through analyzing over 100 references from Google Scholar, PubMed, Web of Science and various sources published from 1958 to 2020.