ISBN: 9781713899921 (print)
Humans outperform object recognizers despite the fact that models perform well on current datasets, including those explicitly designed to challenge machines with debiased images or distribution shift. This problem persists, in part, because we have no guidance on the absolute difficulty of an image or dataset, which makes it hard to objectively assess progress toward human-level performance, to cover the range of human abilities, and to increase the challenge posed by a dataset. We develop a dataset difficulty metric, MVT (Minimum Viewing Time), that addresses these three problems. Subjects view an image that flashes on screen and then classify the object in the image. Images that require only brief flashes to recognize are easy; those that require seconds of viewing are hard. We compute the image difficulty distributions of ImageNet and ObjectNet and find that both significantly undersample hard images. Nearly 90% of current benchmark performance is derived from images that are easy for humans. Rather than hoping that we will make harder datasets, we can for the first time objectively guide dataset difficulty during development. We can also subset recognition performance as a function of difficulty: model performance drops precipitously while human performance remains stable. Difficulty provides a new lens through which to view model performance, one which uncovers new scaling laws: vision-language models stand out as being the most robust and human-like, while all other techniques scale poorly. We release tools to automatically compute MVT, along with image sets that are tagged by difficulty. Objective image difficulty has practical applications (one can measure how hard a test set is before deploying a real-world system) and scientific applications, such as discovering the neural correlates of image difficulty and enabling new object recognition techniques that eliminate the benchmark-vs-real-world performance gap.
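The abstract does not spell out how MVT is computed from subject responses, so the following is only a minimal sketch under stated assumptions: each trial is a hypothetical `(image_id, duration_ms, correct)` tuple, and an image's MVT is taken to be the shortest flash duration at which the fraction of correct classifications reaches a hypothetical `threshold`. The function name, the data layout, and the threshold rule are illustrative and are not taken from the paper or its released tools.

```python
from collections import defaultdict


def minimum_viewing_time(trials, threshold=0.5):
    """Return {image_id: shortest duration whose accuracy >= threshold, or None}.

    trials: iterable of (image_id, duration_ms, correct) tuples (assumed layout).
    threshold: assumed accuracy cut-off; the paper may use a different rule.
    """
    # image_id -> duration_ms -> [number correct, number of trials]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for image_id, duration_ms, correct in trials:
        cell = counts[image_id][duration_ms]
        cell[0] += int(correct)
        cell[1] += 1

    mvt = {}
    for image_id, by_duration in counts.items():
        mvt[image_id] = None  # never recognized at any tested duration
        for duration_ms in sorted(by_duration):
            n_correct, n_total = by_duration[duration_ms]
            if n_correct / n_total >= threshold:
                mvt[image_id] = duration_ms
                break
    return mvt
```

Under this sketch, images with a small MVT would count as easy and images with a large MVT (or `None`) as hard, which is one way a test set's difficulty distribution could be tabulated.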
This study proposes a way to detect vitamin deficiency by combining machine learning and image processing. Computer vision enables the system to recognise visual symptoms of specific vitamin deficiencies. The recommen...
This paper uses the machine vision method to identify the skirt module. We have constructed three kinds of machine recognition models of skirt profile processing, structure analysis of style drawing, and size estimati...
Image segmentation is a key task in computer vision and image processing, with important applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among others, and numerous segmentation algorithms are found in the literature. Against this backdrop, the broad success of deep learning (DL) has prompted the development of new image segmentation approaches leveraging DL models. We provide a comprehensive review of this recent literature, covering the spectrum of pioneering efforts in semantic and instance segmentation, including convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the relationships, strengths, and challenges of these DL-based segmentation models, examine the widely used datasets, compare performances, and discuss promising research directions.
ISBN: 9781510667877; 9781510667884 (print)
The studies were carried out using optical metrology methods on a Walter Helicheck inspection machine in reflected light, and a number of images were stored to form a statistical sample. New indicators and criteria for grinding efficiency were established based on image processing of the helical groove of the end mill. As a result, recommendations for the selection of optical control techniques were made for the first time at the intermediate stage of technological preparation for production, in real time, and after processing. In this work, for the first time, we prove the possibility of determining the camera displacement pitch distance during continuous scanning of the profile of a helical surface in a radial section, of determining the measurement accuracy, and of recreating a three-dimensional model of the object. Applying the new algorithm, which uses the Haar wavelet with the new indicators, established that the actual one is located inside the focal zone, which demonstrates the practical applicability of computer vision for monitoring the shape of the helical flute of end mills. The measurement accuracy of the helical flute increased from 4 to 12% along its profile.
Agriculture is often known as the art and science of nurturing soil. It involves preparing plants and animals for use in products. Agriculture is the process of growing crops and rearing animals for human consumption,...
ISBN: 9783031431470; 9783031431487 (print)
With the increasing interest in augmented and virtual reality, visual localization is acquiring a key role in many downstream applications that require a real-time estimate of the user's location from visual streams alone. In this paper, we propose an optimized hierarchical localization pipeline that targets cultural heritage sites, with specific applications in museums. Specifically, we propose to enhance the Structure from Motion (SfM) pipeline for constructing the sparse 3D point cloud by filtering out blurred and near-duplicate images a priori. We also study an improved inference pipeline that merges similarity-based localization with geometric pose estimation to effectively mitigate the effect of strong outliers. We show that the proposed optimized pipeline obtains the lowest localization error on the challenging Bellomo dataset [11]. Our proposed approach keeps both build and inference times bounded, in turn enabling the deployment of this pipeline in real-world scenarios.
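The abstract does not say which criteria are used to filter blurred and near-duplicate images before SfM, so the sketch below swaps in two common stand-ins: the variance of a 4-neighbour Laplacian as a blur score, and a simple average hash compared by Hamming distance for near-duplicates. It assumes grayscale images already resized to one small common shape and stored as nested lists; the function names and both thresholds are hypothetical and would need tuning on real data.

```python
def laplacian_variance(img):
    """Variance of the 4-neighbour Laplacian over interior pixels; low values suggest blur."""
    h, w = len(img), len(img[0])
    responses = [
        img[y - 1][x] + img[y + 1][x] + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x]
        for y in range(1, h - 1)
        for x in range(1, w - 1)
    ]
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


def average_hash(img):
    """One bit per pixel: 1 where the pixel is brighter than the image mean."""
    pixels = [p for row in img for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(int(p > mean) for p in pixels)


def hamming(a, b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def filter_images(images, blur_threshold, max_hamming):
    """Keep names of images that are sharp enough and not close to an already kept image.

    images: list of (name, grayscale image) pairs, all of the same shape (assumed).
    blur_threshold, max_hamming: hypothetical tuning parameters.
    """
    kept, kept_hashes = [], []
    for name, img in images:
        if laplacian_variance(img) < blur_threshold:
            continue  # too blurry under the assumed score
        h = average_hash(img)
        if any(hamming(h, k) <= max_hamming for k in kept_hashes):
            continue  # near-duplicate of an image that was already kept
        kept.append(name)
        kept_hashes.append(h)
    return kept
```

Filtering before reconstruction in this way would shrink the image set passed to SfM, which is consistent with the abstract's stated goal of keeping build time bounded.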
With the population increasing rapidly every day, fulfilling everyone's need for food products (e.g., vegetables, fruits, milk, wheat) has become a major issue because production is limited. Moreover, the consumption of healthy food is a foremost requirement. The major factors that affect the food system include increasing food shortages, decreasing quality, wastage and loss of food products, and limited natural resources. This article addresses the various computer vision and machine learning based techniques used to mitigate these issues. Image processing has become an effective technique for analysis in many research applications. This study focuses on the analysis of image processing based applications in the food products and agriculture fields. Such applications help in decision making, disease prediction, classification, fruit sorting, soil quality measurement, etc. Moreover, a comprehensive review of the various computer vision and statistical approaches used in food production and agriculture is presented; it concludes that Deep Learning (DL) based approaches produce better results, specifically for image processing applications. Additionally, an effort has been made to provide a list of publicly available datasets for related studies.
Mamba, a State Space Model (SSM), has recently shown performance competitive with Convolutional Neural Networks (CNNs) and Transformers in Natural Language Processing and general sequence modeling. Various attempts have...
To achieve the recognition and positioning functions of indoor mobile robots under limited computing power conditions, a method based on color recognition for robot recognition and positioning is proposed. The global ...