Search Results - Inner Mongolia University Library

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Pang, Ziqi; Hu, Zhiyuan; Tokmakov, Pavel; Wang, Yu-Xiong; Hebert, Martial. Affiliations: TuSimple, Beijing, Peoples R China; Univ Calif San Diego, San Diego, CA, USA; Toyota Res Inst, Ann Arbor, MI, USA; UIUC, Champaign, IL, USA; CMU, Pittsburgh, PA, USA

ISBN: (print) 9781665448994

Virtually all of the deep learning literature relies on the assumption of large amounts of available training data. Indeed, even the majority of few-shot learning methods rely on a large set of "base classes" for pre-training. This assumption, however, does not always hold. For some tasks, annotating a large number of classes can be infeasible, and even collecting the images themselves can be a challenge in some scenarios. In this paper, we study this problem and call it the "Small Data" setting, in contrast to "Big Data." To unlock the full potential of small data, we propose to augment the models with annotations for other related tasks, thus increasing their generalization abilities. In particular, we use the richly annotated scene parsing dataset ADE20K to construct our realistic Long-tail Recognition with Diverse Supervision (LRDS) benchmark, by splitting the object categories into head and tail based on their distribution. Following the standard few-shot learning protocol, we use the head classes for representation learning and the tail classes for evaluation. Moreover, we further subsample the head categories and images to generate two novel settings which we call "Scarce-Class" and "Scarce-Image," corresponding respectively to the shortage of training classes and images. Finally, we analyze the effect of applying various additional supervision sources under the proposed settings. Our experiments demonstrate that densely labeling a small set of images can indeed largely remedy the small data constraints. Our code and benchmark are available at https://***/BinahHu/ADE-FewShot.

Keywords: Training; Learning systems; Head; Protocols; Training data; Benchmark testing; Pattern recognition
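
The head/tail split mentioned in the abstract above can be illustrated with a small sketch. The cut-off rule (the `num_head` most frequent categories form the head) and the function name `split_head_tail` are assumptions made for illustration; the abstract does not state the criterion actually used for the LRDS benchmark.

```python
from collections import Counter

def split_head_tail(labels, num_head):
    """Split categories into head (most frequent) and tail (the rest).

    `labels` holds one category name per annotated object.
    `num_head` is a hypothetical cut-off; the real benchmark may use another rule.
    """
    counts = Counter(labels)
    # Rank by descending frequency; break ties by name so the result is deterministic.
    ranked = sorted(counts, key=lambda c: (-counts[c], c))
    return ranked[:num_head], ranked[num_head:]

# Toy label distribution with a long tail.
labels = ["wall"] * 50 + ["chair"] * 30 + ["lamp"] * 3 + ["vase"] * 1
head, tail = split_head_tail(labels, num_head=2)
```

In a few-shot protocol of the kind the abstract describes, `head` would be used for representation learning and `tail` for evaluation.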

Learning Triadic Belief Dynamics in Nonverbal Communication from Videos

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Fan, Lifeng; Qiu, Shuwen; Zheng, Zilong; Gao, Tao; Zhu, Song-Chun; Zhu, Yixin. Affiliation: Univ Calif Los Angeles, Ctr Vis Cognit Learning & Auton, Los Angeles, CA 90024, USA

ISBN: (print) 9781665445092

Humans possess a unique social cognition capability [43, 20]; nonverbal communication can convey rich social information among agents. In contrast, such crucial social characteristics are mostly missing in the existing scene understanding literature. In this paper, we incorporate different nonverbal communication cues (e.g., gaze, human poses, and gestures) to represent, model, learn, and infer agents' mental states from pure visual inputs. Crucially, such a mental representation takes the agent's belief into account so that it represents what the true world state is and infers the beliefs in each agent's mental state, which may differ from the true world states. By aggregating different beliefs and true world states, our model essentially forms "five minds" during the interactions between two agents. This "five minds" model differs from prior works that infer beliefs in an infinite recursion: instead, agents' beliefs are converged into a "common mind" [31, 47]. Based on this representation, we further devise a hierarchical energy-based model that jointly tracks and predicts all five minds. From this new perspective, a social event is interpreted by a series of nonverbal communication and belief dynamics, which transcends the classic keyframe video summary. In the experiments, we demonstrate that using such a social account provides a better video summary on videos with rich social interactions compared with state-of-the-art keyframe video summary methods.

Keywords: Visualization; Computer vision; Three-dimensional displays; Graphical models; Heuristic algorithms; Predictive models; Probabilistic logic

Research on Binocular Real-Time Ranging Method in Window Area

11th International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR) - Pattern Recognition and Computer Vision

Authors: Zhang, Liubin; Wang, Haihui; Li, Jun; Wang, Ziwei; Wang, Meng. Affiliations: Wuhan Inst Technol, Sch Comp Sci & Engn, Wuhan 430205, Peoples R China; Wuhan Inst Technol, Hubei Prov Key Lab Intelligent Robot, Wuhan 430205, Peoples R China

ISBN: (print) 9781510636385

To address difficult target matching and low matching efficiency in binocular measurement, this paper proposes a real-time target feature matching algorithm for binocular stereo vision based on absolute window error minimization (CAEW, Calculate the Absolute Error Window), with the aim of improving the speed and accuracy of measurements. First, the cameras are calibrated using Zhang's calibration method, and the Bouguet algorithm is applied to the final calibration data for binocular stereo vision. Then, the AdaBoost iterative algorithm is used to train the target detector for target recognition. The CAEW algorithm is compared with the commonly used SURF (Speeded-Up Robust Features) algorithm. The experimental evaluation shows that the CAEW algorithm achieves an evaluation score of more than 90%, a significant improvement over the SURF algorithm, and meets the needs of binocular real-time target matching.

Keywords: binocular stereo vision; feature point matching; three-dimensional ranging
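
As a rough illustration of window-based matching by absolute error, the sketch below searches along one row of a rectified image pair for the disparity that minimizes the sum of absolute differences (SAD) over a small window, then converts the disparity to depth with the standard pinhole relation Z = f * B / d. This is a generic SAD matcher, not the paper's CAEW algorithm, whose details the abstract does not give; all function names, the toy images, and the camera parameters are assumptions.

```python
def sad(left, right, row, col_l, col_r, half):
    """Sum of absolute differences between two (2*half+1)^2 windows."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][col_l + dx] - right[row + dy][col_r + dx])
    return total

def match_disparity(left, right, row, col, half=1, max_disp=4):
    """Return the disparity d >= 0 with the smallest window error on this row.

    Assumes a rectified pair and that the window around (row, col) lies
    fully inside the left image.
    """
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        col_r = col - d  # the matching point shifts left in the right image
        if col_r - half < 0:
            break  # window would leave the right image
        cost = sad(left, right, row, col, col_r, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Pinhole stereo relation Z = f * B / d for a rectified pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Toy pair: the left image is the right image shifted by 2 pixels.
width, height = 10, 3
right = [[c * c + r for c in range(width)] for r in range(height)]
left = [[right[r][c - 2] if c >= 2 else 0 for c in range(width)] for r in range(height)]

d = match_disparity(left, right, row=1, col=6)
# Hypothetical camera: 700 px focal length, 0.12 m baseline.
z = depth_from_disparity(focal_px=700.0, baseline_m=0.12, disparity_px=d)
```

Restricting the search to a window around a detected target, rather than the whole image, is what keeps this kind of matching cheap enough for real-time use.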

Machine Vision Based-2D Measurement Method for Industrial Glass

11th International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR) - Pattern Recognition and Computer Vision

Authors: Zhou, Chen; Hong, Hanyu; Zhang, Xiuhua; Zhao, Shuhan; Chen, Pan. Affiliations: Wuhan Inst Technol, Hubei Key Lab Opt Informat & Pattern Recognit, Wuhan 430205, Peoples R China; Wuhan Inst Technol, Hubei Engn Res Ctr Video Image & HD Project, Wuhan 430205, Peoples R China; Wuhan Inst Technol, Sch Elect & Informat Engn, Wuhan 430205, Peoples R China

ISBN: (print) 9781510636385

To achieve efficient, automatic, and accurate measurement, the paper takes the two-dimensional measurement of industrial glass under the experimental *** main contents of this paper include: analyzing the structure and hardware performance parameters of the system; building a measuring platform comprising a computer, a charge-coupled device (CCD) image sensor, a lens, etc.; capturing the glass image with a high-precision camera; preprocessing the glass image data; and acquiring the edge information of the glass. The system uses a second filtering method to filter the image and the Canny operator to extract the edges of the industrial glass, transforms the computer coordinate system into the world coordinate system through a coordinate transformation, and finally calculates the two-dimensional size information of industrial *** system measures the two-dimensional length and width of polygonal glass. The experimental results show that the measurement method meets the accuracy requirements of general industrial measurement and that the detection system is feasible.

Keywords: Machine vision; Image processing; Two-dimensional measurement

Phase error analysis and compensation for motion in high-speed phase measurement profilometry

OSA Continuum, 2021, Vol. 4, No. 4, pp. 1191-1206

Authors: Li, Xuexing; Zhang, Wenhui. Affiliations: Wuxi Inst Technol, Sch Mech Technol, Wuxi 214121, Jiangsu, Peoples R China; Wuxi GNFIR Informat Technol Co Ltd, Wuxi 214000, Jiangsu, Peoples R China

High-speed three-dimensional (3D) measurement is increasingly important in many fields. Phase measurement profilometry (PMP) based on the binary defocusing technique has been applied to high-speed 3D measurement because of its higher measurement resolution and precision and because it breaks the speed limitations of the projector. However, because PMP needs three phase-shifting (3-PS) patterns, motion error is inevitable when measuring dynamic objects. In this research, we construct a complete high-speed 3-PS PMP system and re-derive two motion error models that are clearer than those in Weise's research [Conference on Computer Vision and Pattern Recognition (CVPR) (IEEE, 2007), pp. 1]. Then, we theoretically analyze the effects of the truncation error on the model accuracy, especially when the motion error is higher. To this end, a polynomial-based motion error model obtained by fitting the coefficient matrix of a pre-simulation is proposed. Meanwhile, its corresponding error compensation method, based on local domain estimation with the Nelder-Mead algorithm, is developed. Finally, simulations together with quantitative and qualitative experiments verify the accuracy and effectiveness of the proposed method and demonstrate its improvements over Weise's research. (c) 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

Keywords: High speed imaging; Machine vision; Pattern recognition; Phase measurement; Phase unwrapping; Precision metrology
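
To make the source of motion error in three-step phase shifting concrete, the sketch below recovers the wrapped phase at one pixel from three intensities with nominal shifts of -2π/3, 0, and +2π/3, using the textbook formula φ = atan2(√3 (I1 - I3), 2 I2 - I1 - I3). The motion model here (a constant extra phase change `motion` between consecutive frames) is a simplifying assumption for illustration and is not the error model derived in the paper.

```python
import math

def wrapped_phase(i1, i2, i3):
    """Wrapped phase from three intensities with shifts -2*pi/3, 0, +2*pi/3."""
    return math.atan2(math.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)

def capture(phase, a=0.5, b=0.4, motion=0.0):
    """Simulate the three fringe intensities seen at one pixel.

    `a` is the background intensity and `b` the fringe modulation.
    `motion` is a hypothetical extra phase change between consecutive
    frames caused by object movement (0.0 means a static object).
    """
    step = 2.0 * math.pi / 3.0
    i1 = a + b * math.cos(phase - step - motion)
    i2 = a + b * math.cos(phase)
    i3 = a + b * math.cos(phase + step + motion)
    return i1, i2, i3

true_phase = 1.0
# Static object: the formula recovers the phase up to rounding error.
static_error = wrapped_phase(*capture(true_phase)) - true_phase
# Moving object: the actual shifts differ from the nominal ones, so the phase is biased.
moving_error = wrapped_phase(*capture(true_phase, motion=0.1)) - true_phase
```

The nonzero `moving_error` is the kind of bias that a motion error model and a compensation step aim to remove.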

Delaunay growth algorithm based on point cloud curvature smoothing improvement

11th International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR) - Pattern Recognition and Computer Vision

Authors: Huang, Ruiqi; Hong, Hanyu. Affiliations: Wuhan Inst Technol, Hubei Key Lab Opt Informat & Pattern Recognit, Wuhan 430205, Peoples R China; Wuhan Inst Technol, Hubei Engn Res Ctr Video Image & HD Project, Wuhan 430205, Peoples R China; Wuhan Inst Technol, Sch Elect & Informat Engn, Wuhan 430205, Peoples R China

ISBN: (print) 9781510636385

To meet the requirements of 3D reconstruction in accuracy, reconstruction speed, and algorithm applicability, this paper proposes a Delaunay growth algorithm based on point cloud curvature smoothing. The algorithm first projects a 3D discrete point cloud onto a 2D plane and applies a 2D Delaunay triangulation, which is performed using the empty circle criterion and the max-min angle criterion. Principal component analysis (PCA) is used to estimate the normals of the 3D point cloud and to orient them to the same side, avoiding disordered point cloud normals. Combined with the curvature of the corresponding 3D points, this removes invalid normals caused by invalid points while preserving as much of the point cloud as possible. Finally, the set of candidate points is filtered using the Delaunay constraint criterion and an evaluation function to ensure that the reconstructed triangles approximate Delaunay triangles. The experimental results show that the proposed reconstruction algorithm performs much better than the traditional greedy projection triangulation algorithm and the Poisson algorithm, and that the reconstruction speed is increased by 20%.

Keywords: Point cloud curvature; Delaunay triangulation; Point Cloud Library; PCA
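
The empty circle criterion named in the abstract above can be checked with the standard in-circle determinant: a triangle satisfies the criterion if no other point lies strictly inside its circumcircle. The sketch below is a generic 2D test, not the paper's growth algorithm; the function names are assumptions.

```python
def orientation(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])

def in_circumcircle(a, b, c, d):
    """True if point d lies strictly inside the circumcircle of triangle abc."""
    adx, ady = a[0] - d[0], a[1] - d[1]
    bdx, bdy = b[0] - d[0], b[1] - d[1]
    cdx, cdy = c[0] - d[0], c[1] - d[1]
    det = (
        (adx * adx + ady * ady) * (bdx * cdy - cdx * bdy)
        - (bdx * bdx + bdy * bdy) * (adx * cdy - cdx * ady)
        + (cdx * cdx + cdy * cdy) * (adx * bdy - bdx * ady)
    )
    # The sign convention assumes counter-clockwise order; flip it otherwise.
    if orientation(a, b, c) < 0:
        det = -det
    return det > 0

def satisfies_empty_circle(triangle, points):
    """True if no point other than the triangle's vertices is inside its circumcircle."""
    a, b, c = triangle
    return not any(
        in_circumcircle(a, b, c, p) for p in points if p not in (a, b, c)
    )

tri = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
ok_far = satisfies_empty_circle(tri, [(2.0, 2.0)])      # point outside the circle
ok_near = satisfies_empty_circle(tri, [(0.25, 0.25)])   # point inside the circle
```

With floating-point input, points very close to the circle can be misclassified; robust implementations use exact or adaptive-precision arithmetic for this predicate.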

A New Robust Image Feature Point Detector

11th International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR) - Automatic Target Recognition and Navigation

Authors: Zhao, Yi. Affiliation: HuBei Presch Teacher Coll, Wuhan 430223, Peoples R China

ISBN: (print) 9781510636361

A scale space-variant filter (SVF) is proposed on the basis of the Harris operator; it smooths out noise efficiently while preserving the edge information of the image. A comparison of SVF with a Gaussian filter on a step signal and on an initial image input indicates that SVF performs better than the Gaussian filter. When SVF is used to detect the feature points of an image, the experiment shows that feature points detected from the SVF output contain more edge information. Using 2D space limitations, a Euclidean distance limitation, and an angle limitation, we can eliminate redundant feature points so that the useful feature points are distributed evenly across all regions of the image. From the results on noisy images, we conclude that the new robust feature point detector locates feature points more accurately and that the distribution of the points is more rational than that obtained without those limitations.

Keywords: Feature Point Detection; Computer vision; Noise Smooth; Space Limitation; Robustness

Residential Real Estate Image Classification for Property Valuation

28th International Conference on Image Processing, Computer Vision, and Pattern Recognition, IPCV 2024, and 23rd International Conference on Information and Knowledge Engineering, IKE 2024, held as part of the World Congress in Computer Science, Computer Engineering and Applied Computing, CSCE 2024

Authors: Nejad, Mehrdad Ziaee; Naderpour, Mohsen; Behbood, Vahid; Ramezani, Fahimeh; Lu, Jie. Affiliation: 15 Broadway, Ultimo, NSW 2007, Australia

ISBN: (print) 9783031859328

Residential real estate prices are one of the key components of economic development and have also been a major concern of the public, the banking industry, government, and investors. Accurate estimation of the sale price and its changes plays an important role in the decision-making of related departments and organizations. In Australia, one of the biggest investments for people is in residential real estate. Therefore, many studies have been carried out to build an automated valuation model that predicts the sale prices of residential properties as accurately as possible. Automatic and accurate image classification of residential real estate plays an important role in property valuation and in the decision-making of both sellers and buyers. It can be used on real estate websites to organize the images for each property, or as a component in a visual decision support system for predicting property sale prices based on property images. As convolutional image classification models show valuable performance in comparison with traditional models, a convolutional classification model is developed in this paper, creating a highly reliable classification component to be used in the corresponding research areas. The performance of the proposed model is investigated through a real dataset from New South Wales, Australia. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Convolutional neural networks

Generating Cartoon Images from Face Photos with Cycle-Consistent Adversarial Networks

Computers, Materials & Continua, 2021, Vol. 69, No. 11, pp. 2733-2747

Authors: Tao Zhang; Zhanjie Zhang; Wenjing Jia; Xiangjian He; Jie Yang. Affiliations: School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214000, China; Key Laboratory of Artificial Intelligence, Jiangsu 214000, China; The Global Big Data Technologies Centre, University of Technology Sydney, Ultimo, NSW 2007, Australia; The Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 201100, China

The generative adversarial network (GAN) was first proposed in 2014. This kind of network model is a machine learning system that can learn to measure a given distribution of data; one of its most important applications is style *** transfer is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output ***-GAN is a classic GAN model, which has a wide range of scenarios in style *** its unsupervised learning characteristics, the mapping between an input image and an output image is easy to learn ***, it is difficult for CYCLE-GAN to converge and generate high-quality *** order to solve this problem, spectral normalization is introduced into each convolutional kernel of the *** convolutional kernel satisfies the Lipschitz stability constraint once spectral normalization is added, and the value of the convolutional kernel is limited to [0, 1], which promotes the training process of the proposed ***, we use a pretrained model (VGG16) to control the loss of image content in the position of l1 *** avoid overfitting, an l1 regularization term and an l2 regularization term are both used in the object loss *** terms of Frechet Inception Distance (FID) score evaluation, our proposed model achieves outstanding performance and preserves more discriminative *** results show that the proposed model converges faster and achieves better FID scores than the state of the art.

Keywords: Generative adversarial network; spectral normalization; Lipschitz stability constraint; VGG16; l1 regularization term; l2 regularization term; Frechet inception distance
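
Spectral normalization, as mentioned in the abstract above, is commonly implemented by estimating the largest singular value of a weight matrix with power iteration and dividing the weights by it, so that the rescaled matrix has spectral norm 1. The sketch below shows this for a plain 2D matrix in pure Python; it is a generic illustration under that assumption, not the paper's training code, and the function names are hypothetical. A convolutional kernel would first be reshaped into a matrix.

```python
import math

def matvec(m, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    return [list(col) for col in zip(*m)]

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def spectral_norm(w, iters=50):
    """Estimate the largest singular value of w by power iteration.

    Starts from an all-ones vector for determinism; practical
    implementations start from a random vector, which avoids the rare
    case where the start is orthogonal to the top singular direction.
    """
    wt = transpose(w)
    v = normalize([1.0] * len(w[0]))
    for _ in range(iters):
        u = normalize(matvec(w, v))
        v = normalize(matvec(wt, u))
    return sum(a * b for a, b in zip(u, matvec(w, v)))

def spectrally_normalize(w, iters=50):
    """Return (w / sigma, sigma), where sigma is the estimated spectral norm."""
    sigma = spectral_norm(w, iters)
    return [[x / sigma for x in row] for row in w], sigma

w = [[3.0, 0.0], [0.0, 1.0]]
w_sn, sigma = spectrally_normalize(w)
```

After rescaling, the linear map defined by `w_sn` cannot stretch any input vector by more than a factor of 1, which is the Lipschitz constraint the technique is meant to enforce.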

Tiny Object Detection using Multi-feature Fusion

11th International Symposium on Multispectral Image Processing and Pattern Recognition (MIPPR) - Automatic Target Recognition and Navigation

Authors: Yang, Peng; Zhao, Yuejin; Liu, Ming; Dong, Liquan; Liu, Xiaohua; Hui, Mei. Affiliation: Beijing Inst Technol, Sch Opt & Photon, Beijing Key Lab Precis Photoelect Measuring Instr, Beijing 100081, Peoples R China

ISBN: (print) 9781510636361

Vehicle identification is widely used in route planning, safety supervision, and military reconnaissance, and is one of the research hotspots of space-based remote sensing applications. Traditional hand-crafted features such as HOG, Gabor features, and the Hough transform are not suitable for modern city satellite data analysis. With the rapid development of CNNs, object detection has made remarkable progress in accuracy and speed. However, in satellite map analysis, many targets are small and dense, so detection accuracy is often half that of large targets, or even lower. Small targets have lower resolution, blurred appearance, and very little information, and after multi-layer convolution it is difficult to extract effective information from them. In the satellite map data set we produced, the target vehicles are not only small but also very dense, and it is impossible to achieve high detection accuracy when training YOLO directly. To solve this problem, we propose a multi-feature fusion target detection method, which combines the satellite image and the electronic image to fuse the target vehicle with the surrounding semantic information. We conducted a comparative experiment to demonstrate the applicability of multi-feature fusion methods in different detection models such as YOLO and R-CNN. Compared with the traditional target detection model, the proposed method achieves higher detection accuracy.

Keywords: Computer vision; Fully Convolutional Networks; Satellite Imagery; Object Detection; Multiple Features
