ISBN (print): 9781509006410
The proceedings contain 189 papers. The topics discussed include: optimal feature learning and discriminative framework for polarimetric thermal to visible face recognition; discovery of facial motions using deep machine perception; customized expression recognition for performance-driven cutout character animation; going deeper in facial expression recognition using deep neural networks; discriminative FaceTopics for face recognition via latent Dirichlet allocation; can we still avoid automatic face detection?; OpenFace: an open source facial behavior analysis toolkit; correlation filter cascade for facial landmark localization; face recognition using deep multi-pose representations; effect of illicit drug abuse on face recognition; unconstrained face verification using deep CNN features; frontal to profile face verification in the wild; and capturing facial videos with Kinect 2.0: a multithreaded open source tool and database.
ISBN (print): 9781479966820
The proceedings contain 12 papers. The topics discussed include: activity recognition applications from contextual video-text fusion; multi-objective detector and tracker parameter optimization via NSGA-II; depth map generation for aerial video in natural scenery; context exploitation in intelligence, surveillance, and reconnaissance for detection algorithms; object detection in low resolution overhead imagery; 3D urban reconstruction from wide area aerial surveillance video; monitoring giraffe behavior in thermal video; evolutionary computational methods for optimizing the classification of sea stars in underwater images; dolphin detection and tracking; automated detection of rockfish in unconstrained underwater videos using Haar cascades; and discovery of sets of mutually orthogonal vanishing points in videos.
ISBN (print): 9781479966820
The proceedings contain 155 papers. The topics discussed include: visual recognition to access and analyze people density and flow patterns in indoor environments; online visual tracking using temporally coherent part cluster; real time multi-vehicle tracking and counting at intersections from a fisheye camera; adaptive local movement modeling for object tracking; Bayesian multi-object tracking using motion context from multiple objects; multi-person tracking based on body parts and online random ferns learning of thermal images; generalized sum of Gaussians for real-time human pose tracking from a single depth sensor; qualitative tracking performance evaluation without ground-truth; enhancing linear programming with motion modeling for multi-target tracking; and part-based tracking via salient collaborating features.
ISBN (print): 9798350318920; 9798350318937
Typical text recognition methods rely on an encoder-decoder structure, in which the encoder extracts features from an image, and the decoder produces recognized text from these features. In this study, we propose a simpler and more effective method for text recognition, known as the Decoder-only Transformer for Optical Character Recognition (DTrOCR). This method uses a decoder-only Transformer to take advantage of a generative language model that is pre-trained on a large corpus. We examined whether a generative language model that has been successful in natural language processing can also be effective for text recognition in computer vision. Our experiments demonstrated that DTrOCR outperforms current state-of-the-art methods by a large margin in the recognition of printed, handwritten, and scene text in both English and Chinese.
ISBN (print): 9798350370287; 9798350370713
This study presents a design plan for improving the traditional Bayesian optimization algorithm by incorporating a Hidden Markov Chain and human preference, so that the algorithm avoids becoming trapped in local optima. The paper also introduces a novel model as an example case to help explain the new logic. It takes a creative computing approach to enrich the classical pure measurements (the CIELAB colour standard) with visual intensity parameters. The new optical intensity colour model serves the chip carrier, a high-speed vision-task photonic chip design published in Nature on 25 Oct 2023 [1]. The resulting model structure is expected to apply to photon-based computer chips from the perspective of vision intensity optimization, for example in future optically based virtual reality human-computer interaction applications.
ISBN (print): 9798350318920; 9798350318937
Skiing is a popular winter sport discipline with a long history of competitive events. In this domain, computer vision has the potential to enhance the understanding of athletes' performance, but its application lags behind other sports due to limited studies and datasets. This paper takes a step forward in filling these gaps. A thorough investigation is performed on the task of tracking a skier in a video that captures his/her complete performance. Obtaining continuous and accurate skier localization is a prerequisite for further higher-level performance analyses. To enable the study, the largest and most annotated dataset for computer vision in skiing, SkiTB, is introduced. Several visual object tracking algorithms, including both established methodologies and a newly introduced skier-optimized baseline algorithm, are tested using the dataset. The results provide valuable insights into the applicability of different tracking methods for vision-based skiing analysis. SkiTB, code, and results are available at https://***/datasets/skitb.
ISBN (print): 9798350318920; 9798350318937
An emerging class of Fizeau optical telescopes has the potential to upend prior cost scaling models, substantially improving the angular resolution and contrast attainable by ground-based astronomical instruments. However, this design introduces a challenging visual control problem that must be solved to compensate for wavefront aberrations induced by the flexible substructure it employs. We subvert this problem with a deep optics approach to policy design and image recovery that exploits, rather than corrects, aberrations to obtain domain-specific object recovery performance exceeding that of more costly filled aperture designs.
ISBN (print): 9798350318920; 9798350318937
Controlling illumination can generate high-quality information about object surface normals and depth discontinuities at a low computational cost. In this work we demonstrate a robot workspace-scaled controlled illumination approach that generates high-quality information for tabletop-scale objects for robotic manipulation. With our low angle of incidence directional illumination approach, we can precisely capture surface normals and depth discontinuities of monochromatic Lambertian objects. We show that this approach to shape estimation 1) is valuable for general-purpose grasping with a single-point vacuum gripper, 2) can measure the deformation of known objects, and 3) can estimate the pose of known objects and track unknown objects in the robot's workspace.
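As a rough illustration of how controlled directional illumination can yield surface normals for Lambertian objects, the sketch below uses classic photometric stereo. This is a standard textbook technique swapped in for illustration and is not necessarily the method used in this paper. The function name `estimate_normals` and its argument layout are assumptions.

```python
import numpy as np

def estimate_normals(light_dirs, intensities):
    """Classic Lambertian photometric stereo (illustrative, not the paper's method).

    light_dirs:  (K, 3) unit light directions, K >= 3 and not all coplanar.
    intensities: (K, P) observed brightness of P pixels under each light.
    Returns (normals with shape (P, 3), albedo with shape (P,)).
    """
    # Lambertian model: I = L @ (albedo * n). Solve for g = albedo * n per pixel
    # in the least-squares sense.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, P)
    albedo = np.linalg.norm(g, axis=0)
    # Guard against division by zero for completely dark pixels.
    normals = (g / np.maximum(albedo, 1e-12)).T
    return normals, albedo
```

The model ignores shadows and specular highlights, so it only holds for pixels that every light actually reaches.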
ISBN (print): 9798350318920; 9798350318937
This paper introduces SphereCraft, a dataset specifically designed for spherical keypoint detection, matching, and camera pose estimation. The dataset addresses the limitations of existing datasets by providing extracted keypoints from various detectors, along with their ground truth correspondences. Synthetic scenes with photo-realistic rendering and accurate 3D meshes are included, as well as real-world scenes acquired from different spherical cameras. SphereCraft enables the development and evaluation of algorithms targeting multiple camera viewpoints, advancing the state-of-the-art in computer vision tasks involving spherical images. Our dataset is available at https://***/spherecraftweb/.
ISBN (print): 9798350370287; 9798350370713
Contrastive learning, a dominant self-supervised technique, emphasizes similarity in representations between augmentations of the same input and dissimilarity for different ones. Although low contrastive loss often correlates with high classification accuracy, recent studies challenge this direct relationship, spotlighting the crucial role of inductive biases. We delve into these biases from a clustering viewpoint, noting that contrastive learning creates locally dense clusters, contrasting the globally dense clusters from supervised learning. To capture this discrepancy, we introduce the "RLD (Relative Local Density)" metric. While this cluster property can hinder linear classification accuracy, leveraging a Graph Convolutional Network (GCN) based classifier mitigates this, boosting accuracy and reducing parameter requirements. The code is available here.
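The contrastive objective described above can be sketched as an InfoNCE-style loss. This is a generic formulation assumed for illustration. It is not the loss used in the paper and it does not compute the RLD metric. The function name `info_nce_loss` and the default temperature are hypothetical.

```python
import numpy as np

def info_nce_loss(z_a, z_b, temperature=0.5):
    """InfoNCE-style contrastive loss (generic sketch, not the paper's code).

    z_a, z_b: (N, D) embeddings; row i of each is a different augmentation
    of the same input, and all other rows act as negatives.
    """
    # L2-normalise so the dot product is cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                    # (N, N) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Positives sit on the diagonal: pull them together, push the rest apart.
    return float(-np.mean(np.diag(log_prob)))
```

The loss is low when each embedding is closer to its own augmentation than to any other sample in the batch, which is the similarity/dissimilarity trade-off the abstract refers to.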