ISBN (print): 0818658258
This paper presents a Radon transform-based approach to the detection of linear features in images characterized by high noise levels. The approach is based on the localized Radon transform, in which the intensity integration is performed over short line segments rather than across the entire image. The algorithm, referred to as the feature space line detector (FSLD) algorithm, is tested on synthetic images of linear features with very high noise levels. The results demonstrate the algorithm's robustness in the presence of noise, as well as its ability to detect and localize linear features that are significantly shorter than the image dimensions or that display some curvature.
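The idea of integrating over short segments can be illustrated with a minimal sketch. This is not the FSLD algorithm itself, whose details the abstract does not give; the function name `localized_radon`, the segment half-length, the number of angles, and the nearest-pixel sampling are all assumptions made for illustration.

```python
import numpy as np

def localized_radon(image, half_len=5, n_angles=8):
    """Sum intensity along short segments centered at every pixel.

    Returns an array of shape (n_angles, H, W). Entry [k, y, x] is the sum of
    2 * half_len + 1 nearest-pixel samples along a segment through (y, x)
    at angle k * pi / n_angles. Pixels outside the image count as zero.
    """
    h, w = image.shape
    padded = np.pad(image.astype(float), half_len, mode="constant")
    out = np.zeros((n_angles, h, w))
    for k in range(n_angles):
        theta = k * np.pi / n_angles
        dy, dx = np.sin(theta), np.cos(theta)
        for s in range(-half_len, half_len + 1):
            # Integer offset of the s-th sample along the segment.
            oy = int(round(s * dy))
            ox = int(round(s * dx))
            out[k] += padded[half_len + oy: half_len + oy + h,
                             half_len + ox: half_len + ox + w]
    return out
```

A strong response at `[k, y, x]` suggests a short linear feature through `(y, x)` at the k-th angle. Because each sum covers only a short segment, a feature much shorter than the image still dominates its own sum instead of being averaged against a full-width line of noise.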
ISBN (print): 9781424439942
To address the challenges of non-cooperative, large-distance human signature detection, we present a novel multimodal remote audio/video acquisition system. The system mainly consists of a laser Doppler vibrometer (LDV) and a pan-tilt-zoom (PTZ) camera. The LDV is a unique remote hearing sensor that uses the principle of laser interferometry. However, it needs an appropriate surface that is modulated by the speech of a human subject and reflects the laser beam back to the LDV receiver. Manually steering the laser beam onto a target is very difficult at a distance of more than 20 meters. Therefore, the PTZ camera is used to capture video of the human subject, track the subject when he/she moves, and analyze the image to find a good reflection surface for LDV measurements in real time. Experiments show that the integration of these two sensory components is well suited to multimodal human signature detection at a large distance.
ISBN (print): 0818658258
The problem of object recognition is addressed. In the literature this task has generally been considered from a 'passive' perspective, where everything is static and there is no definite relation between the object and its environment. We propose an 'active' approach to object recognition, based on the capability of the observer to move and obtain a better description of the object under consideration, and also to take advantage of the relations between the objects and the environment. This can be accomplished at the task level and at the sensor level. The face recognition problem, based on the face-space approach, is considered to demonstrate the advantage of adopting an active retina to sample the face, build a database and perform the recognition task. By using an active space-variant retina, the size of the database is considerably reduced, and consequently so is the processing time for recognition. A comparative experiment using the active and static approaches is presented.
ISBN (print): 9798350302493
Image anonymization is widely adopted in practice to comply with privacy regulations in many regions. However, anonymization often degrades the quality of the data, reducing its utility for computer vision development. In this paper, we investigate the impact of image anonymization for training computer vision models on key computer vision tasks (detection, instance segmentation, and pose estimation). Specifically, we benchmark the recognition drop on common detection datasets, where we evaluate both traditional and realistic anonymization for faces and full bodies. Our comprehensive experiments show that traditional image anonymization substantially impacts final model performance, particularly when anonymizing the full body. Furthermore, we find that realistic anonymization can mitigate this decrease in performance, with our experiments showing a minimal performance drop for face anonymization. Our study demonstrates that realistic anonymization can enable privacy-preserving computer vision development with minimal performance degradation across a range of important computer vision benchmarks.
ISBN (print): 9781665448994
Climate change is a pressing issue that is currently affecting and will affect every part of our lives. It is becoming vital that we, as a society, address the climate crisis as a universal effort, including those of us in the computer vision (CV) community. In this work, we analyze the total cost of CO2 emissions by breaking it into (1) the architecture creation cost and (2) the life-time evaluation cost. We show that over time, these costs are non-negligible and are having a direct impact on our future. Importantly, we conduct an ethical analysis of how the CV community is unintentionally overlooking its own ethical AI principles by emitting this level of CO2. To address these concerns, we propose adding "enforcement" as a pillar of ethical AI and provide some recommendations for how architecture designers and the broader CV community can curb the climate crisis.
ISBN (print): 9781665448994
This paper presents a method for automatic sign language recognition that was utilized in the CVPR 2021 ChaLearn Challenge (RGB track). Our method is composed of several approaches combined in an ensemble scheme to perform isolated sign-gesture recognition. We combine modalities of video sample frames processed by a 3D ConvNet (I3D), with body-pose information in the form of joint locations processed by a Transformer, hand region images transformed into a semantic space, and linguistically defined locations of hands. Although the individual models perform sub-par (60% to 93% accuracy on validation data), the weighted ensemble results in 95.46% accuracy.
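A weighted ensemble of this kind can be sketched as a weighted average of per-model class probabilities. The abstract does not say how the models' outputs are combined or how the weights are chosen, so the averaging rule, the function name `weighted_ensemble`, and the array layout below are assumptions for illustration only.

```python
import numpy as np

def weighted_ensemble(probs, weights):
    """Combine per-model class probabilities with a weighted average.

    probs:   array-like of shape (n_models, n_samples, n_classes)
    weights: array-like of shape (n_models,), non-negative, not all zero
    Returns (predicted class per sample, combined probabilities).
    """
    probs = np.asarray(probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    # Contract the model axis: result has shape (n_samples, n_classes).
    combined = np.tensordot(weights, probs, axes=1)
    return combined.argmax(axis=1), combined
```

In practice the weights would typically be tuned on validation data, so that stronger models (for example, the one near 93% accuracy) contribute more than weaker ones while the weaker ones still correct some of its errors.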
ISBN (print): 9781665448994
Existing computer vision research on artwork struggles with the recognition of fine-grained artwork attributes and with a lack of curated annotated datasets, owing to their costly creation. In this work, we use CLIP (Contrastive Language-Image Pre-Training) [12] to train a neural network on a variety of art image and text pairs, so that it can learn directly from raw descriptions of images or, if available, curated labels. The model's zero-shot capability allows predicting the most relevant natural language description for a given image, without directly optimizing for the task. Our approach aims to solve two challenges: instance retrieval and fine-grained artwork attribute recognition. We use the iMet Dataset [20], which we consider the largest annotated artwork dataset. Our code and models will be available at https://***/KeremTurgutlu/clip_art
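Zero-shot prediction in a CLIP-style model amounts to picking the text whose embedding is most similar to the image embedding. The sketch below shows only that matching step on precomputed embeddings; it does not use the authors' code or any CLIP library, and the function name `zero_shot_predict` and its inputs are hypothetical.

```python
import numpy as np

def zero_shot_predict(image_embs, text_embs):
    """Match each image to its most similar text by cosine similarity.

    image_embs: array-like of shape (n_images, d), from an image encoder
    text_embs:  array-like of shape (n_texts, d), from a text encoder
    Returns (index of best text per image, similarity matrix).
    """
    image_embs = np.asarray(image_embs, dtype=float)
    text_embs = np.asarray(text_embs, dtype=float)
    # L2-normalize so the dot product equals cosine similarity.
    image_embs = image_embs / np.linalg.norm(image_embs, axis=-1, keepdims=True)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=-1, keepdims=True)
    sims = image_embs @ text_embs.T  # shape (n_images, n_texts)
    return sims.argmax(axis=1), sims
```

The same similarity matrix serves both tasks named in the abstract: ranking candidate attribute descriptions for one image, or ranking images against a query for instance retrieval.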
ISBN (print): 0818684976
We have designed and implemented a system for real-time detection of 2-D features on a reconfigurable computer based on Field Programmable Gate Arrays (FPGAs). We envision this device as the front-end of a system able to track image features in real-time control applications like autonomous vehicle navigation. The algorithm employed to select good features is inspired by Tomasi and Kanade's method. Compared to the original method, the algorithm that we have devised does not require any floating point or transcendental operations, and can be implemented either in hardware or in software. Moreover, it maps efficiently into a highly pipelined architecture, well suited to implementation in FPGA technology. We have implemented the algorithm on a low-cost reconfigurable computer and have observed reliable operation on an image stream generated by a standard NTSC video camera at 30 Hz.
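One way to avoid floating point and square roots in a Tomasi-Kanade-style test is sketched below. The abstract does not describe the authors' actual formulation, so this is an assumption: instead of computing the smaller eigenvalue of the 2x2 gradient matrix [[a, b], [b, c]] and comparing it with a threshold t, the sketch checks that the matrix minus t times the identity is positive definite, which needs only integer additions, multiplications, and comparisons. The names `box_sum` and `good_features_int` are hypothetical.

```python
import numpy as np

def box_sum(a, r):
    """Sum of a over a (2r+1) x (2r+1) window, using an integral image."""
    p = np.pad(a, r, mode="constant")
    ii = np.zeros((p.shape[0] + 1, p.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = p.cumsum(0).cumsum(1)
    k = 2 * r + 1
    return ii[k:, k:] - ii[:-k, k:] - ii[k:, :-k] + ii[:-k, :-k]

def good_features_int(image, threshold, r=1):
    """Boolean mask of pixels whose smaller eigenvalue exceeds threshold.

    Uses integer arithmetic only. threshold must be an integer.
    """
    img = image.astype(np.int64)
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]  # central difference in x
    gy[1:-1, :] = img[2:, :] - img[:-2, :]  # central difference in y
    a = box_sum(gx * gx, r)
    b = box_sum(gx * gy, r)
    c = box_sum(gy * gy, r)
    # min eigenvalue > t  <=>  a - t > 0, c - t > 0, (a - t)(c - t) - b^2 > 0
    at, ct = a - threshold, c - threshold
    return (at > 0) & (ct > 0) & (at * ct - b * b > 0)
```

Every step here is a fixed sequence of adds, multiplies, and compares on a sliding window, which is the kind of computation that maps naturally onto a pipelined hardware datapath.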
ISBN (print): 9781665448994
When creating a new labeled dataset, human analysts or data reductionists must review and annotate large numbers of images. This process is time consuming and a barrier to the deployment of new computer vision solutions, particularly for rarely occurring objects. To reduce the number of images requiring human attention, we evaluate the utility of images created from 3D models refined with a generative adversarial network to select confidence thresholds that significantly reduce false alarm rates. The resulting approach has been demonstrated to cut the number of images needing to be reviewed by 50% while preserving a 95% recall rate, with only 6 labeled examples of the target.
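The threshold selection step can be sketched as follows, assuming a detector has already scored a set of known-positive images (for example, the GAN-refined renders) and the unlabeled pool. The abstract does not specify the selection rule, so the function names `threshold_for_recall` and `review_fraction` and the rule itself are illustrative assumptions.

```python
import numpy as np

def threshold_for_recall(positive_scores, target_recall=0.95):
    """Largest threshold that keeps at least target_recall of the positives.

    positive_scores: detector confidences on images known to contain the target.
    """
    s = np.sort(np.asarray(positive_scores, dtype=float))[::-1]  # descending
    k = int(np.ceil(target_recall * len(s)))  # positives that must be kept
    k = max(k, 1)
    return s[k - 1]

def review_fraction(all_scores, threshold):
    """Fraction of the unlabeled pool a human would still need to review."""
    return float(np.mean(np.asarray(all_scores, dtype=float) >= threshold))
```

Images scoring below the chosen threshold are skipped, so `review_fraction` gives the share of the pool still sent to an analyst. How well the recall estimate transfers to real images depends on how closely the synthetic positives match the real target.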
This paper describes an Active Character Recognition methodology, henceforth referred to as ACR. We present a method that uses an active heuristic function, similar to the one used by the A* search algorithm, that adaptively determines the length of the feature vector as well as the features themselves used to classify an input pattern. ACR adapts to factors such as the quality of the input pattern, its intrinsic similarities to and differences from patterns of the other classes it is being compared against, and the processing time available. Furthermore, finer resolution is accorded only to certain "zones" of the input pattern which are deemed important given the classes that are being discriminated. Experimental results support the methodology presented. The recognition rate of ACR is about 96% on the NIST data sets, and the speed is better than that of traditional classification methods.