Robotic grippers are receiving increasing attention in various industries as essential components of robots for interacting with and manipulating objects. While significant progress has been made in the past, conventional ...
Since its inception, the CUDA programming model has been continuously evolving. Because the CUDA toolkit aims to consistently expose cutting-edge capabilities for general-purpose compute jobs to its users, the added f...
ISBN (digital): 9781665464543
ISBN (print): 9781665464550
Machine vision systems play a pivotal role in streamlining manufacturing processes, notably in quality control through automatic in-line visual inspections. A common practice for inspecting parts, components, and final products is to use a master part benchmark for quality comparison. However, challenges arise when objects enter inspection points in unintended orientations. This misalignment potentially leads to erroneous decisions by automated systems, resulting in additional checkpoints or wastage affecting the production rate. To tackle this issue, we propose a visual inspection pipeline that leverages recent machine learning-based approaches to compare the inspection target and a master part virtually oriented to the same perspective. Specifically, we suggest combining 3D Gaussian Splatting and DUSt3R as a practical solution. Our approach demonstrates its efficacy in real-world scenarios through testing on three mock parts and a real industrial component.
With the continuous advancement of generative models, face morphing attacks have become a significant challenge for existing face verification systems due to their potential use in identity fraud and other malicious a...
Magnetic resonance imaging (MRI) is a potent diagnostic tool, but suffers from long examination times. To accelerate the process, modern MRI machines typically utilize multiple coils that acquire sub-sampled data in p...
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
In recent years, efficient super-resolution research has focused on reducing model complexity and improving efficiency by leveraging deep small-kernel convolution, but small kernels yield a small receptive field, which limits the network’s ability to reconstruct details. Large kernel convolution can provide a large receptive field and substantially enhance the quality of image reconstruction, but its computational cost is too high. To minimize the model’s parameter count and achieve efficient super-resolution reconstruction, this study introduces a symmetric visual attention network. The network decomposes the large kernel convolution into three different lightweight and efficient convolutions. It then forms a bottleneck structure by combining the varied receptive field sizes of these convolutions. The attention mechanism is integrated to create a bottleneck attention module, enhancing the network’s feature awareness. Furthermore, the bottleneck attention modules are symmetrically arranged to construct a symmetric large kernel attention block, further enhancing the network’s capability to extract deep features. The experimental results demonstrate that the proposed model achieves competitive quantitative metrics when compared to other lightweight super-resolution methods, and the details of the reconstructed images are enhanced. With only 183K parameters, the model is lightweight yet delivers high-quality reconstruction, offering a novel approach to efficient super-resolution.
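To illustrate why decomposing a large kernel saves parameters, here is a minimal sketch. The abstract does not say which three convolutions are used, so this assumes the commonly used split into a depthwise convolution, a dilated depthwise convolution, and a pointwise (1x1) convolution. The channel width (64), kernel sizes (5 and 7), and dilation (3) are illustrative values, not figures from the paper, and the function names are hypothetical.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a 2D convolution with a k x k kernel (bias ignored)."""
    return (c_in // groups) * c_out * k * k


def decomposed_params(channels, k_local, k_dilated):
    """Weights of an assumed three-part decomposition:
    depthwise k_local x k_local, dilated depthwise k_dilated x k_dilated,
    and a pointwise 1x1 convolution that mixes channels."""
    depthwise = conv_params(channels, channels, k_local, groups=channels)
    dilated = conv_params(channels, channels, k_dilated, groups=channels)
    pointwise = conv_params(channels, channels, 1)
    return depthwise + dilated + pointwise


def receptive_field(k_local, k_dilated, dilation):
    """Receptive field of the two stacked stride-1 spatial convolutions."""
    return 1 + (k_local - 1) + (k_dilated - 1) * dilation


channels = 64                                  # illustrative width
rf = receptive_field(5, 7, 3)                  # spatial extent covered by the stack
dense = conv_params(channels, channels, rf)    # one dense conv with the same extent
light = decomposed_params(channels, 5, 7)

print(f"receptive field: {rf} x {rf}")
print(f"dense conv weights:      {dense}")
print(f"decomposed conv weights: {light}")
```

Under these assumed settings the decomposed stack covers the same spatial extent as the dense convolution with a small fraction of its weights, which is the kind of trade-off the abstract describes.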
HD Maps are an essential part of the autonomous driving stack, supporting localization and route planning on the road. Therefore, HD Map Validation is crucial to ensure the HD Maps represent the road information as accurately as possible. While heuristics have been developed to validate HD Maps with reasonable accuracy, they are unable to solve the challenging path-to-traffic-light validation problem of HD Map junctions. Based on the success of Graph Neural Network approaches on HD Map problems, such as VectorNet, we propose P2LNet. The P2LNet architecture consists of a fully connected subgraph followed by a Graph Encoder-Decoder Architecture that predicts the correct associations between paths and traffic lights in the junction. We trained and evaluated P2LNet on our in-house HD Map junction dataset, where it achieved 94% accuracy in predicting the correct association labels on a test set. While the results could be further improved by using edge features, P2LNet achieves high accuracy in identifying incorrect light associations. It also shows how GNN-based approaches can be used to solve other significant HD Map and validation issues.
With the increasing demand for sample-efficient and robust reinforcement learning agents, particularly in intricate domains like robotics, healthcare, and gaming, there is a strong need to minimize the computational overhead caused by the interactions between real and virtual agents. This necessitates highly accurate models to simulate virtual agents and limit the number of such interactions. To this end, model-based reinforcement learning (MBRL) has proven very effective in formulating an environment with superior decision-making and higher learning efficiency. A known approach in MBRL is World Models, which uses a Variational Autoencoder (VAE) as its generative engine. The VAE uses a relatively simple architecture with limited capacity for complex image inputs, so its image reconstruction error is high; recent research on VAEs has likewise reported poor reconstruction quality. This paper proposes a Nouveau VAE (NVAE)-based World Models to address these limitations. NVAE, which uses deep convolutions in its architecture, serves as the visual sensory component of the World Models and encodes the environment dynamics into a latent representation. We show that NVAE-based World Models perform exceptionally well in the dream environment of CarRacing-v2 (an OpenAI Gym environment), improving the agent's performance by 45%. We then demonstrate that the NVAE-based World Models can be applied to robotic simulation environments like panda-gym, where the agent achieved a 95% success rate in solving the reach task.
Keyless entry systems in cars are adopting neural networks for localizing their operators. Using test-time adversarial defences equips such systems with the ability to defend against adversarial attacks without prior tra...
COVID-19 illness and death have disproportionately impacted marginalized groups the world over. In the United States, Black and Indigenous people have endured the highest risk of death. Disabled and chronically ill people have continued to isolate as their peers “return to normal”, bearing sole liability for their own safety in a society that deems their lives not worth the “sacrifice” of public health measures. While public and institutional policy makers bear personal responsibility for “survival of the fittest” approaches to public health, data science and visualization have contributed to and legitimized many of these eugenic policy decisions through design tropes I characterize as ‘eugenic visuality’. In this paper, I explore how inadequacies and obscurities in COVID-19 data visualization have contributed to and sustained public narratives that devalue marginalized lives for the comfort of white-supremacist and capitalist social norms. While I focus on visualizations and statements provided by the CDC, the implications extend beyond any individual or institution to our collective preconceptions and values. Namely, unexamined biases and unquestioned norms are embedded in data science and visualization, constraining how data is represented and interpreted. These assumptions limit how data can be leveraged in the pursuit of just social policy. Therefore, I propose guiding principles for a Just Visuality in data science and representation, supported by the work of disabled activists and scholars of color.