In this work, we present a novel solution aimed at improving robotic manipulators' performance in contact tasks. Inspired by the human motor control system, which relies on a feedforward mechanism to anticipate an...
In this paper, we utilize hyperspheres and regular n-simplexes and propose an approach to learning deep features equivariant under the transformations of nD reflections and rotations, encompassed by the powerful group O(n). Namely, we propose O(n)-equivariant neurons with spherical decision surfaces that generalize to any dimension n, which we call Deep Equivariant Hyperspheres. We demonstrate how to combine them in a network that directly operates on the basis of the input points and propose an invariant operator based on the relation between two points and a sphere, which, as we show, turns out to be a Gram matrix. Using synthetic and real-world data in nD, we experimentally verify our theoretical contributions and find that our approach is superior to the competing methods on O(n)-equivariant benchmark datasets (classification and regression), demonstrating a favorable speed/performance trade-off. The code is available on GitHub. Copyright 2024 by the author(s).
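The abstract's remark that the invariant operator reduces to a Gram matrix rests on a standard fact: pairwise inner products do not change when every point is transformed by the same orthogonal matrix. The following is a minimal sketch of that fact only, not the authors' operator or network; the function names `gram_matrix` and `random_orthogonal`, the point count, and the dimension are assumptions chosen for illustration.

```python
import numpy as np


def gram_matrix(points: np.ndarray) -> np.ndarray:
    """Return the matrix of pairwise inner products for points stored as rows."""
    return points @ points.T


def random_orthogonal(n: int, rng: np.random.Generator) -> np.ndarray:
    """Sample an n x n orthogonal matrix from the QR factorization of a Gaussian matrix.

    The result may be a rotation or a reflection, so it is an element of O(n).
    """
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


rng = np.random.default_rng(0)
points = rng.standard_normal((5, 4))   # hypothetical input: 5 points in 4D
q = random_orthogonal(4, rng)          # hypothetical O(4) transformation
transformed = points @ q.T             # apply q to every point

# (X Q^T)(X Q^T)^T = X Q^T Q X^T = X X^T, so the Gram matrix is unchanged.
gram_invariant = np.allclose(gram_matrix(points), gram_matrix(transformed))
print("Gram matrix invariant under O(n):", gram_invariant)
```

Because the check uses a randomly drawn element of O(n), it covers reflections as well as rotations, which is the distinction between O(n) and SO(n).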
Authors: Shirzi, Moteaal Asadi; Kermani, Mehrdad R.
Western University, Advanced Robotics and Mechatronic Systems Laboratory, Electrical and Computer Engineering Department, London, ON N6A 5B9, Canada
Western University, Advanced Robotics and Mechatronic Systems Laboratory, The Department of Electrical and Computer Engineering, London, ON N6A 5B9, Canada
In this article, we propose a new algorithm to improve plant recognition through the use of feature descriptors. The accurate results from this identification method are essential for enabling autonomous tasks, such a...
Despite the effectiveness of vision-language supervised fine-tuning in enhancing the performance of vision large language models (VLLMs), existing visual instruction tuning datasets have the following limitations. (1) Instruction annotation quality: although existing VLLMs exhibit strong performance, instructions generated by those advanced VLLMs may still suffer from inaccuracies, such as hallucinations. (2) Instruction and image diversity: the limited range of instruction types and the lack of diversity in image data may impair the model's ability to generate diverse outputs that are close to real-world scenarios. To address these challenges, we construct a high-quality, diverse visual instruction tuning dataset, MMInstruct, which consists of 973k instructions from 24 domains. There are four instruction types: judgment, multiple-choice, long visual question answering, and short visual question answering. To construct MMInstruct, we propose an instruction generation data engine that leverages GPT-4V, GPT-3.5, and manual correction. Our instruction generation engine enables semi-automatic, low-cost, and multi-domain instruction generation at 1/6 the cost of manual construction. Through extensive validation and ablation experiments, we demonstrate that MMInstruct can significantly improve the performance of VLLMs; e.g., a model fine-tuned on MMInstruct achieves new state-of-the-art performance on 10 out of 12 benchmarks. The code and data will be available at https://***/yuecao0119/MMInstruct.
This article introduces a novel mechatronic system for coupling the stems of seedlings and plants to wooden stakes or ropes, a crucial process for supporting them during growth, transportation, and fruiting in plant p...
With increasingly challenging applications for quadrotors, higher requirements are emerging for tracking accuracy and safety. While high accuracy is a prerequisite for complex tasks, safety is ensured through toleranc...
According to the WHO's 2021 report, drowning is the third leading cause of unintentional death worldwide. The use of autonomous drones for drowning recognition can increase the survival rate and help lifeguards and...
Computer vision has proven itself capable of accurately detecting and classifying objects within images. This also works in cases where images are used as a way of representing data, without being actual photographs. ...
This paper presents a system design for a smart bike helmet with multiple safety features that are intended to empower bicycle riders to proactively avoid potential sources of danger or injury. A Smart Sensor/Actuator...
In today's advanced technological age, characterized by innovations like big data processing, cloud computing, and the Internet of Things (IoT), there is a rising utilization of medical multimedia data, especially medical images. These images, integral to the Internet of Healthcare Things (IoHT), necessitate secure transmission due to the increasing risks of unauthorized breaches and tampering. Current security methods, especially for cloud and mobile platforms, often struggle with challenges related to processing capacity, memory use, data size, and energy, making them ill-suited for extensive medical data or resource-limited environments. To address these challenges, this study introduces a novel hybrid cryptosystem, drawing on the unique qualities of the optical Arnold chaotic map, DNA (DeoxyriboNucleic Acid) sequences, and Mandelbrot keys, providing a fortified approach to the secure streaming of medical images. The proposed framework operates via a precise and structured procedure. It begins by applying the optical Arnold chaotic map cipher to each of the three color channels (R, G, and B) of a medical image. This is followed by overlaying DNA encoding sequences on the encrypted image produced by the earlier ciphering phase. Building on this groundwork, we incorporate an advanced Mandelbrot set-driven shift mechanism specifically designed to create complex confusion patterns within the R, G, and B segments of the encrypted medical imagery. The efficacy of the proposed cryptosystem is rigorously substantiated through an extensive array of simulations supported by a comprehensive security analysis. The results highlight its unparalleled resilience and security capabilities in the realm of medical image encryption, marking a significant leap over previous systems in the literature. Essentially, our work pioneers a solution to a pressing challenge in medical image security, ensuring enhanced protection of delicate health data among the rapidly evolving advanc
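The first stage described above, scrambling each color channel with an Arnold map, can be illustrated with the classical discrete Arnold cat map. This is a sketch of the textbook map only; it is not the optical Arnold chaotic map cipher used in the paper, and it omits the DNA encoding and Mandelbrot shift stages entirely. The function names, the 8 x 8 test image, and the iteration count are assumptions made for illustration.

```python
import numpy as np


def arnold_cat_map(channel: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Scramble a square 2D array with the classical Arnold cat map.

    Each pixel (x, y) moves to ((x + y) mod n, (x + 2y) mod n). The map has
    determinant 1, so it is a permutation of pixel positions.
    """
    n = channel.shape[0]
    if channel.ndim != 2 or channel.shape[1] != n:
        raise ValueError("channel must be a square 2D array")
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    new_x, new_y = (x + y) % n, (x + 2 * y) % n
    out = channel
    for _ in range(iterations):
        scrambled = np.empty_like(out)
        scrambled[new_x, new_y] = out[x, y]
        out = scrambled
    return out


def inverse_arnold_cat_map(channel: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Undo arnold_cat_map by reading each pixel back from its mapped position."""
    n = channel.shape[0]
    if channel.ndim != 2 or channel.shape[1] != n:
        raise ValueError("channel must be a square 2D array")
    x, y = np.meshgrid(np.arange(n), np.arange(n), indexing="ij")
    new_x, new_y = (x + y) % n, (x + 2 * y) % n
    out = channel
    for _ in range(iterations):
        out = out[new_x, new_y]  # restored[x, y] = out[new_x, new_y]
    return out


def scramble_rgb(image: np.ndarray, iterations: int) -> np.ndarray:
    """Apply the cat map independently to the R, G, and B channels."""
    return np.stack(
        [arnold_cat_map(image[..., c], iterations) for c in range(3)], axis=-1
    )


def unscramble_rgb(image: np.ndarray, iterations: int) -> np.ndarray:
    """Invert scramble_rgb channel by channel."""
    return np.stack(
        [inverse_arnold_cat_map(image[..., c], iterations) for c in range(3)], axis=-1
    )


rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)  # hypothetical test image
scrambled = scramble_rgb(image, iterations=3)
restored = unscramble_rgb(scrambled, iterations=3)
print("round trip recovers image:", np.array_equal(restored, image))
```

The cat map only permutes pixel positions and leaves the intensity histogram unchanged, which is one reason schemes of this kind add further value-altering stages on top of the scrambling step.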