Search results - Inner Mongolia University Library

21st International Symposium on Applied Reconfigurable Computing, ARC 2025

Authors: Danilowicz, Michal; Kryjak, Tomasz. Affiliation: Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Science and Technology, Krakow, Poland

ISBN: (print) 9783031879944

Multi-object tracking (MOT) is one of the most important problems in computer vision and a key component of any vision-based perception system used in advanced autonomous mobile robotics. Its implementation on low-power, real-time embedded platforms is therefore highly desirable. Modern MOT algorithms should be able to track objects of a given class (e.g. people or vehicles). In addition, the number of objects to be tracked is not known in advance, and objects may appear and disappear at any time, or be occluded. For these reasons, the most popular and successful approaches have recently been based on the tracking-by-detection paradigm. A high-quality object detector is therefore essential, and in practice it accounts for the vast majority of the computational and memory complexity of the whole MOT system. In this paper, we propose an FPGA (Field-Programmable Gate Array) implementation of an embedded MOT system based on a quantized YOLOv8 detector and the SORT (Simple Online and Realtime Tracking) tracker. We use a modified version of the FINN framework to utilize external memory for model parameters and to support operations required by YOLOv8. We discuss the evaluation of detection and tracking performance using the COCO and MOT15 datasets, where we achieve 0.21 mAP and 38.9 MOTA respectively. As the computational platform, we use an MPSoC system (Zynq UltraScale+ device from AMD/Xilinx) where the detector is deployed in reprogrammable logic and the tracking algorithm is implemented in the processor system. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: System-on-chip
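
The abstract above relies on matching detector boxes to existing tracks frame by frame. The following is a minimal sketch of that association step only, not the authors' implementation. It assumes boxes in `(x1, y1, x2, y2)` form and uses a simple greedy IoU matching in place of the Kalman-filter prediction and Hungarian assignment that SORT itself uses; the function names and the `iou_threshold` default are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match track boxes to detection boxes by descending IoU.

    Assumption: greedy matching stands in for the Hungarian algorithm.
    Returns (matches, unmatched_tracks, unmatched_detections), where
    matches is a list of (track_index, detection_index) pairs.
    """
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_threshold:
            break  # all remaining pairs overlap too little
        if ti in used_t or di in used_d:
            continue  # each track and detection is matched at most once
        matches.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    unmatched_t = [i for i in range(len(tracks)) if i not in used_t]
    unmatched_d = [i for i in range(len(detections)) if i not in used_d]
    return matches, unmatched_t, unmatched_d
```

In a tracker of this kind, unmatched detections would typically start new tracks and tracks left unmatched for several frames would be dropped, which is how objects appearing and disappearing are handled.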

Energy Efficient Hardware Acceleration of Neural Networks with Power-of-Two Quantisation

International Conference on Computer Vision and Graphics, ICCVG 2022

Authors: Przewlocka-Rus, Dominika; Kryjak, Tomasz. Affiliation: Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Science and Technology, Krakow, Poland

ISBN: (print) 9783031220241

Deep neural networks virtually dominate most modern vision systems, providing high performance at the cost of increased computational complexity. Since these systems are often required to operate both in real time and with minimal energy consumption (e.g., in wearable devices, autonomous vehicles, edge Internet of Things (IoT) devices, or sensor networks), various network optimisation techniques are used, e.g., quantisation, pruning, or dedicated lightweight architectures. Because the weights in neural network layers follow a logarithmic distribution, Power-of-Two (PoT) quantisation, whose levels are also logarithmically distributed, provides high performance at a significantly reduced computational precision (4-bit weights and less). This method also makes it possible to replace the Multiply and ACcumulate (MAC) units typical of neural networks, which perform, e.g., convolution operations, with more energy-efficient Bitshift and ACcumulate (BAC) units. In this paper, we show that a hardware neural network accelerator with PoT weights implemented on the Zynq UltraScale+ MPSoC ZCU104 SoC FPGA can be at least 1.4x more energy efficient than the uniform quantisation version. To further reduce the actual power requirement by omitting part of the computation for zero weights, we also propose a new pruning method adapted to logarithmic quantisation. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: System-on-chip
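
To illustrate why Power-of-Two weights allow a multiplier to be replaced by a shifter, here is a minimal sketch under stated assumptions. It is not the accelerator described in the abstract: the exponent range (`min_exp`, `max_exp`), the rounding rule (nearest exponent in the log domain), and the fixed-point layout with `frac_bits` fractional bits are all hypothetical choices made for the example.

```python
import math


def quantize_pot(w, min_exp=-7, max_exp=0):
    """Round a weight to a signed power of two, sign * 2**exp.

    Rounding is done in the log2 domain and the exponent is clamped to
    [min_exp, max_exp]. A zero weight is returned as sign 0.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exp = round(math.log2(abs(w)))
    exp = max(min_exp, min(max_exp, exp))
    return sign, exp


def bac_dot(activations, pot_weights, frac_bits=7):
    """Bitshift-and-accumulate dot product for integer activations.

    Each weight is (sign, exp) with -frac_bits <= exp <= 0. The accumulator
    is fixed-point with frac_bits fractional bits, so multiplying by 2**exp
    becomes a left shift by (exp + frac_bits). Zero weights are skipped.
    """
    acc = 0
    for x, (sign, exp) in zip(activations, pot_weights):
        if sign == 0:
            continue  # no work for a zero weight
        acc += sign * (x << (exp + frac_bits))
    return acc


# Usage: the result is scaled by 2**frac_bits.
weights = [quantize_pot(w) for w in (0.25, -0.5, 0.0)]
result = bac_dot([4, 8, 3], weights) / 2 ** 7
```

The skipped zero weights in `bac_dot` correspond to the kind of computation that pruning is meant to avoid.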

Traffic Sign Classification Using Deep and Quantum Neural Networks

International Conference on Computer Vision and Graphics, ICCVG 2022

Authors: Kuros, Sylwia; Kryjak, Tomasz. Affiliation: Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Science and Technology, Krakow, Poland

ISBN: (print) 9783031220241

Quantum Neural Networks (QNNs) are an emerging technology that can be used in many applications, including computer vision. In this paper, we present a traffic sign classification system implemented using a hybrid quantum-classical convolutional neural network. Experiments on the German Traffic Sign Recognition Benchmark dataset indicate that QNNs currently do not outperform classical DCNNs (Deep Convolutional Neural Networks), yet still provide an accuracy of over 90% and are a promising solution for advanced computer vision. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Traffic signs

A Review of Deep Learning Techniques for Glaucoma Detection

SN Computer Science, 2023, vol. 4, no. 3, p. 274

Authors: Guergueb, Takfarines; Akhloufi, Moulay A. Affiliation: Perception, Robotics and Intelligent Machines Research Group (PRIME), Computer Science Department, Université de Moncton, Moncton, NB, Canada

Glaucoma is one of the major reasons for visual impairment all across the globe. The recent advancements in machine learning techniques have greatly facilitated ophthalmologists in the early diagnosis of ocular diseases through the employment of automated systems. Several studies have been published lately to address the timely detection of glaucoma using deep learning approaches. A comprehensive review of the deep learning approaches employed for glaucoma detection using retinal fundus images is presented in this paper. The available retinal image datasets, image pre-processing techniques, state-of-the-art models, and performance evaluation metrics used in the recent studies are reviewed. This systematic review aims to provide critical insights and potential research directions to the ophthalmologists and researchers in this domain. © 2023, The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd.

Keywords: Deep learning; Eye diseases; Glaucoma; Image processing; Machine learning; Ophthalmology; Transfer learning

PointPillars Backbone Type Selection for Fast and Accurate LiDAR Object Detection

International Conference on Computer Vision and Graphics, ICCVG 2022

Authors: Lis, Konrad; Kryjak, Tomasz. Affiliation: Embedded Vision Systems Group, Computer Vision Laboratory, Department of Automatic Control and Robotics, AGH University of Science and Technology, Al. Mickiewicza 30, Krakow 30-059, Poland

ISBN: (print) 9783031220241

3D object detection from LiDAR sensor data is an important topic in the context of autonomous cars and drones. In this paper, we present the results of experiments on the impact of backbone selection of a deep convolutional neural network on detection accuracy and computation speed. We chose the PointPillars network, which is characterised by a simple architecture, high speed, and modularity that allows for easy expansion. During the experiments, we paid particular attention to the change in detection efficiency (measured by the mAP metric) and the total number of multiply-add operations needed to process one point cloud. We tested 10 different convolutional neural network architectures that are widely used in image-based detection problems. For a backbone like MobileNetV1, we obtained an almost 4x speedup at the cost of a 1.13% decrease in mAP. On the other hand, for CSPDarknet we obtained a speedup of more than 1.5x together with an increase in mAP of 0.33%. We have thus demonstrated that it is possible to significantly speed up a 3D object detector in LiDAR point clouds with a small decrease in detection efficiency. This result can be used when PointPillars or similar algorithms are implemented in embedded systems, including SoC FPGAs. The code is available at https://***/vision-agh/pointpillars-backbone. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.

Keywords: Object detection

EfficientNet-SAM: A Novel EffecientNet with Spatial Attention Mechanism for COVID-19 Detection in Pulmonary CT Scans

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Authors: Ramy Farag; Parth Upadhay; Jacket Demby’s; Yixiang Gao; Katherin Garces Montoya; Seyed Mohamad Ali Tousi; Gbenga Omotara; Guilherme DeSouza. Affiliation: Department of Electrical Engineering and Computer Science, Vision-Guided and Intelligent Robotics Lab (ViGIR Lab), University of Missouri-Columbia

ISBN: (digital) 9798350365474

ISBN: (print) 9798350365481

Manual analysis and diagnosis of COVID-19 through the examination of Computed Tomography (CT) images of the lungs can be time-consuming and error-prone, especially given the high volume of patients and the numerous images per patient. We therefore address the need to automate this task by developing a new deep learning-based pipeline. Our motivation was sparked by the CVPR Workshop on "Domain Adaptation, Explainability and Fairness in AI for Medical Image Analysis", and more specifically by the "COVID-19 Diagnosis Competition (DEF-AI-MIA COV19D)" held under the same Workshop. This challenge provides an opportunity to assess our proposed pipeline for COVID-19 detection from CT scan images. The pipeline incorporates one of the architectures in the EfficientNet "family", but with an added Spatial Attention Mechanism: EfficientNet-SAM. Also, unlike past pipelines, which relied on a preprocessing step, our pipeline takes the raw selected input images without any such step, except for an image-selection step that simply reduces the number of CT images required for training and/or testing. Moreover, our pipeline is computationally efficient: for example, it does not incorporate a decoder for segmenting the lungs, and it neither combines different models nor combines an RNN with a backbone, as other pipelines did in the past. Nevertheless, our pipeline outperformed all approaches presented by other teams in last year’s instance of the same challenge using the validation subset. It also placed 5th in this year’s competition, ranking less than 1.3% below the 1st place and close to 3.5% above the 6th place based on the macro-F1 score.

Keywords: COVID-19; Training; Visualization; Attention mechanisms; Computed tomography; Conferences; Pipelines
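
The ranking quoted in the abstract above is based on the macro-F1 score. As a reminder of how that metric is computed, here is a minimal sketch in plain Python; the function name `macro_f1` is hypothetical, and the competition's own evaluation script may differ in details such as how classes with no samples are handled (here they score 0).

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores.

    Per class: F1 = 2*TP / (2*TP + FP + FN). A class with no true or
    predicted samples gets an F1 of 0.0 (an assumption of this sketch).
    """
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Usage: 1 = COVID, 0 = non-COVID (labels chosen for illustration only).
score = macro_f1([1, 1, 0, 0], [1, 0, 0, 0])
```

Because every class contributes equally, macro-F1 penalises a model that performs well only on the majority class.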

Onboard Perception-Assisted High Fidelity Simulation Framework for Autonomous Planetary Soft-Landing

AIAA Science and Technology Forum and Exposition, AIAA SciTech Forum 2025

Authors: Alibekov, Ulugbek; Banerjee, Avijit; Satpute, Sumeet Gajanan; Nikolakopoulos, George. Affiliations: Erasmus Mundus Joint Master in Intelligent Field Robotic Systems, Faculty of Informatics, Eötvös Loránd University, Budapest, Hungary; Robotics and AI research group, Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, Luleå, Sweden

ISBN: (digital) 9781624107238

ISBN: (print) 9781624107238

This article presents an onboard perception-assisted high-fidelity simulation framework for autonomous planetary soft-landing, enabling visual information processing that is tightly integrated with an advanced onboard guidance system in a realistic simulation environment. Utilizing the open-source Unity game engine, a physics-based simulation toolbox integrated with the Robot Operating System 2 (ROS 2), we emulate the motion of a spacecraft approaching planetary terrain. The simulation environment features a realistic 3D terrain model and a lander spacecraft equipped with an onboard camera that captures detailed local terrain images. The simulation engine captures the spacecraft’s six-degree-of-freedom motion, integrating the spacecraft orientation with the associated camera feed while incorporating various illumination conditions using a light source representing the Sun. To demonstrate the efficacy of the framework, we implement a recently developed attitude-constrained minimum jerk guidance algorithm to emulate spacecraft motion. Concurrently, the visual feed from the onboard RGB camera is processed through a gradient extraction-based perception system, providing visual odometry. This system identifies provably safe landing sites, avoiding the potential risks of landing on uneven terrain. © 2025, American Institute of Aeronautics and Astronautics Inc, AIAA. All rights reserved.

Keywords: Planetary landers

Inverse Kinematics of Robotic Manipulators Using a New Learning-by-Example Method

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Authors: Jacket Demby’s; Ramy Farag; Guilherme N. DeSouza. Affiliation: Department of Electrical Engineering and Computer Science (EECS), Vision-Guided and Intelligent Robotics (ViGIR) Laboratory, University of Missouri-Columbia, Columbia, Missouri

ISBN: (digital) 9798350377705

ISBN: (print) 9798350377712

Inverse Kinematics (IK) is one of the most fundamental challenges in robotics. It refers to the process of determining the joint configurations required to achieve the desired position and orientation (pose) of a robot end-effector. Although numerous Data-Driven (DD) IK solvers have demonstrated encouraging results, they have not achieved the same accuracy as other IK methods for complex robot configurations (e.g., numerical methods for higher Degrees of Freedom (DoF)). In this work, we propose a new Learning-by-Example method, and show that such a scheme considerably improves the IK learning results when compared to other DD learners. In our approach, the network input incorporates an example joint-pose pair along with the query pose to predict the desired robot joint configuration. We show that the example joint-pose pair does not need to be very close to the query: the example and the query can be as far as 20 degrees apart in the joint configuration space. Furthermore, we investigate the use of residual and dense skip connections in Multilayer Perceptrons for DDIK solvers and employ the resulting networks for two redundant robotic manipulators: a 7-DoF-7R commensurate robot and a 7-DoF-2RP4R incommensurate robot. Our experimental results show that the resulting DDIK solver can reliably predict IK solutions with accuracy better than 1 mm in position and 1 degree in orientation.

Keywords: Hands; Accuracy; Kinematics; Network architecture; Multilayer perceptrons; Numerical models; Reliability; Collision avoidance; Robots; Intelligent robots

Traffic Sign Detection With Event Cameras and DCNN

25th IEEE Signal Processing: Algorithms, Architectures, Arrangements, and Applications, SPA 2022

Authors: Wzorek, Piotr; Kryjak, Tomasz. Affiliations: Embedded Vision Systems Group, AGH University of Science and Technology, Computer Vision Laboratory, Department of Automatic Control and Robotics, Kraków, Poland; Silesian University of Technology, Department of Digital Systems, Gliwice, Poland

ISBN: (digital) 9788362065424

ISBN: (print) 9788362065424

In recent years, event cameras (DVS - Dynamic Vision Sensors) have been used in vision systems as an alternative or supplement to traditional cameras. They are characterised by high dynamic range, high temporal resolution, low latency, and reliable performance in limited lighting conditions - parameters that are particularly important in the context of advanced driver assistance systems (ADAS) and self-driving cars. In this work, we test whether these rather novel sensors can be applied to the popular task of traffic sign detection. To this end, we analyse different representations of the event data: event frame, event frequency, and the exponentially decaying time surface, and apply video frame reconstruction using a deep neural network called FireNet. We use the deep convolutional neural network YOLOv4 as a detector. For the individual representations, we obtain a detection accuracy in the range of 86.9-88.9% mAP@0.5. Fusing the considered representations yields a detector with a higher accuracy of 89.9% mAP@0.5. In comparison, the detector for the frames reconstructed with FireNet achieves an accuracy of 72.67% mAP@0.5. The results obtained illustrate the potential of event cameras in automotive applications, either as standalone sensors or in close cooperation with typical frame-based cameras. © 2022 Division of Signal Processing and Electronic Systems, Poznan University of Technology (DSPES PUT).

Keywords: Cameras
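
The three event-data representations named in the abstract above can be sketched as follows. This is an illustrative sketch only: the abstract does not give the exact definitions, so the event format `(x, y, t, polarity)`, the use of raw per-pixel counts for the event frequency, the decay constant `tau`, and the function name are all assumptions.

```python
import math


def event_representations(events, width, height, t_ref, tau=0.05):
    """Build three per-pixel maps from events given as (x, y, t, polarity).

    frame:   polarity of the most recent event at each pixel (0 if none)
    count:   number of events at each pixel (a simple event frequency)
    surface: exp(-(t_ref - t_last) / tau), an exponentially decaying
             time surface (0.0 where no event occurred)
    """
    frame = [[0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    last_t = [[None] * width for _ in range(height)]
    for x, y, t, polarity in events:
        count[y][x] += 1
        # Keep the newest event per pixel even if events arrive unsorted.
        if last_t[y][x] is None or t >= last_t[y][x]:
            last_t[y][x] = t
            frame[y][x] = polarity
    surface = [
        [0.0 if t is None else math.exp(-(t_ref - t) / tau) for t in row]
        for row in last_t
    ]
    return frame, count, surface


# Usage on a 2x1 sensor with three events (times in seconds).
events = [(0, 0, 0.00, 1), (1, 0, 0.05, -1), (0, 0, 0.10, -1)]
frame, count, surface = event_representations(events, 2, 1, t_ref=0.10)
```

Each map is an image-like array, which is what lets a frame-based detector such as YOLOv4 consume event data.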

Entropy-Guided Reinforced Open World Active 3D Object Detection Learning

2024 China Automation Congress, CAC 2024

Authors: Zhang, Haozhe; Ma, Liyan; Ying, Shihui. Affiliations: Institute of Artificial Intelligence, Shanghai University, Shanghai 200444, China; School of Computer Engineering and Science, School of Mechatronic Engineering and Automation, Shanghai Key Laboratory of Intelligent Manufacturing and Robotics, Shanghai University, Shanghai 200444, China; Department of Mathematics, School of Science, Shanghai University, Shanghai 200444, China

ISBN: (print) 9798350368604

Traditional fully annotated closed-set 3D object detection methods improve model performance but are impractical in real-world settings due to the emergence of new categories and the complexity of 3D annotations. Open-World Object Detection (OWOD) addresses these issues but relies heavily on manual labeling, which is costly. This paper focuses on open-world active learning and proposes an entropy-guided reinforced open-world active 3D object detection method (EROA). EROA regards active learning as a reinforcement learning problem tailored for open driving scenarios. We use entropy as a reward metric for efficient reinforcement learning. We also leverage knowledge from the 2D domain using object-level large-scale vision-language models to enhance sample selection. Extensive experiments show that the proposed EROA meets the dynamic and cost-sensitive requirements of autonomous driving, enabling real-time detection of both known and unknown objects. © 2024 IEEE.

Keywords: Active learning
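
The abstract above uses entropy as a signal for choosing which samples to label. The sketch below shows only the generic idea, Shannon entropy of a predicted class distribution used to rank samples by uncertainty. How EROA actually defines its reward is not stated in the abstract, and the function names here are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def rank_by_entropy(predictions):
    """Return sample indices ordered from most to least uncertain.

    predictions is a list of per-sample class-probability lists.
    """
    return sorted(
        range(len(predictions)),
        key=lambda i: entropy(predictions[i]),
        reverse=True,
    )


# Usage: the uniform prediction is the most uncertain and ranks first.
order = rank_by_entropy([[0.9, 0.1], [0.5, 0.5], [1.0, 0.0]])
```

In an active-learning loop, the highest-ranked samples would be the ones sent for annotation, since a confident prediction carries little new information.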
