ISBN:
(Print) 9798350388978; 9798350388961
Recently, providing real-time navigation for unmanned aerial vehicles independent of global positioning systems has become increasingly important. Neither state-of-the-art deep learning methods, which give good results on certain datasets, nor existing classical methods can provide real-time, accurate solutions on images with dynamic, fast-moving content. Moreover, the methods developed so far have focused on object-based tracking algorithms. In this paper, the points belonging to the target pattern, found by image matching, are tracked with a machine learning model we developed over 10 sequential video frames. The features extracted for the machine learning model are: (i) the change between the points of the previous frame and the frame before that, (ii) the points of interest in the previous frame, and (iii) the changes found with the homography matrix between sequential frames. It was experimentally shown that, among the algorithms in the literature that can process more than 30 frames per second on a CPU of 2 GHz or above, point tracking is achieved with the lowest error, on average about 23 pixels for a 2-megapixel image.
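As an illustration of feature (iii), the sketch below estimates a homography between two consecutive frames and uses it to project the tracked points forward; the resulting displacement is one plausible encoding of the inter-frame change. The detector (ORB), matcher, and RANSAC threshold are illustrative assumptions, since the abstract does not specify them.

```python
# Hedged sketch: homography-based point displacement between sequential frames.
# ORB + brute-force matching are stand-ins; the paper's exact pipeline is not given.
import cv2
import numpy as np

def homography_shift(prev_gray: np.ndarray, curr_gray: np.ndarray,
                     tracked_pts: np.ndarray) -> np.ndarray:
    """Project tracked points from the previous frame into the current frame
    via a homography fitted to keypoint matches; return their displacement."""
    orb = cv2.ORB_create(nfeatures=1000)
    kp1, des1 = orb.detectAndCompute(prev_gray, None)
    kp2, des2 = orb.detectAndCompute(curr_gray, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)[:200]

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

    pts = np.float32(tracked_pts).reshape(-1, 1, 2)
    projected = cv2.perspectiveTransform(pts, H)
    return (projected - pts).reshape(-1, 2)
```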
Amber is a system-on-chip (SoC) with a coarse-grained reconfigurable array (CGRA) for acceleration of dense linear algebra applications, such as machine learning (ML), image processing, and computer vision. It is designed using an agile accelerator-compiler co-design flow; the compiler updates automatically with hardware changes, enabling continuous application-level evaluation of the hardware-software system. To increase hardware utilization and minimize reconfigurability overhead, Amber features the following: 1) dynamic partial reconfiguration (DPR) of the CGRA for higher resource utilization by allowing fast switching between applications and partitioning resources between simultaneous applications; 2) streaming memory controllers supporting affine access patterns for efficient mapping of dense linear algebra; and 3) low-overhead transcendental and complex arithmetic operations. The physical design of Amber features a unique clock distribution method and timing methodology to efficiently lay out its hierarchical, tile-based design. Amber achieves a peak energy efficiency of 538 INT16 GOPS/W and 483 BFloat16 GFLOPS/W. Compared with a CPU, a GPU, and a field-programmable gate array (FPGA), Amber has up to 3902x, 152x, and 107x better energy-delay product (EDP), respectively.
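For readers unfamiliar with the metric quoted above, energy-delay product is simply energy multiplied by execution time, so a design that both saves energy and runs faster compounds its advantage. The numbers in the toy example below are placeholders, not measurements from the paper.

```python
# Hedged sketch: how an energy-delay product (EDP) comparison is computed.
# The values are illustrative placeholders, not results from the Amber paper.
def edp(energy_joules: float, delay_seconds: float) -> float:
    """EDP combines energy and runtime; lower is better."""
    return energy_joules * delay_seconds

# A device using 10x less energy and running 5x faster improves EDP by 50x.
baseline = edp(energy_joules=1.0, delay_seconds=1.0)
accelerator = edp(energy_joules=0.1, delay_seconds=0.2)
print(baseline / accelerator)  # -> 50.0
```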
A new color appearance model named sCAM has been developed, including a uniform color space, sUCS. The model has a simple structure but provides comprehensive functions for color-related applications. It takes input either as XYZ D65 values or as signals from an RGB space. The accuracy of both has been extensively tested. Compared with state-of-the-art UCSs, sUCS performed best or second-best across the 28 datasets for space uniformity and the 6 datasets for hue linearity. sCAM also gave the best fit to all available one- and two-dimensional color appearance datasets. Field tests are recommended for all color-related applications.
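Since the abstract states that sCAM accepts XYZ D65 input, the sketch below shows the standard sRGB-to-XYZ (D65) conversion that could supply that input; the internals of sCAM and sUCS are not described in the abstract and are not reproduced here.

```python
# Hedged sketch: standard sRGB (D65) to XYZ conversion, a typical way to
# produce the XYZ D65 input mentioned above. sCAM itself is not modeled here.
import numpy as np

M_SRGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])

def srgb_to_xyz(rgb) -> np.ndarray:
    """rgb: values in [0, 1]; returns XYZ scaled so that white has Y = 100."""
    rgb = np.asarray(rgb, dtype=np.float64)
    linear = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    return 100.0 * linear @ M_SRGB_TO_XYZ.T

print(srgb_to_xyz([1.0, 1.0, 1.0]))  # ~[95.05, 100.0, 108.9], the D65 white point
```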
Neural Style Transfer (NST) is a popular technique of computer vision where the content of an image is blended with the style of another, which results in a fused image with certain properties of both original images. This approach has practical applications in various domains and has garnered significant attention in both industry and academia. An interesting application of this technique is segmented style transfer, where a segmentation algorithm is used to locate objects within an image and the style transfer method is then performed locally, producing images with different styles for different objects. This approach opens up possibilities for creating visually striking compositions by seamlessly blending various artistic styles onto specific objects within an image, allowing for a new level of creative expression. This paper proposes a novel method that combines the Segment Anything Model (SAM), a state-of-the-art vision transformer-based image segmentation model developed by Facebook, with style transfer. Our approach performs localized style transfer in selected segmentation regions of an image using classical style transfer algorithms. To ensure smooth transitions at the border between stylized and non-stylized regions, we also develop a loss function with a border-smoothing technique. Experimental results demonstrate the robustness and effectiveness of the proposed methodology, including the ability to infuse multiple artistic styles into different objects within an image. The contributions of this work include integrating SAM with style transfer, proposing a novel loss function, evaluating segmented style transfer in multiple content regions, comparing with state-of-the-art approaches, and experimenting with multiple style images for diverse stylization. Our primary focus centers on creating a model that serves as a digital painter across a wide range of image genres and artistic styles.
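The sketch below illustrates the compositing step implied by this pipeline: a stylized rendering is blended into a single SAM-provided mask region with a feathered border. The SAM inference, the NST model, and the paper's actual border-smoothing loss are not reproduced; Gaussian feathering of the mask is an illustrative stand-in.

```python
# Hedged sketch: composite a stylized image into one segmentation region with a
# soft border. `mask` is assumed to come from SAM and `stylized` from any NST model.
import cv2
import numpy as np

def blend_stylized_region(content: np.ndarray, stylized: np.ndarray,
                          mask: np.ndarray, feather_px: int = 15) -> np.ndarray:
    """content, stylized: HxWx3 uint8 images; mask: HxW array of {0, 1}."""
    k = 2 * feather_px + 1
    soft = cv2.GaussianBlur(mask.astype(np.float32), (k, k), 0)[..., None]
    out = soft * stylized.astype(np.float32) + (1.0 - soft) * content.astype(np.float32)
    return np.clip(out, 0, 255).astype(np.uint8)
```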
Face sketch and photo synthesis is widely applied in industry and information fields, such as the entertainment business and heterogeneous face retrieval. The key challenge lies in completing a face transformation with both good visual quality and face identity preservation. However, existing methods still struggle to obtain a good synthesis due to the large gap between the two face domains. Recently, diffusion models have achieved great success in image synthesis, which allows us to extend their application to such a face generation task. Thus, we propose IPDM, which constructs a mapping of latent representations for domain-adaptive face features. A further proposed module, IDP, utilizes auxiliary features to correct the latent features through their directions and supplementary identity information, so that generation keeps the face identity unchanged. Evaluation results show that our method is superior to state-of-the-art methods in both identity preservation and visual quality.
This paper introduces a high dynamic range pixel for early vision processing. Early vision is the first stage from which semantic information is subsequently extracted for image processing or video analytics. This paper proposes t...
ISBN:
(Digital) 9783110756722; 9783110756821
ISBN:
(Print) 9783110756678
This book focuses on the latest developments in the fields of visual AI, image processing and computer vision. It presents research on basic techniques such as image pre-processing, feature extraction, and enhancement, along with applications in biometrics, healthcare, neuroscience and forensics. The book highlights the algorithms, processes, novel architectures and results underlying machine intelligence, with detailed execution flows of the models.
Common computer vision (CV) tasks include image classification, object detection, segmentation, and recognition. To handle such tasks, machine learning (ML) models for image processing require a great amount of annota...
ISBN:
(Print) 9783031683015; 9783031683022
The transition to Industry 4.0 intensifies the demand for advanced manufacturing techniques and efficient data processing capabilities. A notable challenge in engineering is that many older engineering drawings are only available in paper form, creating significant barriers for modern automated systems. This study tackles these challenges by employing advanced deep-learning techniques alongside traditional image processing to convert legacy engineering drawings into structured, machine-readable formats. Following this digitization process, the multi-modal approach further processes drawings containing large amounts of heterogeneous data by filtering non-essential details to isolate and extract critical features. This process enables the conversion of complex drawings into formats suitable for computer vision and deep learning applications. The structured datasets resulting from this process are then utilized to significantly enhance the efficiency of automated processes. For instance, they enable more efficient pick-and-place operations by providing the data necessary for machine learning-driven automation.
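As one example of the traditional image-processing stage mentioned above, the sketch below binarizes a scanned drawing and discards tiny connected components that carry no geometric information. The threshold parameters and area cutoff are illustrative assumptions, not the paper's settings.

```python
# Hedged sketch: clean up a scanned drawing before a deep-learning stage by
# binarizing it and dropping small speckle components. Parameters are illustrative.
import cv2
import numpy as np

def clean_scanned_drawing(path: str, min_area: int = 50) -> np.ndarray:
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    binary = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                   cv2.THRESH_BINARY_INV, 25, 15)
    # Keep only connected components large enough to be line work, not dust.
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)
    keep = np.zeros_like(binary)
    for i in range(1, n):
        if stats[i, cv2.CC_STAT_AREA] >= min_area:
            keep[labels == i] = 255
    return keep
```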
Correct and robust ego-lane index estimation is crucial for autonomous driving in the absence of high-definition maps, especially in urban environments. Previous ego-lane index estimation approaches rely on feature extraction, which limits their robustness. To overcome these shortcomings, this study proposes a robust ego-lane index estimation framework that uses only the original visual image. After optimization of the processing route, the raw image is randomly cropped in the height direction and then fed into a double-supervised LaneLoc network to obtain index estimates and confidences. A post-process is also proposed to derive the global ego-lane index from the estimated left and right indexes together with the total lane number. To evaluate the proposed method, we manually annotated the ego-lane index of public datasets, which can serve for the first time as a baseline for ego-lane index estimation. The proposed algorithm achieved 96.48/95.40% (precision/recall) on the CULane dataset and 99.45/99.49% (precision/recall) on the TuSimple dataset, demonstrating the effectiveness and efficiency of lane localization in diverse driving environments. The code and dataset annotations will be released publicly at https://***/haomo-ai/LaneLoc.
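The post-process is only summarized in the abstract, so the sketch below shows one plausible fusion rule: convert the right-counted index to a left-based index, accept it when the two heads agree, and otherwise fall back to the more confident estimate. This reconciliation rule is an assumption, not the paper's method.

```python
# Hedged sketch: fuse left-counted and right-counted ego-lane indexes into one
# global index. The actual fusion rule used by LaneLoc is not given in the abstract.
def fuse_ego_lane_index(left_idx: int, left_conf: float,
                        right_idx: int, right_conf: float,
                        total_lanes: int) -> int:
    """left_idx counts lanes from the left, right_idx from the right (both 1-based)."""
    from_right = total_lanes - right_idx + 1  # convert to a left-based index
    if from_right == left_idx:
        return left_idx                        # the two estimates agree
    # Otherwise trust the more confident head.
    return left_idx if left_conf >= right_conf else from_right

print(fuse_ego_lane_index(2, 0.9, 3, 0.7, total_lanes=4))  # -> 2
```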