Search Results - Inner Mongolia University Library

Color Image Segmentation Based on Modified Kuramoto Model

Procedia Computer Science, 2016, vol. 88, pp. 245-258

Authors: Xiaojie Liu, Yuanhua Qiao, Xianghui Chen, Jun Miao, Lijuan Duan

Affiliations: College of Applied Sciences, Beijing University of Technology, Beijing 100124, China; Key Lab of Intelligent Information Processing of Chinese Academy of Sciences (CAS), Institute of Computing Technology, CAS, Beijing 100190, China; College of Computer Science and Technology, Beijing University of Technology, Beijing 100124, China

This paper proposes a new approach to color image segmentation based on the Kuramoto model. First, the classic Kuramoto model, which describes a globally coupled oscillator network, is changed to a locally coupled one to simulate neuron activity in the visual cortex and to describe how external stimuli influence phase changes. Second, a method for reconstructing coupled neuron activities is proposed by introducing and computing the instantaneous frequency. Three oscillating curves corresponding to the R, G, and B pixel values of the color image are formed by the coupled network and added up to produce a superposition of oscillations. Finally, color images are segmented according to the synchronization of the oscillating superposition by extracting and checking the frequency of the oscillating curves. The performance is compared with that of other representative segmentation approaches.
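The locally coupled variant mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' model: the update rule below is the textbook Kuramoto equation restricted to a neighbor list, integrated with forward Euler, and the names `simulate_local_kuramoto`, `order_parameter`, and `ring_neighbors`, the ring topology, and all parameter values are assumptions chosen for illustration only.

```python
import math


def simulate_local_kuramoto(omega, neighbors, theta0, coupling=1.0, dt=0.05, steps=2000):
    """Forward-Euler integration of a locally coupled Kuramoto network.

    d(theta_i)/dt = omega_i + (coupling / |N_i|) * sum_{j in N_i} sin(theta_j - theta_i)
    where N_i is the (assumed) neighbor list of oscillator i.
    """
    theta = list(theta0)
    for _ in range(steps):
        new_theta = []
        for i, th in enumerate(theta):
            nbrs = neighbors[i]
            # Average pull from the neighbors only (local, not global, coupling).
            pull = sum(math.sin(theta[j] - th) for j in nbrs) / len(nbrs) if nbrs else 0.0
            new_theta.append(th + dt * (omega[i] + coupling * pull))
        theta = new_theta
    return theta


def order_parameter(theta):
    """Kuramoto order parameter r in [0, 1]; r close to 1 means phase synchrony."""
    n = len(theta)
    re = sum(math.cos(t) for t in theta) / n
    im = sum(math.sin(t) for t in theta) / n
    return math.hypot(re, im)


def ring_neighbors(n):
    """Each oscillator is coupled to its two nearest neighbors on a ring."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


# Hypothetical example: six oscillators sharing one natural frequency,
# standing in for pixels of similar intensity.
n = 6
omega = [1.0] * n
theta0 = [0.0, 0.5, 1.0, 0.2, 0.8, 0.3]
final = simulate_local_kuramoto(omega, ring_neighbors(n), theta0)
print(order_parameter(theta0), order_parameter(final))
```

In this toy setting the oscillators share a frequency and start within a narrow phase range, so local coupling alone pulls them into synchrony; a segmentation method would additionally need to map pixel values to frequencies, which is left out here.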

Keywords: Kuramoto model; Neural network; Color image segmentation

Zero sum sets in abelian groups

arXiv, 2021

Authors: Shi, Minjia; Krotov, Denis; Li, Xiaoxiao; Solé, Patrick

Affiliations: Key Laboratory of Intelligent Computing and Signal Processing of Ministry of Education, School of Mathematics Sciences, Anhui University, Hefei, Anhui, China; Sobolev Institute of Mathematics, Novosibirsk 630090, Russia; I2M, CNRS, Centrale Marseille, University of Aix-Marseille, Marseilles, France

The distribution of cardinalities of zero-sum sets in abelian groups is completely determined. A summation involving the Möbius function is given for the general abelian group, while in many special cases, including the case of elementary abelian groups, solved earlier by Li and Wan, it has a compact form. The proof involves two different Möbius transforms, on positive integers and on set *** Codes 05A18, 05E40, 68P27 Copyright © 2021, The Authors. All rights reserved.
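To make the counted quantity concrete, the distribution can be checked by brute force for very small groups. This sketch does not use the Möbius-function summation from the paper; it simply enumerates subsets. The function name `zero_sum_distribution`, the representation of a group as a tuple of cyclic moduli, and the convention that the empty set counts as a zero-sum set are all assumptions made for illustration.

```python
from itertools import combinations, product


def zero_sum_distribution(moduli):
    """Count zero-sum subsets of Z_{m1} x ... x Z_{mr}, grouped by cardinality.

    Returns a list whose k-th entry is the number of k-element subsets of the
    group whose elements sum to the identity. Brute force: only feasible for
    tiny groups, since it visits all 2^|G| subsets.
    """
    elements = list(product(*(range(m) for m in moduli)))
    zero = tuple(0 for _ in moduli)
    counts = []
    for k in range(len(elements) + 1):
        count = 0
        for subset in combinations(elements, k):
            # Component-wise sum of the subset, reduced modulo each cyclic factor.
            total = tuple(sum(x[c] for x in subset) % m for c, m in enumerate(moduli))
            if total == zero:
                count += 1
        counts.append(count)
    return counts


# Cyclic group Z_5 and the elementary abelian group Z_2 x Z_2.
print(zero_sum_distribution((5,)))
print(zero_sum_distribution((2, 2)))
```

Such an enumeration is only useful as a sanity check against a closed formula on small cases, because its cost grows exponentially with the group order.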

Keywords: Group theory

OmniVL: One Foundation Model for Image-Language and Video-Language Tasks

arXiv, 2022

Authors: Wang, Junke; Chen, Dongdong; Wu, Zuxuan; Luo, Chong; Zhou, Luowei; Zhao, Yucheng; Xie, Yujia; Liu, Ce; Jiang, Yu-Gang; Yuan, Lu

Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center on Intelligent Visual Computing, China; Microsoft Cloud + AI; Microsoft Research Asia, China

This paper presents OmniVL, a new foundation model to support both image-language and video-language tasks using one universal architecture. It adopts a unified transformer-based visual encoder for both image and video inputs, and thus can perform joint image-language and video-language pretraining. We demonstrate, for the first time, such a paradigm benefits both image and video tasks, as opposed to the conventional one-directional transfer (e.g., use image-language to help video-language). To this end, we propose a decoupled joint pretraining of image-language and video-language to effectively decompose the vision-language modeling into spatial and temporal dimensions and obtain performance boost on both image and video tasks. Moreover, we introduce a novel unified vision-language contrastive (UniVLC) loss to leverage image-text, video-text, image-label (e.g., image classification), video-label (e.g., video action recognition) data together, so that both supervised and noisily supervised pretraining data are utilized as much as possible. Without incurring extra task-specific adaptors, OmniVL can simultaneously support visual only tasks (e.g., image classification, video action recognition), cross-modal alignment tasks (e.g., image/video-text retrieval), and multi-modal understanding and generation tasks (e.g., image/video question answering, captioning). We evaluate OmniVL on a wide range of downstream tasks and achieve state-of-the-art or competitive results with similar model size and data scale. Copyright © 2022, The Authors. All rights reserved.

Keywords: Image classification

Gaussian-Hermite Moment Invariants of General Multi-Channel Functions

arXiv, 2022

Authors: Mo, Hanlin; Li, Hua; Zhao, Guoying

Affiliations: The Center for Machine Vision and Signal Analysis, University of Oulu, Oulu FI-90014, Finland; The Key lab of Intelligent Information Processing, The Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 100049, China; The School of Information and Technology, Northwest University, Xi'an 710069, China

With the development of data acquisition technology, large amounts of multi-channel data are collected and widely used in many fields. Most of them, such as RGB images and vector fields, can be expressed as different types of multi-channel functions. Feature extraction of multi-channel data for identifying interest patterns is a critical but challenging task. This paper focuses on constructing moment-based features of general multi-channel functions. Specifically, we define two transform models, rotation-affine transform and total rotation transform, to describe real deformations of multi-channel data. Then, we design a structural framework to generate Gaussian-Hermite moment invariants for these two transform models systematically. It is the first time that a unified framework has been proposed in the literature to construct orthogonal moment invariants of general multi-channel functions. Given a specific type of multi-channel data, we demonstrate how to utilize the new method to derive all possible invariants and eliminate dependences among them. We obtain independent sets of invariants with low orders and low degrees for RGB images, 2D vector fields and color volume data. Based on synthetic and real multi-channel data, we conduct extensive experiments to evaluate the stability and discriminability of these invariants and their robustness to noise. The results show that new moment invariants significantly outperform previous moment invariants of multi-channel data in RGB image classification and vortex detection in 2D vector fields. Copyright © 2022, The Authors. All rights reserved.

Keywords: Image classification

Optimized symplectic scheme for electromagnetic simulations

Asia-Pacific Conference on Microwave

Authors: Bo Wu, Zhi-Xiang Huang, Wei Sha, Ming-Sheng Chen, Hong Dai

Affiliations: School of Physical Science and Technology, Yunnan University, China; Key Laboratory of Intelligent Computing & Signal Processing, Anhui University, China; Department of Electrical and Electronic Engineering, University of Hong Kong, Hong Kong, China; Department of Physics and Electronic Engineering, Hefei Teachers College, China

The classical finite-difference time-domain (FDTD) method has been widely used in computational electromagnetics, but for electrically large domains and late-time analysis it begins to show its limitations because phase errors accumulate. To solve this problem, several methods have been proposed, such as high-order schemes and the four-stage Runge-Kutta integrator. Recently, symplectic methods have been adopted for use in computational electromagnetics. This paper focuses on the derivation of an optimized fourth-order symplectic scheme for electromagnetic simulations.
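The appeal of symplectic integrators for late-time analysis can be illustrated on a toy problem. The sketch below is not the optimized scheme derived in the paper and does not solve Maxwell's equations: it integrates a unit harmonic oscillator, using Yoshida's well-known triple-jump composition of leapfrog steps as a stand-in fourth-order symplectic method, and compares it with explicit Euler. All function names and the step size are assumptions for illustration.

```python
import math


def leapfrog_step(q, p, h):
    """One kick-drift-kick step for H = p^2/2 + q^2/2 (force = -q)."""
    p -= 0.5 * h * q
    q += h * p
    p -= 0.5 * h * q
    return q, p


# Yoshida's triple-jump coefficients for a fourth-order composition.
W1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))
W0 = 1.0 - 2.0 * W1


def yoshida4_step(q, p, h):
    """Fourth-order symplectic step built from three leapfrog substeps."""
    for w in (W1, W0, W1):
        q, p = leapfrog_step(q, p, w * h)
    return q, p


def euler_step(q, p, h):
    """Explicit (non-symplectic) Euler step, shown only for comparison."""
    return q + h * p, p - h * q


def energy(q, p):
    return 0.5 * (p * p + q * q)


def integrate(step, q, p, h, n_steps):
    for _ in range(n_steps):
        q, p = step(q, p, h)
    return q, p


h, n_steps = 0.05, 2000  # integrate up to t = 100
q_s, p_s = integrate(yoshida4_step, 1.0, 0.0, h, n_steps)
q_e, p_e = integrate(euler_step, 1.0, 0.0, h, n_steps)
print("symplectic energy error:", abs(energy(q_s, p_s) - 0.5))
print("explicit Euler energy:  ", energy(q_e, p_e))
```

The symplectic composition keeps the energy close to its initial value of 0.5 over the whole run, while explicit Euler multiplies the energy by a fixed factor greater than one at every step, which is the kind of long-time error growth that motivates symplectic schemes.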

Keywords: Computational modeling; Finite difference methods; Permittivity; Permeability; Maxwell equations; Electromagnetic propagation; Differential equations; Optimization methods; Physics computing; Signal processing

Prototypical Residual Networks for Anomaly Detection and Localization

arXiv, 2022

Authors: Zhang, Hui; Wu, Zuxuan; Wang, Zheng; Chen, Zhineng; Jiang, Yu-Gang

Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center of Intelligent Visual Computing, China; School of Computer Science, Zhejiang University of Technology, China

Anomaly detection and localization are widely used in industrial manufacturing for their efficiency and effectiveness. Anomalies are rare and hard to collect, and supervised models easily over-fit to these seen anomalies with a handful of abnormal samples, producing unsatisfactory performance. On the other hand, anomalies are typically subtle, hard to discern, and of various appearance, making it difficult to detect anomalies, let alone locate anomalous regions. To address these issues, we propose a framework called Prototypical Residual Network (PRN), which learns feature residuals of varying scales and sizes between anomalous and normal patterns to accurately reconstruct the segmentation maps of anomalous regions. PRN mainly consists of two parts: multi-scale prototypes that explicitly represent the residual features of anomalies to normal patterns; and a multi-size self-attention mechanism that enables variable-sized anomalous feature learning. Besides, we present a variety of anomaly generation strategies that consider both seen and unseen appearance variance to enlarge and diversify anomalies. Extensive experiments on the challenging and widely used MVTec AD benchmark show that PRN outperforms current state-of-the-art unsupervised and supervised methods. We further report SOTA results on three additional datasets to demonstrate the effectiveness and generalizability of PRN. Copyright © 2022, The Authors. All rights reserved.

Keywords: Anomaly detection

GenRec: Unifying Video Generation and Recognition with Diffusion Models

arXiv, 2024

Authors: Weng, Zejia; Yang, Xitong; Xing, Zhen; Wu, Zuxuan; Jiang, Yu-Gang

Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center of Intelligent Visual Computing, China; Department of Computer Science, University of Maryland, United States

Video diffusion models are able to generate high-quality videos by learning strong spatial-temporal priors on large-scale datasets. In this paper, we aim to investigate whether such priors derived from a generative process are suitable for video recognition, and eventually joint optimization of generation and recognition. Building upon Stable Video Diffusion, we introduce GenRec, the first unified framework trained with a random-frame conditioning process so as to learn generalized spatial-temporal representations. The resulting framework can naturally support generation and recognition, and more importantly is robust even when visual inputs contain limited information. Extensive experiments demonstrate the efficacy of GenRec for both recognition and generation. In particular, GenRec achieves competitive recognition performance, offering 75.8% and 87.2% accuracy on SSV2 and K400, respectively. GenRec also performs the best on class-conditioned image-to-video generation, achieving 46.5 and 49.3 FVD scores on SSV2 and EK-100 datasets. Furthermore, GenRec demonstrates extraordinary robustness in scenarios where only limited frames can be observed. Code will be available at https://***/wengzejia1/GenRec. Copyright © 2024, The Authors. All rights reserved.

Keywords: Spatio-temporal data

MagDiff: Multi-Alignment Diffusion for High-Fidelity Video Generation and Editing

arXiv, 2023

Authors: Zhao, Haoyu; Lu, Tianyi; Gu, Jiaxi; Zhang, Xing; Zheng, Qingping; Wu, Zuxuan; Xu, Hang; Jiang, Yu-Gang

Affiliations: Shanghai Key Lab of Intell. Info. Processing, School of CS, Fudan University, China; Shanghai Collaborative Innovation Center on Intelligent Visual Computing, China; Huawei Noah's Ark Lab, Hong Kong; Zhejiang University, China

The diffusion model is widely leveraged for either video generation or video editing. As each field has its task-specific problems, it is difficult to develop a single diffusion model that completes both tasks simultaneously. Video diffusion relying solely on the text prompt can be adapted to unify the two tasks. However, it lacks a high capability of aligning heterogeneous modalities between text and image, leading to various misalignment problems. In this work, we are the first to propose a unified Multi-alignment Diffusion, dubbed MagDiff, for both tasks of high-fidelity video generation and editing. The proposed MagDiff introduces three types of alignments, including subject-driven alignment, adaptive prompts alignment, and high-fidelity alignment. Particularly, the subject-driven alignment is put forward to trade off the image and text prompts, serving as a unified foundation generative model for both tasks. The adaptive prompts alignment is introduced to emphasize different strengths of homogeneous and heterogeneous alignments by assigning different values of weights to the image and the text prompts. The high-fidelity alignment is developed to further enhance the fidelity of both video generation and editing by taking the subject image as an additional model input. Experimental results on four benchmarks suggest that our method outperforms the previous method on each task. Copyright © 2023, The Authors. All rights reserved.

Keywords:

An Integrated High-throughput Workflow for Identification of Crosslinked Peptides from Complex Samples

The 7th China Proteomics Conference and the 3rd International Proteomics Forum

Authors: Bing Yang, Yan-Jie Wu, Ming Zhu, Jinzhong Lin, Kun Zhang, Shu-Kun Luo, Yue-He Ding, Li-Yun Xiu, She Chen, Keqiong Ye, Si-Min He, Meng-Qiu Dong

Affiliations: National Institute of Biological Sciences, Beijing, Beijing 102206, People's Republic of China; Key Lab of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, People's Republic of China