Search Results - Inner Mongolia University Library

Spectral MVIR: Joint reconstruction of 3D shape and spectral reflectance

arXiv, 2021

Authors: Li, Chunyu; Monno, Yusuke; Okutomi, Masatoshi. Department of Systems and Control Engineering, School of Engineering, Tokyo Institute of Technology, Tokyo, Japan

Reconstructing an object’s high-quality 3D shape with inherent spectral reflectance property, beyond typical device-dependent RGB albedos, opens the door to applications requiring a high-fidelity 3D model in terms of both geometry and photometry. In this paper, we propose a novel Multi-View Inverse Rendering (MVIR) method called Spectral MVIR for jointly reconstructing the 3D shape and the spectral reflectance for each point of object surfaces from multi-view images captured using a standard RGB camera and low-cost lighting equipment such as an LED bulb or an LED projector. Our main contributions are twofold: (i) We present a rendering model that considers both geometric and photometric principles in the image formation by explicitly considering camera spectral sensitivity, light’s spectral power distribution, and light source positions. (ii) Based on the derived model, we build a cost-optimization MVIR framework for the joint reconstruction of the 3D shape and the per-vertex spectral reflectance while estimating the light source positions and the shadows. Different from most existing spectral-3D acquisition methods, our method does not require expensive special equipment and cumbersome geometric calibration. Experimental results using both synthetic and real-world data demonstrate that our Spectral MVIR can acquire a high-quality 3D model with accurate spectral reflectance property. Copyright © 2021, The Authors. All rights reserved.

Keywords: Photometry

Multimodal Emotion Recognition Based on Multi-Scale Facial Features and Cross-Modal Attention

IEEE International Conference on Industrial Technology (ICIT)

Authors: Chengao Bao; Luefeng Chen; Min Li; Min Wu; Witold Pedrycz; Kaoru Hirota. School of Automation, China University of Geosciences, Wuhan, China; the Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems, School of Automation, and the Engineering Research Center of Intelligent Technology for Geo-Exploration, Ministry of Education, China University of Geosciences, Wuhan, China; Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Canada; Tokyo Institute of Technology, Yokohama, Japan; Tokyo Institute of Technology, Tokyo, Japan

ISBN: (digital) 9798331521950

ISBN: (print) 9798331521967

A multi-modal emotion recognition method based on a facial multi-scale features and cross-modal attention (MS-FCA) network is proposed. The MS-FCA model extends the traditional single-branch ViT network into a two-branch ViT architecture in which the classification token of each branch interacts with the picture embeddings of the other branch, facilitating effective interaction between information at different scales. Audio features are then extracted using a ResNet18 network. A cross-modal attention mechanism is used to obtain the weight matrices between the features of different modalities, making full use of inter-modal correlation and effectively fusing visual and audio features for more accurate emotion recognition. Two datasets are used for the experiments: eNTERFACE'05 and REDVESS. The experimental results show that the accuracy of the proposed method on the eNTERFACE'05 and REDVESS datasets is 85.42% and 83.84%, respectively, which demonstrates the effectiveness of the proposed method.

Keywords: Emotion recognition; Visualization; Accuracy; Service robots; Semantics; Human-robot interaction; Speech recognition; Feature extraction; Data mining; Facial features

Some ethical issues in the review process of machine learning conferences

arXiv, 2021

Authors: Russo, Alessio. Division of Decision and Control Systems, EECS School, KTH Royal Institute of Technology, Sweden

Recent successes in the Machine Learning community have led to a steep increase in the number of papers submitted to conferences. This increase has made more prominent some of the issues that affect the current review process used by these conferences. The review process has several issues that may undermine the nature of scientific research, which should be fully objective, apolitical, unbiased and free of misconduct (such as plagiarism, cheating, improper influence, and other improprieties). In this work, we study the problem of reviewers' recruitment, infringements of the double-blind process, fraudulent behaviors, biases in numerical ratings, and the appendix phenomenon (i.e., the fact that it is becoming more common to publish results in the appendix section of a paper). For each of these problems, we provide a short description and possible solutions. The goal of this work is to raise awareness in the Machine Learning community regarding these issues. © 2021, CC BY.

Keywords: Machine learning

Generalized Zero-Shot Learning via Implicit Attribute Composition

IEEE International Conference on Systems, Man and Cybernetics

Authors: Lei Zhou; Yang Liu; Qiang Li. School of Computer Science and Artificial Intelligence, Wuhan University of Technology, Wuhan, China; College of Computer Science and Technology, Zhejiang University, Hangzhou, China; State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China

Zero-shot learning (ZSL) is an important but challenging task in computer vision that aims to identify unseen classes without matching training samples. Current cutting-edge ZSL methods based on locality focus on acquiring the explicit locality of distinguishing characteristics, which could face a lack of adequate supervision at the class attribute level. This paper introduces a novel approach called IAC, which aims to learn Implicit Attribute Composition for ZSL. This method is more comprehensive compared to attribute localization that solely focuses on class-level attribute supervision. IAC utilizes subspace representations that efficiently capture the inherent structure of high-dimensional image features. Then, we learn implicit attribute composition through subspace representation learning. The superiority of the proposed IAC compared to the state-of-the-art is demonstrated through sufficient experiments conducted on three commonly used ZSL datasets, CUB, SUN, and AwA2.

Federated Cubic Regularized Newton Learning with Sparsification-amplified Differential Privacy

arXiv, 2024

Authors: Huo, Wei; Liu, Changxin; Ding, Kemi; Johansson, Karl Henrik; Shi, Ling. Department of Electronic and Computer Engineering, Hong Kong University of Science and Technology, Hong Kong; Division of Decision and Control Systems, School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology, Digital Futures, Stockholm SE-10044, Sweden; School of System Design and Intelligent Manufacturing, Southern University of Science and Technology, Shenzhen 518055, China

This paper explores the cubic-regularized Newton method within a federated learning framework while addressing two major concerns: privacy leakage and communication bottlenecks. We propose the Differentially Private Federated Cubic Regularized Newton (DP-FCRN) algorithm, which leverages second-order techniques to achieve lower iteration complexity than first-order methods. We incorporate noise perturbation during local computations to ensure privacy. Furthermore, we employ sparsification in uplink transmission, which not only reduces the communication costs but also amplifies the privacy guarantee. Specifically, this approach reduces the necessary noise intensity without compromising privacy protection. We analyze the convergence properties of our algorithm and establish the privacy guarantee. Finally, we validate the effectiveness of the proposed algorithm through experiments on a benchmark dataset. © 2024, CC BY.

Keywords: Federated learning

Signal temporal logic task decomposition via convex optimization

arXiv, 2021

Authors: Charitidou, Maria; Dimarogonas, Dimos V. Division of Decision and Control Systems, Royal Institute of Technology, Stockholm 100 44, Sweden

In this paper we focus on the problem of decomposing a global Signal Temporal Logic (STL) formula assigned to a multi-agent system into local STL tasks when the team of agents is a priori decomposed into disjoint sub-teams. The predicate functions associated with the local tasks are parameterized as hypercubes depending on the states of the agents in a given sub-team. The parameters of the functions are then found as part of the solution of a convex program that implicitly aims at maximizing the volume of the zero level-set of the corresponding predicate function. Two alternative definitions of the local STL tasks are proposed, and the satisfaction of the global STL formula is proven when the conjunction of the local STL tasks is satisfied. Copyright © 2021, The Authors. All rights reserved.

Keywords: Convex optimization

Parallel Learning Based Foundation Model for Networked Traffic Signal Control

International Conference on Intelligent Transportation

Authors: Chen Zhao; Xingyuan Dai; Yuanyuan Chen; Yilun Lin; Yisheng Lv; Fei-Yue Wang. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; State Key Laboratory for Management and Control of Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China; Shanghai AI Laboratory, Shanghai, China; Macao Institute of Systems Engineering, Macau University of Science and Technology, Macao, China

Networked Traffic Signal Control (NTSC) is a fundamental component of Intelligent Transportation Systems (ITS) and the broader vision of smart city development. While a plethora of intelligent strategies have been developed, the Sim2Real challenge often impedes their full realization. In response, this paper introduces the Parallel Learning-based Adaptive Network for Traffic Signal Control (PLANT) as a foundation model for NTSC. We employ the Wasserstein GAN with Gradient Penalty (WGAN-GP) to generate a wide range of artificial scenarios for robust PLANT training. Further, the Transformer-based Cooperation Mechanism (TCM) is integrated as the primary learner within PLANT, facilitating effective capture of traffic dynamics and knowledge accumulation. This knowledge is readily transferable to real-world applications through meticulous fine-tuning, equipping PLANT to adapt and evolve in alignment with shifting transportation paradigms. Our empirical study on the Hangzhou road network demonstrates PLANT's superiority over both traditional and emerging DRL-based approaches, emphasizing its viability as a potential foundation model for NTSC.

Outlining the Landscape of Personalized Lung Cancer Treatment in the Era of Cyber-Physical Systems

2021 International Conference on Control, Automation and Diagnosis, ICCAD 2021

Authors: Ghita, Maria; Copot, Dana; Verellen, Dirk; Mihaela Ionescu, Clara. Ghent University, Research Group of Dynamical Systems and Control, Ghent 9052, Belgium; EEDT Core Lab On Decision and Control, Flanders Make Consortium, Ghent 9052, Belgium; Cancer Research Institute Ghent, Ghent 9052, Belgium; Iridium Cancer Network - GZA Hospitals Sint Augustinus, Department of Radiation Oncology, Wilrijk 2610, Belgium; Department of Radiotherapy, Faculty of Medicine and Health Sciences, Antwerp University, Wilrijk 2610, Belgium; Department of Automatic Control, Technical University of Cluj Napoca, Cluj 400114, Romania

ISBN: (print) 9781665449687

Lung cancer treatment management has always been at the interface of medicine, biology, and physics. Rapid progress is being made in the direction of new high-precision technology developments that emerge toward more patient-specific therapies with lower toxicity. This paper provides a complete generic framework of lung tumor targeting and patient individualization, thereby allowing a rational choice of tumor-specific dose and type of treatment. Possible comprehensive strategies including radiotherapy, immunotherapy, and targeted therapy drug (antiangiogenic), coupled with mathematical modeling of tumor dynamics and respiratory mechanics, would bring considerable progress in the area of prediction models for treatment outcomes. The therapy becomes model-based predicted and leads to personalized radiotherapy with optimal doses, decreasing the risks of normal-tissue effects and costs as well. To maximize the success of the treatment in an individual, clinicians need to be aware of both tumor characteristics and lung tissue properties (assessed with a forced oscillation technique device). © 2021 IEEE.

Keywords: Radiotherapy

DSTFormer: 3D Human Pose Estimation with a Dual-scale Spatial and Temporal Transformer Network

International Conference on Advanced Robotics and Mechatronics (ICARM)

Authors: Shaokun Zhang; Xinde Li; Chuanfei Hu; Jianping Xu; Huaping Liu. School of Automation, Southeast University, Nanjing, China; Key Laboratory of Measurement and Control of Complex Systems of Engineering, Ministry of Education, Nanjing, China; Nanjing Center for Applied Mathematics, Nanjing, China; Southeast University Shenzhen Research Institute, Shenzhen, China; Science and Technology on Information Systems Engineering Laboratory, Nanjing, China; Department of Computer Science and Technology, Tsinghua University, Beijing, China

ISBN: (digital) 9798350385724

ISBN: (print) 9798350385731

Recent transformer-based methods for estimating 3D human pose have gained widespread attention, achieving state-of-the-art results. Previous methods have primarily focused on capturing motion patterns of the human body at a single scale or by cascading multiple scales, such as joints, bones, and body parts. However, they struggle to capture spatial-temporal motion patterns of the human body at different scales simultaneously because of the complexity of these patterns. To address this issue, we propose the Dual-scale Spatial and Temporal transFormer (DSTFormer), which can concurrently explore the spatial dependencies and temporal motion patterns of human joints and bones. Additionally, we introduce a GCN-Spatial Transformer Block (GSTB), which incorporates Graph Convolutional Networks (GCN) into the transformer to enhance the exploitation of local relationships and global information between adjacent joints or bones. Extensive experiments are conducted on the Human3.6M benchmark dataset, and superior results are reported compared with other state-of-the-art methods. More remarkably, our model achieves the best published performance to date, with P1 errors of 37.9 mm and 15.6 mm, respectively.

Keywords: Solid modeling; Three-dimensional displays; Mechatronics; Graph convolutional networks; Computational modeling; Pose estimation; Transformers; Bones; Joints; Robots