Search results - Inner Mongolia University Library

Informatica (Slovenia), 2025, vol. 49, no. 5, pp. 37-48

Authors: Remmach, Hassnae; Razak, Siti Fatimah Abdul; Ullah, Arif; Yogarayan, Sumendra; Sayeed, Md Shohel; Mrhari, Amine. Affiliations: Computer Systems Engineering Laboratory, Cadi Ayyad University, Marrakesh 40000, Morocco; Faculty of Information Science and Technology, Multimedia University, Ayer Keroh 75450, Malaysia; Research in Computer Science Laboratory, Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco

Three-dimensional reconstruction is essential in a number of industries, including computer graphics, robotics, and medical imaging. This research proposes a CNN-based Multi-Output and Multi-Task Regressor with deep learning capabilities for three-dimensional object reconstruction from 3D point clouds. The approach is grounded in the original PointNet architecture, which addresses the difficulties of applying convolution to point clouds. First, PointNet is modified with a Multi-Output Regressor to accurately recreate supershapes from 3D point clouds: PointNet extracts features from the 3D point cloud, and the Multi-Output Regressor takes these features and simultaneously predicts the supershape parameters needed to reconstruct the shape. Second, PointNet is modified with a Multi-Task Regressor, so the network gains the capacity to transfer knowledge from one task to another, improving the model's overall performance. For supershape reconstruction, the model predicts the ten parameters needed to create the shape. The test findings were better than expected; they are intriguing in terms of prediction accuracy and cost, and they improve the result by 80%, which is a good accomplishment for the study. © 2025 Slovene Society Informatika. All rights reserved.

Keywords: 3D reconstruction
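The pipeline described in the abstract above (per-point feature extraction, a global feature, then a regressor that emits all shape parameters at once) can be illustrated with a minimal pure-Python sketch. This is not the authors' network: the layer sizes, the random weights `W` and `H`, and the function names are all hypothetical, and only the output count of ten comes from the abstract. The sketch shows the one property PointNet-style models rely on, namely that a symmetric max-pool over points makes the prediction independent of point order.

```python
import random

def shared_mlp(point, weights):
    """Apply the same linear layer + ReLU to a single (x, y, z) point."""
    return [max(0.0, sum(w * x for w, x in zip(row, point))) for row in weights]

def global_feature(points, weights):
    """Max-pool per-point features channel by channel (order-independent)."""
    feats = [shared_mlp(p, weights) for p in points]
    return [max(channel) for channel in zip(*feats)]

def multi_output_head(feature, head_weights):
    """Linear head that predicts every output parameter in one pass."""
    return [sum(w * f for w, f in zip(row, feature)) for row in head_weights]

# Hypothetical, untrained weights: 3 input coords -> 16 features -> 10 outputs.
_rng = random.Random(0)
W = [[_rng.uniform(-1, 1) for _ in range(3)] for _ in range(16)]
H = [[_rng.uniform(-1, 1) for _ in range(16)] for _ in range(10)]

def predict(points):
    """Map a point cloud (list of [x, y, z]) to ten regression outputs."""
    return multi_output_head(global_feature(points, W), H)
```

Calling `predict` on a point cloud and on the same cloud in a different order gives identical outputs, because the max-pool ignores ordering. A trained model would replace the random weights with learned ones.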


Multi-task learning and joint refinement between camera localization and object detection


Computational Visual Media, 2024, vol. 10, no. 5, pp. 993-1011

Authors: Junyi Wang, Yue Qi. Affiliations: State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Peng Cheng Laboratory, Shenzhen 518052, China; Qingdao Research Institute of Beihang University, Qingdao 266104, China; School of Computer Science and Technology, Shandong University, Qingdao, China

Visual localization and object detection both play important roles in various *** many indoor application scenarios where some detected objects have fixed positions, the two techniques work closely ***, few researchers consider these two tasks simultaneously, because of a lack of datasets and the little attention paid to such *** this paper, we explore multi-task network design and joint refinement of detection and *** address the dataset problem, we construct a medium indoor scene of an aviation exhibition hall through a semi-automatic *** dataset provides localization and detection information, and is publicly available at https://***/drive/folders/1U28zk0N4_I0db zkqyIAK1A15k9oUKOjI?usp=sharing for benchmarking localization and object detection *** this dataset, we have designed a multi-task network, JLDNet, based on YOLO v3, that outputs a target point cloud and object bounding *** dynamic environments, the detection branch also promotes the perception of *** includes image feature learning, point feature learning, feature fusion, detection construction, and point cloud ***, object-level bundle adjustment is used to further improve localization and detection *** test JLDNet and compare it to other methods, we have conducted experiments on 7 static scenes, our constructed dataset, and the dynamic TUM RGB-D and Bonn *** results show state-of-the-art accuracy for both tasks, and the benefit of jointly working on both tasks is demonstrated.

Keywords: visual localization; object detection; joint optimization; multi-task learning


ViGT: proposal-free video grounding with a learnable token in the transformer


Science China (Information Sciences), 2023, vol. 66, no. 10, pp. 196-212

Authors: Kun LI, Dan GUO, Meng WANG. Affiliations: School of Computer Science and Information Engineering, Hefei University of Technology; Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education; Intelligent Interconnected Systems Laboratory of Anhui Province; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

The video grounding (VG) task aims to locate the queried action or event in an untrimmed video based on rich linguistic descriptions. Existing proposal-free methods are trapped in the complex interaction between video and query, overemphasizing cross-modal feature fusion and feature correlation for VG. In this paper, we propose a novel boundary regression paradigm that performs regression token learning in a transformer. Particularly, we present a simple but effective proposal-free framework, namely video grounding transformer (ViGT), which predicts the temporal boundary using a learnable regression token rather than multi-modal or cross-modal features. In ViGT, the benefits of a learnable token are manifested as follows. (1) The token is unrelated to the video or the query and avoids data bias toward the original video and query. (2) The token simultaneously performs global context aggregation from video and query ***, we employed a shared feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality. Furthermore, we concatenated a learnable regression token [REG] with the video and query features as the input of a vision-language transformer. Finally, we utilized the token [REG] to predict the target moment and visual features to constrain the foreground and background probabilities at each timestamp. The proposed ViGT performed well on three public datasets: ANet-Captions, TACoS, and YouCookⅡ. Extensive ablation studies and qualitative analysis further validated the interpretability of ViGT.

Keywords: video grounding; temporal sentence grounding; boundary regression; token learning; proposal-free


Learning cross-modal interaction for RGB-T tracking


Science China (Information Sciences), 2023, vol. 66, no. 1, pp. 320-321

Authors: Chunyan XU, Zhen CUI, Chaoqun WANG, Chuanwei ZHOU, Jian YANG. Affiliations: Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, School of Computer Science and Engineering, Nanjing University of Science and Technology

Dear editor, Visual object tracking, which has attracted increasing attention in the field of general visual understanding, aims to track each temporally changing object in a video sequence, with the target specified only in the first *** most tracking algorithms have facilitated significant advances in RGB video sequences, object tracking using only RGB information is unreliable under extreme lighting conditions (e.g., dark night, rain, and fog).


UKF‐MOT: An unscented Kalman filter‐based 3D multi‐object tracker


CAAI Transactions on Intelligence Technology, 2024, vol. 9, no. 4, pp. 1031-1041

Authors: Meng Liu, Jianwei Niu, Yu Liu. Affiliations: Collective Intelligence & Collaboration Laboratory, China North Artificial Intelligence and Innovation Research Institute, Beijing, China; China North Vehicle Research Institute, Beijing, China; State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing, China; State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing, China; School of Computer Science and Engineering, Beihang University, Beijing, China

Multi‐object tracking in autonomous driving is a non‐linear *** better address the tracking problem, this paper leveraged an unscented Kalman filter to predict the object's *** the association stage, the Mahalanobis distance was employed as an affinity metric, and a Non‐minimum Suppression method was designed for *** the detections fed into the tracker and continuous ‘predicting‐matching’ steps, the states of each object at different time steps were described as their own continuous *** conducted extensive experiments to evaluate tracking accuracy on three challenging datasets (KITTI, nuScenes and Waymo). The experimental results demonstrated that our method effectively achieved multi‐object tracking with satisfactory accuracy and real‐time efficiency.

Keywords: autonomous vehicle transportation
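The abstract above names the Mahalanobis distance as the affinity metric between detections and predicted tracks. The following is a minimal sketch of that metric only, restricted to 2D positions so the covariance inverse has a closed form. The function names and the 2D restriction are assumptions made for illustration; the paper's unscented Kalman filter and its Non‐minimum Suppression matching step are not reproduced here.

```python
import math

def mahalanobis_2d(z, z_pred, S):
    """Mahalanobis distance between a detection z and a predicted
    measurement z_pred, given a 2x2 innovation covariance S."""
    dx = z[0] - z_pred[0]
    dy = z[1] - z_pred[1]
    (a, b), (c, d) = S
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance must be positive definite")
    # S^-1 = (1/det) * [[d, -b], [-c, a]]; evaluate v^T S^-1 v directly.
    q = (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det
    return math.sqrt(q)

def affinity_matrix(detections, predictions, covariances):
    """Rows are detections, columns are tracks; smaller means more similar."""
    return [
        [mahalanobis_2d(z, p, S) for p, S in zip(predictions, covariances)]
        for z in detections
    ]
```

With an identity covariance the metric reduces to Euclidean distance. A larger variance along one axis shrinks the distance along that axis, which is why an uncertain track tolerates a larger positional offset before a match is rejected.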


Angle robust transmitted plasmonic colors with different surroundings utilizing localized surface plasmon resonance


Chinese Physics B, 2023, vol. 32, no. 7, pp. 226-233

Authors: Xufeng Gao (高旭峰), Qi Wang (王琦), Shijie Zhang (张世杰), Ruijin Hong (洪瑞金), Dawei Zhang (张大伟). Affiliations: Shanghai Key Laboratory of Modern Optic Systems; Engineering Research Center of Optical Instrument and System, Ministry of Education and Shanghai Key Laboratory of Modern Optical Systems, School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

Color filters in different surroundings inherently suffer from angular sensitivity, which hinders their practical ***, we present an angle-insensitive plasmonic filter that can produce different color responses to different surrounding *** color filters are based on a two-dimensional periodically and randomly distributed silver nanodisk array on a silica *** proposed plasmonic color filters not only produce bright colors by altering the diameter of the Ag nanodisk, but also achieve continuous color palettes by changing the surrounding *** to the weak coupling between the metallic nanodisks, the plasmonic color filters can enable good incident angle-insensitive properties (up to 30°). The strategy presented here could exhibit robust and promising applicability in anti-counterfeiting and imaging technologies.

Keywords: plasmonic color filter; color sensing; high angular tolerance


An Intelligent Privacy Protection Scheme for Efficient Edge Computation Offloading in IoV


Chinese Journal of Electronics, 2024, vol. 33, no. 4, pp. 910-919

Authors: Liang YAO, Xiaolong XU, Wanchun DOU, Muhammad Bilal. Affiliations: School of Software, Nanjing University of Information Science and Technology; State Key Laboratory for Novel Software Technology, Nanjing University; Department of Computer and Electronics Systems Engineering, Hankuk University of Foreign Studies

As a pivotal enabler of intelligent transportation systems (ITS), the Internet of vehicles (IoV) has aroused extensive attention from academia and industry. The exponential growth of computation-intensive, latency-sensitive, and privacy-aware vehicular applications in IoV results in the transformation from cloud computing to edge computing, which enables tasks to be offloaded to edge nodes (ENs) closer to vehicles for efficient execution. In an ITS environment, however, due to dynamic and stochastic computation offloading requests, it is challenging to efficiently orchestrate offloading decisions for application requirements. How to accomplish complex computation offloading of vehicles while ensuring data privacy remains challenging. In this paper, we propose an intelligent computation offloading with privacy protection scheme, named COPP. In particular, an Advanced Encryption Standard-based encryption method is utilized to implement privacy protection. Furthermore, an online offloading scheme is proposed to find optimal offloading policies. Finally, experimental results demonstrate that COPP significantly outperforms benchmark schemes in the performance of both delay and energy consumption.

Keywords: Industries; Privacy; Energy consumption; Transportation; Computational efficiency; Encryption; Protection


Towards Green AI by Reducing Training Effort of Recurrent Neural Networks Using Hyper-Parameter Optimization with Dynamic Stopping Criteria


22nd IEEE International Symposium on Intelligent Systems and Informatics, SISY 2024

Authors: Podgorelec, Vili; Fister, Iztok; Vrbancic, Grega. Affiliations: University of Maribor, Intelligent Systems Laboratory, Faculty of Electrical Engineering and Computer Science, Slovenia

ISBN: (print) 9798350385601

Neural networks have become a leading model in modern machine learning, able to model even the most complex data. For them to be properly trained, however, a lot of computational resources are required. With the carbon footprint of the ever-growing adoption of neural networks in mind, an approach to reduce the required training resources would be very welcome. We designed a new training effort reduction method based on the calculation of the area under the normalized loss curve and assessed it on the electricity consumption forecasting problem with recurrent neural networks. The results show that the proposed method was able to considerably reduce the amount of computational resources while maintaining the predictive performance, thus contributing towards Green AI. © 2024 IEEE.

Keywords: Green AI; hyper-parameter optimization; machine learning; neural networks; resource optimization
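The abstract above bases its method on the area under the normalized loss curve but does not say how the curve is normalized or how the area feeds the stopping decision. The sketch below shows one plausible way to compute such an area, under two loudly stated assumptions: losses are divided by the first recorded loss, and the epoch axis is rescaled to [0, 1]. The function name is hypothetical, and no stopping rule is implemented because the source does not describe one.

```python
def normalized_loss_auc(losses):
    """Trapezoidal area under a loss curve after normalization.

    Assumptions (not from the paper): each loss is divided by the first
    loss, and the epoch axis is rescaled to the interval [0, 1].
    """
    if len(losses) < 2:
        raise ValueError("need at least two loss values")
    first = losses[0]
    if first <= 0:
        raise ValueError("first loss must be positive")
    norm = [value / first for value in losses]
    step = 1.0 / (len(norm) - 1)  # equal spacing on the rescaled axis
    return sum((norm[i] + norm[i + 1]) / 2.0 * step
               for i in range(len(norm) - 1))
```

Under these assumptions a run whose loss never improves has an area of 1, and a run that converges quickly has a smaller area than one that converges slowly. That gives a single number for comparing hyper-parameter configurations early in training.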


A survey for light field super-resolution


High-Confidence Computing, 2024, vol. 4, no. 1, pp. 118-129

Authors: Mingyuan Zhao, Hao Sheng, Da Yang, Sizhe Wang, Ruixuan Cong, Zhenglong Cui, Rongshan Chen, Tun Wang, Shuai Wang, Yang Huang, Jiahao Shen. Affiliations: State Key Laboratory of Virtual Reality Technology and Systems, School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Key Laboratory of Data Science and Intelligent Computing, International Innovation Institute, Beihang University, Hangzhou 311115, China; State Key Laboratory of Software Development Environment, School of Computer Science and Engineering, Beihang University, Beijing 100191, China

Compared to 2D imaging data, the 4D light field (LF) data retains richer scene structure information, which can significantly improve the computer's perception capability, including depth estimation, semantic segmentation, and LF ***, there is a contradiction between spatial and angular resolution during the LF image acquisition *** overcome the above problem, researchers have gradually focused on the light field super-resolution (LFSR). In the traditional solutions, researchers achieved the LFSR based on various optimization frameworks, such as Bayesian and Gaussian *** learning-based methods are more popular than conventional methods because they have better performance and more robust generalization *** this paper, the present approaches can mainly be divided into conventional methods and deep learning-based *** discuss these two branches in light field spatial super-resolution (LFSSR), light field angular super-resolution (LFASR), and light field spatial and angular super-resolution (LFSASR), ***, this paper also introduces the primary public datasets and analyzes the performance of the prevalent approaches on these ***, we discuss the potential innovations of the LFSR to propose the progress of our research field.

Keywords: Light field super-resolution; Convolutional neural network; Transformer; Sub-aperture image; Epipolar-plane image


Mixed-decomposed convolutional network: A lightweight yet efficient convolutional neural network for ocular disease recognition


CAAI Transactions on Intelligence Technology, 2024, vol. 9, no. 2, pp. 319-332

Authors: Xiaoqing Zhang, Xiao Wu, Zunjie Xiao, Lingxi Hu, Zhongxi Qiu, Qingyang Sun, Risa Higashita, Jiang Liu. Affiliations: Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China; Tomey Corporation, Nagoya, Japan; Guangdong Provincial Key Laboratory of Brain‐inspired Intelligent Computation, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, China; Singapore Eye Research Institute, Singapore, Singapore

Eye health has become a global health concern and attracted broad *** the years, researchers have proposed many state-of-the-art convolutional neural networks (CNNs) to assist ophthalmologists in diagnosing ocular diseases efficiently and ***, most existing methods were dedicated to constructing sophisticated CNNs, inevitably ignoring the trade-off between performance and model *** alleviate this paradox, this paper proposes a lightweight yet efficient network architecture, mixed-decomposed convolutional network (MDNet), to recognise ocular *** MDNet, we introduce a novel mixed-decomposed depthwise convolution method, which takes advantage of depthwise convolution and depthwise dilated convolution operations to capture low-resolution and high-resolution patterns by using fewer computations and fewer *** conduct extensive experiments on the clinical anterior segment optical coherence tomography (AS-OCT), LAG, University of California San Diego, and CIFAR-100 *** results show our MDNet achieves a better trade-off between the performance and model complexity than efficient CNNs including MobileNets and ***, our MDNet outperforms MobileNets by 2.5% in accuracy while using 22% fewer parameters and 30% fewer computations on the AS-OCT dataset.

Keywords: artificial intelligence; deep learning; deep neural networks; image analysis; image classification; medical applications; medical image processing
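The abstract above builds on depthwise and depthwise dilated convolutions to cut parameters and computation. The sketch below does not reproduce MDNet's mixed-decomposed operation, which the abstract does not specify. It only works through the standard arithmetic behind the two building blocks: how many weights a depthwise plus pointwise pair needs compared with a standard convolution, and how dilation widens the receptive field without adding weights. Bias terms are ignored, and the function names are chosen for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k

def depthwise_conv_params(channels, k):
    """Weights in a depthwise k x k convolution: one filter per channel.
    Dilation does not change this count."""
    return channels * k * k

def pointwise_conv_params(c_in, c_out):
    """Weights in a 1 x 1 convolution that mixes channels."""
    return c_in * c_out

def effective_kernel_size(k, dilation):
    """Spatial extent covered by a k x k kernel with the given dilation."""
    return k + (k - 1) * (dilation - 1)
```

For 64 input channels, 128 output channels, and a 3 x 3 kernel, the standard convolution needs 73,728 weights, while the depthwise plus pointwise pair needs 576 + 8,192 = 8,768. A 3 x 3 depthwise kernel with dilation 2 covers a 5 x 5 window with the same 576 weights.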
