Search results - Inner Mongolia University Library

2024 Systems of Signal Synchronization, Generating and Processing in Telecommunications, SYNCHROINFO 2024

Authors: Shevtsov, V.A.; Kazantsev, A.M.; Perlov, A. Yu.; Matseevich, S.V. Affiliations: Moscow Aviation Institute, Moscow, Russia; National Research University of Electronic Technologies, Moscow, Russia; Lomonosov Moscow State University, Faculty of Physics, Moscow, Russia

ISBN: (print) 9798350373448

The article discusses an approach to decomposing a spatially distributed monitoring system (MS) into hierarchical levels. An original method for parameterizing the MS model for subsequent use in machine learning is presented. The article demonstrates that neural network technologies can be used to process data and to form a hierarchical dependence of parameters within the subsystems of the monitoring system, which makes it possible to design its heterogeneous data transmission network from specified requirements. Physics-informed machine learning (PIML) is proposed as a way to increase the accuracy of the neural network by taking physical processes into account. An MS decomposition scheme based on physical equations is presented. © 2024 IEEE.

Keywords: Data handling

Deep embedded lightweight CNN network for indoor objects detection on FPGA

JOURNAL OF PARALLEL AND DISTRIBUTED COMPUTING, 2025, vol. 201

Authors: Afif, Mouna; Ayachi, Riadh; Said, Yahia; Atri, Mohamed. Affiliations: Univ Monastir, Fac Sci Monastir, Lab Condensed Matter & Nanosci, Monastir, Tunisia; Univ Monastir, Fac Sci Monastir, Monastir, Tunisia; Northern Border Univ, Ctr Sci Res & Entrepreneurship, Ar Ar 73213, Saudi Arabia; King Khalid Univ, Coll Comp Sci, Abha, Saudi Arabia

Indoor object detection and recognition is an active research area in computer vision and artificial intelligence. Various deep learning-based techniques can be applied to object detection problems, and the appearance of deep convolutional neural networks (DCNNs) brought a major breakthrough for many applications. Indoor object detection is a primary task that can assist blind and visually impaired (BVI) persons during navigation. However, building a reliable indoor object detection system for edge-device implementations remains a serious challenge. To address this problem, we propose an indoor object detection system based on a DCNN. A cross-stage partial network (CSPNet) was used for the detection process, with a lightweight backbone based on EfficientNet v2. To ensure a lightweight implementation on FPGA devices, various optimization techniques were applied to compress the model size and reduce its computational complexity. The proposed system was implemented on a Xilinx ZCU 102 board. Training and testing experiments were conducted on the proposed indoor objects dataset, which contains 11,000 images covering 25 landmark classes, and on an indoor objects detection dataset. The proposed work achieved 82.60 mAP at 28 FPS for the original version and 80.04 mAP at 35 FPS for the compressed version.

Keywords: Indoor objects detection; Blind and visually impaired; Deep learning; Embedded implementation; VITIS AI; Xilinx ZCU 102

Leveraging the capsule network to learn content text for collaborative filtering

INTERNATIONAL JOURNAL OF COMMUNICATION NETWORKS AND DISTRIBUTED SYSTEMS, 2023, vol. 29, no. 5, pp. 555-572

Authors: Li, Ji; Wang, Suhua. Affiliations: ChangChun Vocat & Tech Coll, Sch Informat, Changchun 130117, Peoples R China; Changchun Humanities & Sci Coll, Changchun 130117, Peoples R China

At present, most building blocks, technologies and frameworks in deep learning are based on convolutional networks. However, some deep learning studies on image processing have shown that the capsule network can be more representational because it can capture various 'posture' changes, including translation, rotation and scaling, and can remember the positional relationships between parts. Despite the intriguing nature of the capsule network and its potential to open up entirely new natural language processing architectures, little work has been done in this area. In this work, we use the capsule network to learn the content text of an item (such as the plot text of a movie or the description document of a product) to obtain a better item representation and achieve more accurate recommendations. We propose 'leveraging the capsule network to learn content text for collaborative filtering (CCCF)'. This model combines the capsule network and neural matrix factorisation to effectively model text data and user-item ratings. Experiments conducted from different perspectives on two popular datasets show that CCCF achieves good performance in common recommendation tasks, which demonstrates the effectiveness of the capsule network in recommendation.

Keywords: capsule network; content text; generalised matrix factorisation; collaborative filtering

AdaKnife: Flexible DNN Offloading for Inference Acceleration on Heterogeneous Mobile Devices

IEEE TRANSACTIONS ON MOBILE COMPUTING, 2025, vol. 24, no. 2, pp. 736-748

Authors: Liu, Sicong; Luo, Hao; Li, XiaoChen; Li, Yao; Guo, Bin; Yu, Zhiwen; Wang, YuZhan; Ma, Ke; Ding, YaSan; Yao, Yuan. Affiliations: Northwestern Polytech Univ, Xian 710072, Shaanxi, Peoples R China; Harbin Engn Univ, Harbin 150009, Peoples R China

The integration of deep neural network (DNN) intelligence into embedded mobile devices is expanding rapidly, supporting a wide range of applications. DNN compression techniques, which adapt models to resource-constrained mobile environments, often force a trade-off between efficiency and accuracy. Distributed DNN inference, leveraging multiple mobile devices, emerges as a promising alternative to enhance inference efficiency without compromising accuracy. However, effectively decoupling DNN models into fine-grained components for optimal parallel acceleration presents significant challenges. Current partitioning methods, including layer-level and operator or channel-level partitioning, provide only partial solutions and struggle with the heterogeneous nature of DNN compilation frameworks, complicating direct model offloading. In response, we introduce AdaKnife, an adaptive framework for accelerated inference across heterogeneous mobile devices. AdaKnife enables on-demand mixed-granularity DNN partitioning via computational graph analysis, facilitates efficient cross-framework model transitions with operator optimization for offloading, and improves the feasibility of parallel partitioning using a greedy operator parallelism algorithm. Our empirical studies show that AdaKnife achieves a 66.5% reduction in latency compared to baselines.

Keywords: Computational modeling; Artificial neural networks; Adaptation models; Mobile handsets; Parallel processing; Computational efficiency; Load modeling; Analytical models; Optimization; Performance evaluation; DNN Offloading; DNN partition; heterogeneous mobile devices

A Novel Domain Adversarial Networks Based on 3D-LSTM and Local Domain Discriminator for Hearing-Impaired Emotion Recognition

IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS, 2023, vol. 27, no. 1, pp. 363-373

Authors: Tian, Zekun; Li, Dahua; Yang, Yi; Hou, Fazheng; Yang, Zhiyi; Song, Yu; Gao, Qiang. Affiliations: Tianjin Univ Technol, Sch Elect Engn & Automat, Tianjin Key Lab Control Theory & Applicat Complica, Tianjin 300384, Peoples R China; Tianjin Univ Technol, Maritime Coll, Tianjin Key Lab Control Theory & Applicat Complica, Tianjin 300384, Peoples R China

Recent research on emotion recognition suggests that deep network-based adversarial learning can address the cross-subject problem of emotion recognition. This study constructed a hearing-impaired electroencephalography (EEG) emotion dataset containing three emotions (positive, neutral, and negative) from 15 subjects. An emotional domain adversarial neural network (EDANN) was used to identify hearing-impaired subjects' emotions by learning hidden emotion information shared between the labeled and unlabeled data. For the input data, we propose a spatial filter matrix to reduce overfitting to the training data. A feature extraction network, 3DLSTM-ConvNET, was used to extract comprehensive emotional information from the time, frequency, and spatial dimensions. Moreover, an emotion local domain discriminator and an emotion film group local domain discriminator were added to reduce the distribution distance between the same kinds of emotions and between different film groups, respectively. According to the experimental results, the average subject-dependent accuracy is 0.984 (STD: 0.011), and the average subject-independent accuracy is 0.679 (STD: 0.140). In addition, by analyzing the discrimination characteristics, we found that the brain regions involved in emotion recognition in hearing-impaired subjects are distributed over wider areas of the parietal and occipital lobes, which may be caused by visual processing.

Keywords: Electroencephalogram (EEG); emotion recognition; hearing-impaired subjects; domain adaptation neural network (DANN)

A Comparison of Spectral and Spatial Graph Convolutional Neural Network Kernels Using GraphSAGE-Sparse

37th IEEE International Parallel and Distributed Processing Symposium (IPDPS)

Authors: Eydenberg, Michael; Plagge, Mark; Rajamanickam, Siva. Affiliations: Sandia Natl Labs, Albuquerque, NM 87123, USA

ISBN: (print) 9798350311990

Graph Convolutional Networks (GCNs) are widely successful architectures for performing deep learning on graphs, but their well-known scalability challenges have led to increased interest in developing both improved algorithms and hardware accelerators. In this paper, we present and evaluate GraphSAGE-Sparse, a variant of the paradigmatic GraphSAGE GCN that replaces the original's spatial-based node convolution operation with a minibatch-aware sparse matrix multiply (SpMM) kernel. We find that this modification substantially reduces the per-batch memory cost for training and inference on a GPU accelerator, with the tradeoff of increased time and memory needed to preprocess the data structures used by the sparse kernel. Comparing both algorithms on datasets from the Open Graph Benchmark, we find that GraphSAGE-Sparse obtains more accurate predictions in less than half of the total training time, even with the additional preprocessing work.

Keywords: graph sampling; graph neural networks (GNNs); graph convolutional networks (GCNs); graph embeddings

Novel approach for forest road maintenance using smartphone sensor data and deep learning methods

INTERNATIONAL JOURNAL OF FOREST ENGINEERING, 2024, vol. 35, no. 3, pp. 507-514

Authors: Heidari, Mohammad Javad; Najafi, Akbar; Borges, Jose G.; Lagoa, Constantino. Affiliations: Tarbiat Modares Univ, Fac Nat Resources & Marine Sci, Tehran ***, Iran; Univ Lisbon, Forest Res Ctr, Sch Agr, Associate Lab TERRA, Lisbon, Portugal; Penn State Univ, Elect Engn Dept, University Pk, PA, USA

High costs are the main challenge for forest managers in planning and executing the repair of forest roads. With budget limitations and inadequate oversight, monitoring the condition of these roads has become essential. While smartphones have proven effective in detecting road defects on public roads, their application on forest roads is hindered by the absence of suitable indices and software infrastructure. Addressing this gap, this research focuses on the development of the Forest Road Pavement Condition Index (FRPCI) to facilitate smartphone-based monitoring. We collected and compared data from 4 kilometers of forest roads, employing two traditional harvesting methods alongside smartphone sensor data. We processed the collected data with deep learning methods, including a Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and CNN-LSTM. Signal processing using GPS data, coupled with wavelet transformation, demonstrated promising results, with accuracy and recall exceeding 80%. The proposed system functions as a distributed information system, transitioning data from organizational mode to field mode. It measures damage, assesses forest road conditions, and leverages image processing and GPS technologies. This monitoring system offers capabilities for preparing, storing, updating, maintaining, and analyzing diverse information. Importantly, adopting this method can significantly reduce operating costs, making forest road monitoring for maintenance purposes more feasible.

Keywords: Deterioration; road distress; artificial intelligence; neural network; CNN-LSTM

Tensorox: Accelerating GPU Applications via Neural Approximation on Unused Tensor Cores

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, 2022, vol. 33, no. 2, pp. 429-443

Authors: Ho, Nhut-Minh; Wong, Weng-Fai. Affiliations: Natl Univ Singapore, Dept Comp Sci, Singapore 119077, Singapore

Driven by the demands of deep learning, many hardware accelerators, including GPUs, have begun to include specialized tensor processing units to accelerate matrix operations. However, general-purpose GPU applications that have few or no large dense matrix operations cannot benefit from these tensor units. This article proposes Tensorox, a framework that exploits the half-precision tensor cores available on recent GPUs for approximable, non-deep-learning applications. In essence, a shallow neural network is trained on the input-output mapping of the function to be approximated. The key innovation in our implementation is the use of the small and dimension-restricted tensor operations in Nvidia GPUs to run multiple instances of the approximation neural network in parallel. With the proper scaling and training methods, our approximation yielded an overall accuracy that is higher than naively running the original programs in half precision. Furthermore, Tensorox allows for runtime adjustment of the degree of approximation. For the 10 benchmarks we tested, we achieved speedups from 2x to 112x compared to the original in single-precision floating point, while keeping the error caused by the approximation below 10 percent in most applications.

Keywords: Hardware; Tensors; Neural networks; Deep learning; Graphics processing units; Task analysis; Training; parallel programming; approximate computing; tensor processing unit; GPGPU

Delayed Algorithms for Distributed Stochastic Weakly Convex Optimization

37th Conference on Neural Information Processing Systems (NeurIPS)

Authors: Gao, Wenzhi; Deng, Qi. Affiliations: Stanford Univ, Stanford, CA 94305, USA; Shanghai Univ Finance & Econ, Shanghai, Peoples R China; SHUFE, Shanghai, Peoples R China

ISBN: (print) 9781713899921

This paper studies delayed stochastic algorithms for weakly convex optimization in a distributed network with workers connected to a master node. Recently, Xu et al. (2022) showed that an inertial stochastic subgradient method converges at a rate of $O(\tau_{\max}/\sqrt{K})$, which depends on the maximum information delay $\tau_{\max}$. In this work, we show that the delayed stochastic subgradient method (DSGD) obtains a tighter convergence rate which depends on the expected delay $\bar{\tau}$. Furthermore, for an important class of composition weakly convex problems, we develop a new delayed stochastic prox-linear (DSPL) method in which the delays only affect the high-order term in the complexity rate and hence are negligible after a certain number of DSPL iterations. In addition, we demonstrate the robustness of our proposed algorithms against arbitrary delays. By incorporating a simple safeguarding step in both methods, we achieve convergence rates that depend solely on the number of workers, eliminating the effect of the delay. Our numerical experiments further confirm the empirical superiority of our proposed methods.

Keywords: Convex optimization

Athena: Add More Intelligence to RMT-Based Network Data Plane with Low-Bit Quantization

30th European Conference on Parallel and Distributed Processing (Euro-Par)

Authors: Liao, Yunkun; Lin, Hanyue; Wu, Jingya; Lu, Wenyan; Li, Huawei; Li, Xiaowei; Yan, Guihai. Affiliations: Chinese Acad Sci, State Key Lab Processors, Inst Comp Technol, Beijing, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China; Zhongguancun Lab, Beijing, Peoples R China; YUSUR Tech Co Ltd, Beijing, Peoples R China

ISBN: (print) 9783031697654; 9783031697661

Performing per-packet neural network (NN) inference on the network data plane is a promising approach for accurate and fast decision-making in computer networks. However, data plane architectures such as the Reconfigurable Match Tables (RMT) pipeline have limited support for NN computation. Previous efforts have used the Binary Neuron Network (BNN) as a compromise, but the accuracy loss of BNNs is high. Inspired by the accuracy gain of low-bit (2-bit and 4-bit) models, this paper proposes Athena, which can deploy sparse low-bit quantized models on RMT. Compared with the BNN-based state of the art, Athena achieves a new Pareto frontier in model accuracy and inference latency.

Keywords: neural network RMT Pipeline Quantization Pruning
