Search Results - Inner Mongolia University Library

MKD-YOLO: Multi-Scale and Knowledge-Distilling YOLO for Efficient PPE Compliance Detection

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Juntao Zan, Yang Fang, Qilie Liu, Uswah Khairuddin, Yan Li, Kaiwei Sun

Affiliations: School of Communications and Information Engineering, Chongqing University of Posts and Telecommunications, Chongqing, China; Chongqing Key Laboratory of Image Cognition, Chongqing University of Posts and Telecommunications, Chongqing, China; Malaysia-Japan International Institute of Technology, University of Technology Malaysia, Kuala Lumpur, Malaysia; Department of Electrical and Computer Engineering, Inha University, Incheon, South Korea; Key Laboratory of Data Engineering and Visual Computing, Chongqing University of Posts and Telecommunications, Chongqing, China

ISBN: (digital) 9798350368741

ISBN: (print) 9798350368758

YOLO-based models are widely used for personal protective equipment (PPE) compliance detection due to their excellent detection performance and efficiency. However, most YOLO models are poorly suited to detection tasks in complex industrial scenarios, such as remote surveillance and extremely small targets. In addition, effective model lightweighting and knowledge transfer approaches for industrial deployment are lacking. To this end, this paper proposes a Multi-scale and Knowledge-Distilling YOLO (MKD-YOLO) based on YOLOv8n for efficient PPE compliance detection. Specifically, in the backbone stage, we design an Efficient Multi-Scale Enhanced Convolution (C2f-EMSEC) module and a Large Spatial Pyramid Pooling-Fast (LSPPF) module for multi-scale and global-contextual feature learning as well as for reducing model complexity. Then, in the neck stage, a refined Bidirectional Feature Pyramid Network (BPNet) is designed to capture fine-grained details for extremely small object detection. Moreover, we apply channel-wise knowledge distillation to facilitate model lightweighting and domain-specific knowledge transfer. Experiments on our proposed dataset and public datasets show that the proposed MKD-YOLO achieves new state-of-the-art (SOTA) detection performance and efficiency for practical PPE compliance detection tasks. Code and the dataset are available at https://***/z1Zjt/MKD-YOLO.

Keywords: Personal protective equipment; YOLO; Representation learning; Convolution; Surveillance; Speech enhancement; Feature extraction; Neck; Kernel; Knowledge transfer
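
The channel-wise knowledge distillation mentioned in the abstract above can be illustrated with a small NumPy sketch. This is a generic formulation, not the authors' implementation: each channel of a feature map is turned into a probability distribution over spatial positions with a temperature-scaled softmax, and the student is penalized by the KL divergence from the teacher's distribution. The function names, the temperature value `tau=4.0`, and the random feature maps are all assumptions made for illustration.

```python
import numpy as np


def spatial_softmax(feat: np.ndarray, tau: float) -> np.ndarray:
    """Softmax over the spatial positions of each channel; feat has shape (C, H, W)."""
    channels = feat.shape[0]
    logits = feat.reshape(channels, -1) / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


def channel_wise_kd_loss(student: np.ndarray, teacher: np.ndarray, tau: float = 4.0) -> float:
    """Mean per-channel KL(teacher || student), scaled by tau squared (assumed convention)."""
    p_teacher = spatial_softmax(teacher, tau)
    p_student = spatial_softmax(student, tau)
    eps = 1e-12  # avoids log(0)
    kl_per_channel = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)), axis=1
    )
    return float(tau ** 2 * kl_per_channel.mean())


# Hypothetical feature maps: 8 channels on a 16x16 grid.
rng = np.random.default_rng(0)
teacher_feat = rng.normal(size=(8, 16, 16))
student_feat = teacher_feat + 0.5 * rng.normal(size=(8, 16, 16))

loss_same = channel_wise_kd_loss(teacher_feat, teacher_feat)
loss_diff = channel_wise_kd_loss(student_feat, teacher_feat)
print(loss_same, loss_diff)
```

The loss is zero when the student reproduces the teacher's feature map and grows as the per-channel spatial distributions diverge.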

Deploying Hybrid MAELM Approach for Human Emotion Detection Through Speech and Facial Expressions

International Conference on Innovative Mechanisms for Industry Applications (ICIMIA)

Authors: K Senthil Kumar, S. Rukmani Devi, Nidhi Ranjan, Gitika Rath, G. Indira, Neerav Nishant

Affiliations: Department of Networking and Communications, School of Computing, College of Engineering and Technology, SRM Institute of Science and Technology, Chennai, India; Department of Computer Science, Saveetha College of Liberal Arts and Sciences, SIMATS Deemed to be University, Chennai, India; Computer Engineering, Vasantdada Patil Pratishthan’s College of Engineering and Visual Arts, Mumbai, India; Department of CSE, Marri Laxman Reddy Institute of Technology and Management, Hyderabad, India; Department of Electrical and Electronics Engineering, Prince Shri Venkateshwara Padmavathy Engineering College, Chennai, India; Department of Computer Science and Engineering, School of Engineering, Babu Banarasi Das University, Lucknow, India

Emotion recognition is one of the most fascinating emerging fields of research and is useful in many different contexts. Some of the most exciting applications of this technology involve robots’ ability to see and communicate with one another. Human emotions can be recognized through both visual and audible cues, and facial expressions are among the most accurate indicators of a person’s emotional state. Data preprocessing, feature extraction, and model training are the first three steps of the proposed methodology. Preprocessing comprises image resizing, median filtering, noise removal, and histogram equalization. Gaussian mixture models and the gray-level co-occurrence matrix are used for feature extraction. The extracted features are then used to train models with MA-ELM. The proposed method outperforms PCA and ELM, two of the most popular alternatives.

Keywords:
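
Two of the preprocessing steps named in the abstract above, median filtering and histogram equalization, can be sketched in plain NumPy as follows. This is an illustrative sketch only: the 3x3 window, edge padding, 8-bit grayscale input, and all function names are assumptions, and the paper's actual pipeline may differ.

```python
import numpy as np


def median_filter3(img: np.ndarray) -> np.ndarray:
    """3x3 median filter on a 2-D image, with edge padding (assumed window size)."""
    height, width = img.shape
    padded = np.pad(img, 1, mode="edge")
    # Stack the nine shifted views so the median can be taken per pixel.
    windows = np.stack(
        [padded[dy:dy + height, dx:dx + width] for dy in range(3) for dx in range(3)]
    )
    return np.median(windows, axis=0).astype(img.dtype)


def equalize_histogram(img: np.ndarray) -> np.ndarray:
    """Histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = img.size - cdf_min
    if denom == 0:
        return img.copy()  # constant image: nothing to stretch
    lut = np.round((cdf - cdf_min) / denom * 255).clip(0, 255).astype(np.uint8)
    return lut[img]


# A flat image with one "salt" pixel: the median filter removes the outlier.
noisy = np.full((5, 5), 100, dtype=np.uint8)
noisy[2, 2] = 255
denoised = median_filter3(noisy)

# A low-contrast image occupying only gray levels 100..149.
low_contrast = np.arange(100, 150).repeat(2).reshape(10, 10).astype(np.uint8)
equalized = equalize_histogram(low_contrast)
print(denoised.max(), equalized.min(), equalized.max())
```

After equalization the gray levels span the full 8-bit range, which is the contrast-stretching effect that preprocessing of this kind aims for.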

An Adaptive Column Compression Family for Self-Driving Databases

arXiv, 2022

Authors: Fehér, Marcell; Lucani, Daniel E.; Chatzigeorgiou, Ioannis

Affiliations: Agile Cloud Lab, Department of Electrical and Computer Engineering, DIGIT, Aarhus University, Aarhus, Denmark; School of Computing and Communications, Lancaster University, Lancaster, United Kingdom

Modern in-memory databases are typically used for high-performance workloads; they therefore have to be optimized for a small memory footprint and high query speed at the same time. Data compression has the potential to reduce memory requirements but often reduces query speed too. In this paper, we propose a novel, adaptive compressor that offers a new trade-off point between these dimensions, achieving better compression than LZ4 while reaching query speeds close to the fastest existing segment encoders. We evaluate our compressor both with synthetic data in isolation and on the TPC-H and Join Order Benchmarks, integrated into a modern relational column store, Hyrise. Copyright © 2022, The Authors. All rights reserved.

Keywords: Economic and social effects

A Joint Communication and Computation Framework for Digital Twin over Wireless Networks

arXiv, 2024

Authors: Yang, Zhaohui; Chen, Mingzhe; Liu, Yuchen; Zhang, Zhaoyang

Affiliations: College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, Zhejiang 310027, China; Hangzhou, Zhejiang 310007, China; Zhejiang Lab, Hangzhou, Zhejiang 311100, China; Department of Electrical and Computer Engineering, Institute for Data Science and Computing, University of Miami, Coral Gables, FL 33146, United States; Department of Computer Science, North Carolina State University, Raleigh, NC 27606, United States

In this paper, the problem of low-latency communication and computation resource allocation for digital twin (DT) over wireless networks is investigated. In the considered model, multiple physical devices in the physical network (PN) need to frequently offload computation-task-related data to the digital network twin (DNT), which is generated and controlled by the central server. Due to the limited energy budget of the physical devices, both computation accuracy and wireless transmission power must be considered during the DT procedure. This joint communication and computation problem is formulated as an optimization problem whose goal is to minimize the overall transmission delay of the system under total PN energy and DNT model accuracy constraints. To solve this problem, an alternating algorithm that iteratively solves the device scheduling, power control, and data offloading subproblems is proposed. For the device scheduling subproblem, the optimal solution is obtained in closed form through the dual method. For the special case with one physical device, the optimal number of transmission times is revealed. Based on the theoretical findings, the original problem is transformed into a simplified problem and the optimal device scheduling can be found. Numerical results verify that the proposed algorithm can reduce the transmission delay of the system by up to 51.2% compared to the conventional schemes. © 2024, CC BY-SA.

Keywords: Budget control

ScoreHypo: Probabilistic Human Mesh Estimation with Hypothesis Scoring

Conference on Computer Vision and Pattern Recognition (CVPR)

Authors: Yuan Xu, Xiaoxuan Ma, Jiajun Su, Wentao Zhu, Yu Qiao, Yizhou Wang

Affiliations: Center on Frontiers of Computing Studies, School of Computer Science, Peking University; International Digital Economy Academy (IDEA); School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Inst. for Artificial Intelligence, Peking University; Nat'l Eng. Research Center of Visual Technology; Nat'l Key Lab of General Artificial Intelligence

ISBN: (digital) 9798350353006

ISBN: (print) 9798350353013

Monocular 3D human mesh estimation is an ill-posed problem, characterized by inherent ambiguity and occlusion. While recent probabilistic methods propose generating multiple solutions, little attention is paid to obtaining high-quality estimates from them. To address this limitation, we introduce ScoreHypo, a versatile framework by first leveraging our novel HypoNet to generate multiple hypotheses, followed by employing a meticulously designed scorer, ScoreNet, to evaluate and select high-quality estimates. ScoreHypo formulates the estimation process as a reverse denoising process, where HypoNet produces a diverse set of plausible estimates that effectively align with the image cues. Subsequently, ScoreNet is employed to rigorously evaluate and rank these estimates based on their quality and finally identify superior ones. Experimental results demonstrate that HypoNet outperforms existing state-of-the-art probabilistic methods as a multi-hypothesis mesh estimator. Moreover, the estimates selected by ScoreNet significantly outperform random generation or simple averaging. Notably, the trained ScoreNet exhibits generalizability, as it can effectively score existing methods and significantly reduce their errors by more than 15%. Code and models are available at https://xy02-05.github.io/ScoreHypo.

Keywords: Computer vision; Three-dimensional displays; Accuracy; Codes; Noise reduction; Estimation; Benchmark testing

A Survey on Decentralized Identifiers and Verifiable Credentials

arXiv, 2024

Authors: Mazzocca, Carlo; Acar, Abbas; Uluagac, Selcuk; Montanari, Rebecca; Bellavista, Paolo; Conti, Mauro

Affiliations: Department of Information and Electrical Engineering and Applied Mathematics, University of Salerno, Fisciano 84084, Italy; Department of Computer Science and Engineering, University of Bologna, Bologna 40136, Italy; Cyber-Physical Systems Security Lab, School of Computing and Information Science, Florida International University, Miami, FL 33174, United States; Department of Mathematics, University of Padua, Padua 35131, Italy

Digital identity has always been one of the keystones for implementing secure and trustworthy communications among parties. The ever-evolving digital landscape has undergone numerous technological transformations that have profoundly reshaped digital identity management, leading to a major shift from centralized to decentralized identity models. The latest stage of this evolution is represented by the emerging paradigm of Self-Sovereign Identity (SSI), which gives identity owners full control over their data. SSI leverages Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs), which have recently been standardized by the World Wide Web Consortium (W3C). These technologies have the potential to build more secure and decentralized digital identity systems, significantly strengthening communication security in scenarios involving many distributed participants. It is worth noting that the use of DIDs and VCs is not limited to individuals but extends to a wide range of entities, including cloud, edge, and Internet of Things (IoT) resources. However, due to their novelty, the existing literature lacks a comprehensive survey on DIDs and VCs beyond the scope of SSI. This paper fills this gap by providing a comprehensive overview of DIDs and VCs from multiple perspectives. It identifies key security threats and mitigation strategies, analyzes available implementations to guide practitioners in making informed decisions, and reviews the adoption of these technologies across various application domains. It also examines related regulations, projects, and consortiums emerging worldwide. Finally, it discusses the primary challenges hindering their real-world adoption and outlines future research directions. © 2024, CC BY.

Keywords: Internet of things
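
For readers unfamiliar with the Decentralized Identifiers discussed in the abstract above, the following sketch shows how a DID string splits into its parts. A DID has the form `did:<method-name>:<method-specific-id>`, with a method name made of lowercase letters and digits. The regular expression below is a simplified approximation of the W3C grammar, and the `ParsedDid` type and `parse_did` helper are hypothetical names introduced only for this example.

```python
import re
from typing import NamedTuple, Optional

# Simplified approximation of the DID syntax: lowercase/digit method name,
# then an identifier of idchars, percent-encoded bytes, and ":" separators
# that must end in a non-colon character.
_IDCHAR = r"(?:[A-Za-z0-9._-]|%[0-9A-Fa-f]{2})"
DID_PATTERN = re.compile(rf"^did:([a-z0-9]+):((?:{_IDCHAR}|:)*{_IDCHAR})$")


class ParsedDid(NamedTuple):
    method: str
    method_specific_id: str


def parse_did(did: str) -> Optional[ParsedDid]:
    """Return the method and method-specific id, or None if the string is not DID-shaped."""
    match = DID_PATTERN.match(did)
    if match is None:
        return None
    return ParsedDid(method=match.group(1), method_specific_id=match.group(2))


example = parse_did("did:example:123456789abcdefghi")
nested = parse_did("did:web:example.com:user:alice")
bad_method = parse_did("did:Example:abc")  # uppercase method name is rejected
empty_id = parse_did("did:example:")       # missing identifier is rejected
print(example, nested, bad_method, empty_id)
```

Resolving a parsed DID to its DID document depends on the specific DID method and is outside the scope of this sketch.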

Learning Photometric Feature Transform for Free-form Object Scan

arXiv, 2023

Authors: Feng, Xiang; Kang, Kaizhang; Pei, Fan; Ding, Huakeng; You, Jinjiang; Tan, Ping; Zhou, Kun; Wu, Hongzhi

Affiliations: State Key Lab of CAD & CG, Zhejiang University, Hangzhou 310058, China; Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong; Visual Computing Center, King Abdullah University of Science and Technology, Thuwal 23955-6900, Saudi Arabia

We propose a novel framework to automatically learn to aggregate and transform photometric measurements from multiple unstructured views into spatially distinctive and view-invariant low-level features, which are subsequently fed to a multi-view stereo pipeline to enhance 3D reconstruction. The illumination conditions during acquisition and the feature transform are jointly trained on a large amount of synthetic data. We further build a system to reconstruct both the geometry and anisotropic reflectance of a variety of challenging objects from hand-held scans. The effectiveness of the system is demonstrated with a lightweight prototype, consisting of a camera and an array of LEDs, as well as an off-the-shelf tablet. Our results are validated against reconstructions from a professional 3D scanner and photographs, and compare favorably with state-of-the-art techniques. Copyright © 2023, The Authors. All rights reserved.

Keywords: Stereo image processing

FOUNDATIONAL MODELS IN MEDICAL IMAGING: A COMPREHENSIVE SURVEY AND FUTURE VISION

arXiv, 2023

Authors: Azad, Bobby; Azad, Reza; Eskandari, Sania; Bozorgpour, Afshin; Kazerouni, Amirhossein; Rekik, Islem; Merhof, Dorit

Affiliations: Electrical Engineering and Computer Science Department, South Dakota State University, Brookings, United States; Faculty of Electrical Engineering and Information Technology, RWTH Aachen University, Aachen, Germany; Department of Electrical Engineering, University of Kentucky, Lexington, United States; Faculty of Informatics and Data Science, University of Regensburg, Regensburg, Germany; BASIRA Lab, Imperial-X and Computing Department, Imperial College London, London, United Kingdom; School of Electrical Engineering, Iran University of Science and Technology, Tehran, Iran; dorit.merhofur.de

Foundation models, which are large-scale, pre-trained deep-learning models adapted to a wide range of downstream tasks, have recently gained significant interest, and various deep-learning problems are undergoing a paradigm shift with their rise. Trained on large-scale datasets to bridge the gap between different modalities, foundation models facilitate contextual reasoning, generalization, and prompt capabilities at test time. The predictions of these models can be adjusted for new tasks by augmenting the model input with task-specific hints called prompts, without requiring extensive labeled data and retraining. Capitalizing on the advances in computer vision, medical imaging has also shown a growing interest in these models. With the aim of assisting researchers in navigating this direction, this survey intends to provide a comprehensive overview of foundation models in the domain of medical imaging. Specifically, we initiate our exploration by providing an exposition of the fundamental concepts forming the basis of foundation models. Subsequently, we offer a methodical taxonomy of foundation models within the medical domain, proposing a classification system primarily structured around training strategies, while also incorporating additional facets such as application domains, imaging modalities, specific organs of interest, and the algorithms integral to these models. Furthermore, we emphasize the practical use cases of some selected approaches and then discuss the opportunities, applications, and future directions of these large-scale pre-trained models for analyzing medical images. In the same vein, we address the prevailing challenges and research pathways associated with foundational models in medical imaging. These encompass the areas of interpretability, data management, computational requirements, and the nuanced issue of contextual comprehension. Finally, we gather the reviewed studies with their available open-source implementations at our GitHub. We aim...

Keywords: Medical applications

Distributed AI in Zero-touch Provisioning for Edge Networks: Challenges and Research Directions

arXiv, 2023

Authors: Hazra, Abhishek; Morichetta, Andrea; Murturi, Ilir; Lovén, Lauri; Dehury, Chinmaya Kumar; Pujol, Víctor Casamayor; Donta, Praveen Kumar; Dustdar, Schahram

Affiliations: Department of Computer Science and Engineering, Indian Institute of Information Technology, Sri City, India; Communications & Networks Lab, Department of Electrical and Computer Engineering, National University of Singapore, Singapore 119260, Singapore; Distributed Systems Group, TU Wien, Vienna 1040, Austria; Center for Ubiquitous Computing, University of Oulu, 90014, Finland; Institute of Computer Science, University of Tartu, Tartu 51009, Estonia

Zero-touch networks are anticipated to usher in a generation of intelligent and highly flexible resource provisioning strategies in which multiple service providers collaboratively offer computation and storage resources. This transformation presents substantial challenges to network administration and service providers regarding sustainability and scalability. This article combines Distributed Artificial Intelligence (DAI) with Zero-touch Provisioning (ZTP) for edge networks. This combination helps to manage network devices seamlessly and intelligently by minimizing human intervention. In addition, several advantages of incorporating Distributed AI into ZTP in the context of edge networks are highlighted. Further, we outline potential research directions to foster novel studies in this field and overcome the current limitations. Copyright © 2023, The Authors. All rights reserved.

Keywords: Internet of things