ISBN: (Print) 9798400704437
Real-time video analytics typically requires video frames to be processed by a query to identify objects or activities of interest while adhering to an end-to-end frame processing latency constraint. This imposes a continuous and heavy load on backend compute and network infrastructure. Video data has inherent redundancy and does not always contain an object of interest for a given query. We leverage this property of video streams to propose a lightweight Load Shedder that can be deployed on edge servers or on inexpensive edge devices co-located with cameras. The proposed Load Shedder uses pixel-level color-based features to compute a utility score for each ingress video frame and a minimum utility threshold to select interesting frames to send for query processing. Dropping unnecessary frames enables the video analytics query in the backend to meet the end-to-end latency constraint with fewer compute and network resources. To guarantee a bounded end-to-end latency at runtime, we introduce a control loop that monitors the backend load and dynamically adjusts the utility threshold. Performance evaluations show that the proposed Load Shedder selects a large portion of the frames containing each object of interest while meeting the end-to-end frame processing latency constraint. Furthermore, it does not impose a significant latency overhead when running on edge devices with modest compute resources.
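The utility-score-plus-threshold idea can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the histogram feature, the proportional control-loop gain, and all parameter values are assumptions.

```python
import numpy as np

def utility_score(frame, reference_hist, bins=16):
    """Score a frame by how much its color distribution deviates from a
    reference (background) histogram; larger deviation suggests an object
    of interest may be present."""
    hist, _ = np.histogramdd(
        frame.reshape(-1, 3), bins=(bins, bins, bins),
        range=((0, 256),) * 3, density=True)
    # L1 distance between normalized color histograms
    return float(np.abs(hist - reference_hist).sum())

class LoadShedder:
    """Drop frames whose utility falls below a threshold; a simple
    proportional control loop nudges the threshold toward a target
    backend load."""
    def __init__(self, threshold=0.1, target_load=0.8, gain=0.05):
        self.threshold = threshold
        self.target_load = target_load
        self.gain = gain

    def admit(self, score):
        return score >= self.threshold

    def update(self, observed_load):
        # Overloaded backend -> raise the threshold (shed more frames);
        # underloaded backend -> lower it (admit more frames).
        self.threshold += self.gain * (observed_load - self.target_load)
        self.threshold = max(0.0, self.threshold)
```

A frame identical to the reference background scores zero and is shed, while frames whose color statistics shift are admitted.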
ISBN: (Print) 9798331529543; 9798331529550
In-loop filtering (ILF) is a key technology in image/video coding for reducing artifacts. Recently, neural network-based in-loop filtering methods have achieved remarkable coding gains beyond the capability of advanced video coding standards, establishing themselves as a promising candidate tool for future standards. However, the use of deep neural networks (DNNs) brings high computational complexity and demands dedicated hardware, which makes such methods challenging to deploy in general use. To address this limitation, we study an efficient in-loop filtering scheme based on look-up tables (LUTs). After training a DNN with a predefined reference range for in-loop filtering, we cache the output values of the DNN into a LUT by traversing all possible inputs. In the coding process, the filtered pixel is generated by locating the input pixels (the to-be-filtered pixel and its reference pixels) and interpolating between the cached values. To enable a larger reference range within the limited LUT storage, we introduce an enhanced indexing mechanism in the filtering process and a clipping/finetuning mechanism in training. The proposed method is implemented in the Versatile Video Coding (VVC) reference software, VTM-11.0. Experimental results show that the proposed method, with three different configurations, achieves on average 0.13% to 0.51% and 0.10% to 0.39% BD-rate reduction under the all-intra (AI) and random-access (RA) configurations, respectively. The proposed method incurs only a 1% to 8% time increase, an additional computation of 0.13 to 0.93 kMAC/pixel, and a storage cost of 164 to 1148 KB for a single model. Our method thus offers a new and more practical approach to neural network-based ILF.
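The cache-then-interpolate pipeline can be illustrated with a toy scalar filter standing in for the trained DNN. Everything here is a simplified assumption for exposition: the 4-bit indexing, the blend filter, and the two-input case (one to-be-filtered pixel plus one reference pixel) are not the paper's actual configuration.

```python
import numpy as np

QBITS = 4                    # index with the 4 most-significant bits
STEP = 1 << (8 - QBITS)      # 16 levels per dimension -> 17 grid points

def toy_filter(center, neighbor):
    """Stand-in for the trained DNN: blend a pixel with its reference.
    (The real method caches an actual network's outputs.)"""
    return 0.75 * center + 0.25 * neighbor

# Build the LUT once by traversing the quantized input grid.
grid = np.arange(0, 257, STEP, dtype=np.float64)   # 0, 16, ..., 256
LUT = toy_filter(grid[:, None], grid[None, :])     # shape (17, 17)

def lut_filter(center, neighbor):
    """Filter one pixel via LUT lookup + bilinear interpolation
    between the cached values at the four surrounding grid points."""
    i, fi = divmod(center, STEP)
    j, fj = divmod(neighbor, STEP)
    wi, wj = fi / STEP, fj / STEP
    return ((1 - wi) * (1 - wj) * LUT[i, j]
            + wi * (1 - wj) * LUT[i + 1, j]
            + (1 - wi) * wj * LUT[i, j + 1]
            + wi * wj * LUT[i + 1, j + 1])
```

Because the toy filter is linear, interpolation is exact here; for a real DNN the LUT output approximates the network between grid points, which is where the enhanced indexing and clipping/finetuning mechanisms matter.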
Urban public safety management relies heavily on video surveillance systems, which provide crucial visual data for resolving a wide range of incidents and controlling unlawful activities. Traditional methods for target detection predominantly employ a two-stage approach, focusing on precision in identifying objects such as pedestrians and vehicles. These objects, typically sparse in large-scale, lower-quality surveillance footage, induce considerable redundant computation during the initial processing stage. This redundancy constrains real-time detection capabilities and escalates processing costs. Furthermore, transmitting raw images and videos laden with superfluous information to centralized back-end systems significantly burdens network communications and fails to capitalize on the computational resources available at diverse surveillance nodes. This study introduces DiffRank, a novel preprocessing method for fixed-angle video imagery in urban surveillance. The method strategically generates candidate regions during preprocessing, thereby reducing redundant object detection and improving the efficiency of the detection algorithm. Drawing on change detection principles, a background feature learning approach based on shallow features has been developed. This approach prioritizes learning the characteristics of fixed-area backgrounds over direct background identification. As a result, alterations in Regions of Interest (ROIs) are efficiently discerned using computationally cheap shallow features, markedly accelerating ROI proposal generation and diminishing the computational demands of subsequent object detection and classification. Comparative analysis on various public and private datasets shows that DiffRank, while maintaining high accuracy, substantially outperforms existing baselines in speed, particularly at larger image sizes (e.g., an improvement exceeding 300% at 1920×1080 resolution). Moreover, the method demonstrates en...
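A minimal version of shallow-feature change detection over a fixed camera might look like the following. The per-block mean-intensity feature, update rate, and threshold are illustrative stand-ins for DiffRank's learned background features, not its actual design.

```python
import numpy as np

class BlockChangeDetector:
    """Shallow-feature change detection for a fixed camera: keep a
    running mean of per-block intensities and flag blocks that deviate,
    yielding candidate ROIs for a downstream object detector."""
    def __init__(self, shape, block=32, alpha=0.05, thresh=15.0):
        self.block = block
        self.alpha = alpha          # background update rate
        self.thresh = thresh        # mean-intensity change threshold
        h, w = shape
        self.grid = (h // block, w // block)
        self.bg = None

    def _block_means(self, gray):
        b = self.block
        gh, gw = self.grid
        return gray[:gh * b, :gw * b].reshape(gh, b, gw, b).mean(axis=(1, 3))

    def propose(self, gray):
        """Return (y, x, h, w) ROI candidates for changed blocks."""
        means = self._block_means(gray.astype(np.float64))
        if self.bg is None:         # first frame seeds the background
            self.bg = means
            return []
        changed = np.abs(means - self.bg) > self.thresh
        # slowly adapt the background model toward the current frame
        self.bg = (1 - self.alpha) * self.bg + self.alpha * means
        b = self.block
        return [(r * b, c * b, b, b) for r, c in zip(*np.nonzero(changed))]
```

The detector's cost is one mean per block per frame, which is why such shallow features scale well to large frame sizes.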
ISBN: (Print) 9798350349405; 9798350349399
Real-time near-infrared (NIR) face alignment holds significant importance across various domains, such as security, healthcare, and augmented reality. However, existing face alignment techniques tailored for visible-light (VIS) imagery suffer a decline in accuracy when applied in NIR settings. This decline stems from the domain discrepancy between the VIS and NIR facial domains and the absence of meticulously annotated NIR facial data. To address this issue, we introduce a system and strategy for gathering paired VIS-NIR facial images and annotating precise landmarks. Our system streamlines dataset preparation by automatically transferring annotations from VIS images to their corresponding NIR counterparts. Following this approach, we constructed a first-of-its-kind dataset comprising high-frame-rate paired VIS-NIR facial images with landmark annotations. Additionally, to enhance the diversity of facial data, we augment our dataset through VIS-NIR image-to-image (img2img) translation using publicly available facial landmark datasets. By retraining face alignment models and evaluating them, we demonstrate a noteworthy improvement in face alignment accuracy under NIR conditions using our dataset. Furthermore, the augmented dataset yields further accuracy gains, particularly notable for different individuals' facial features.
Machine vision enables machines to extract rich information from image or video data and make intelligent decisions. However, existing artificial-synapse hardware systems significantly limit the real-time performance and accuracy of machine vision segmentation in complex environments. To address this, we propose a novel three-terminal adaptive artificial-light-emitting synapse (AALS) capable of dual photoelectric output along with adaptive behavior. The device uses silver nanowires (AgNWs) as polar conductive bridges to reduce reliance on transparent electrodes, while polyvinyl alcohol (PVA) dielectric layers adaptively modulate charge carrier concentrations in the conductive channels. Additionally, we have designed an adaptive parallel neural network (APNN) and applied it to autonomous-driving image processing. This innovation significantly reduces adaptation time and notably enhances mean pixel accuracy (MPA) for semantic segmentation under overexposure and low-light conditions, by 142.2% and 304.4%, respectively. This work therefore introduces new strategies for advanced adaptive vision, with significant potential in intelligent driving and neuromorphic computing.
ISBN: (Print) 9798350303582; 9798350303599
In an increasingly visual world, people with blindness and low vision (pBLV) face substantial challenges in navigating their surroundings and interpreting visual information. From our previous work, VIS4ION is a smart wearable that helps pBLV with their daily challenges. It enables multiple microservices based on artificial intelligence (AI), such as visual scene processing, navigation, and vision-language inference. These microservices require powerful computational resources and, in some cases, stringent inference times, hence the need to offload computation to edge servers. This paper introduces a novel video streaming platform that improves the capabilities of VIS4ION by providing real-time support for the microservices at the network edge. When video is offloaded wirelessly to the edge, the time-varying nature of the wireless network requires adaptation strategies for a seamless video service. We demonstrate the performance of our adaptive real-time video streaming platform through experimentation with an open-source 5G deployment based on OpenAirInterface (OAI). The experiments demonstrate the ability to provide microservices robustly under time-varying network conditions.
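One concrete (and deliberately simplified) example of such an adaptation strategy is for the sender to pick the highest rung of a bitrate ladder that fits the measured throughput. The ladder values and safety margin below are assumptions for illustration, not the platform's actual policy.

```python
def select_bitrate(ladder_kbps, throughput_kbps, safety=0.8):
    """Pick the highest bitrate rung that fits within a safety margin
    of the measured throughput; fall back to the lowest rung when even
    that does not fit (degrade rather than stall)."""
    budget = safety * throughput_kbps
    feasible = [r for r in sorted(ladder_kbps) if r <= budget]
    return feasible[-1] if feasible else min(ladder_kbps)
```

Re-running this selection on every throughput estimate gives the basic adapt-to-the-wireless-channel behavior the platform needs; real systems additionally smooth the throughput estimate to avoid oscillation.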
ISBN: (Print) 9798350344998; 9798350345001
Remote driving aims to improve transport systems by promoting efficiency, sustainability, and accessibility. In the railway sector, remote driving makes it possible to increase flexibility, as the driver no longer has to be in the cab. However, this brings several challenges, as it has to provide at least the same level of safety obtained when the driver is in the cab. To achieve this, wireless networks and video streaming technologies gain importance, as they must provide real-time track visualization and obstacle detection capabilities to the remote driver. Low-latency camera capture, onboard media processing devices, and streaming protocols adapted to wireless links are the necessary enablers to be developed and integrated into the railway infrastructure. This paper compares video streaming protocols such as the Real-Time Streaming Protocol (RTSP) and Web Real-Time Communication (WebRTC), as they are the main low-latency alternatives based on the Real-time Transport Protocol (RTP). As latency is the main performance metric, the paper also provides a solution to calculate the end-to-end video streaming latency analytically. Finally, the paper proposes a rate control algorithm that adapts the video stream to the network capacity. The objective is to keep the latency as low as possible while avoiding visual artifacts. The proposed solutions are tested in different setups and scenarios to prove their effectiveness before the planned field testing.
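An analytical end-to-end latency model of this kind can be sketched as a simple additive pipeline. The stage breakdown and the rate-control bound below are illustrative assumptions, not the authors' exact formulation.

```python
def e2e_latency_ms(frame_bits, capture_ms, encode_ms, decode_ms,
                   render_ms, bandwidth_bps, propagation_ms):
    """End-to-end latency for one frame: fixed processing stages plus
    serialization delay (frame_bits / bandwidth) and propagation."""
    serialization_ms = 1000.0 * frame_bits / bandwidth_bps
    return (capture_ms + encode_ms + serialization_ms
            + propagation_ms + decode_ms + render_ms)

def max_bitrate_for_budget(budget_ms, fixed_ms, propagation_ms,
                           fps, bandwidth_bps):
    """Rate-control bound: the highest encoder bitrate (bps) whose
    per-frame serialization delay still fits the latency budget."""
    slack_s = (budget_ms - fixed_ms - propagation_ms) / 1000.0
    if slack_s <= 0:
        return 0.0
    # bits per frame = bitrate / fps; serialization = bits / bandwidth,
    # so bitrate <= slack * fps * bandwidth
    return slack_s * fps * bandwidth_bps
```

For example, a 1 Mbit frame over a 10 Mbit/s link adds 100 ms of serialization delay on top of the processing and propagation stages, which is exactly the term a rate controller must keep in check.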
Fish are a critical component of marine ecosystems; therefore, the accurate identification and counting of fish are essential for the objective monitoring and assessment of marine biological resources. High-frequency adaptive resolution imaging sonar (ARIS) is widely used for underwater object detection and imaging, and it quickly obtains close-up video of free-swimming fish in high-turbidity water environments. Nonetheless, processing the massive data output of imaging sonars remains a major challenge. Here, the authors developed an automatic image-processing programme that fuses K-nearest neighbour background subtraction with DeepSort target tracking to automatically track and count fish. The automatic programme was evaluated on four test data sets with different target sizes, observation ranges, and sonar deployments. According to the results, the approach successfully counted free-swimming fish targets with an accuracy index of 73% and a completeness index of 70%. Under appropriate conditions, this approach could replace time-consuming semi-automatic approaches and improve the efficiency of imaging sonar data processing, while providing technical support for future real-time data processing.
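The first stage of the pipeline, per-pixel KNN background subtraction, can be sketched as below. This is a NumPy reimplementation of the general idea for exposition; all parameter values here are assumptions, and a production pipeline would use an off-the-shelf implementation.

```python
import numpy as np

class KNNBackgroundSubtractor:
    """Per-pixel KNN background model: a pixel is foreground when
    fewer than k of its stored background samples lie within `radius`
    of its current intensity."""
    def __init__(self, n_samples=10, k=2, radius=20.0,
                 update_prob=0.1, seed=0):
        self.n_samples, self.k, self.radius = n_samples, k, radius
        self.update_prob = update_prob
        self.samples = None
        self.rng = np.random.default_rng(seed)

    def apply(self, gray):
        """Return a boolean foreground mask for one grayscale frame."""
        gray = gray.astype(np.float64)
        if self.samples is None:    # seed the model from the first frame
            self.samples = np.repeat(gray[None], self.n_samples, axis=0)
            return np.zeros(gray.shape, dtype=bool)
        close = np.abs(self.samples - gray) < self.radius  # (N, H, W)
        fg = close.sum(axis=0) < self.k
        # stochastically refresh one stored sample at background pixels
        refresh = (~fg) & (self.rng.random(gray.shape) < self.update_prob)
        idx = self.rng.integers(0, self.n_samples)
        self.samples[idx][refresh] = gray[refresh]
        return fg
```

The resulting foreground blobs are what a tracker such as DeepSort associates across frames so that each fish is counted once.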
In recent years, the scarcity of effective communication systems has been an essential issue for disabled people [physically disabled, with locomotor disability, or with amyotrophic lateral sclerosis (ALS)] who cannot speak, walk, or move their hands. Such disabled people depend on others for survival, so they need assistive technology to live independently. This research paper aims to develop an efficient real-time eye-gaze communication system for disabled persons using a low-cost webcam. The proposed work develops a Video-Oculography (VOG) based system that operates under natural head movements, using a 5-point user-specific calibration (algorithmic calibration) approach for eye tracking and cursor movement. During calibration, several parameters are calculated and then used to control the computer with the eyes. Additionally, we designed a graphical user interface (GUI) to examine the performance and fulfill the basic daily needs of disabled individuals. The proposed method enables disabled persons to operate a computer by moving and blinking their eyes, much like a typical computer user. The overall cost of the developed system is low (under $50, varying with the camera used) compared to various existing systems. The proposed system was tested with disabled and non-disabled individuals and achieved an average blinking accuracy of 97.66%. The designed system attained an average typing speed of 15 and 20 characters per minute for disabled and non-disabled participants, respectively. On average, the system achieved a visual angle accuracy of 2.2 degrees for disabled participants and 0.8 degrees for non-disabled participants. The experimental outcomes demonstrate that the developed system is robust and accurate.
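A few-point calibration of this kind can be illustrated as fitting an affine pupil-to-screen map by least squares. This is a common formulation assumed for illustration; the paper's exact calibration parameters may differ.

```python
import numpy as np

def fit_gaze_mapping(pupil_xy, screen_xy):
    """Fit an affine map screen = [px, py, 1] @ A from calibration
    pairs; 5 points over-determine the 6 affine parameters, so the
    least-squares fit also averages out measurement noise."""
    P = np.hstack([pupil_xy, np.ones((len(pupil_xy), 1))])  # (N, 3)
    A, *_ = np.linalg.lstsq(P, screen_xy, rcond=None)       # (3, 2)
    return A

def gaze_to_cursor(A, pupil_xy):
    """Map one pupil position to a screen cursor position."""
    p = np.array([pupil_xy[0], pupil_xy[1], 1.0])
    return p @ A
```

After calibration, each tracked pupil position is pushed through `gaze_to_cursor` to move the cursor, and blinks are detected separately to trigger clicks.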
The appearance of the log cross-section provides important information when assessing the quality of a log; properties to consider include pith location and the density of annual rings. This makes tasks like pith location estimation and annual ring detection of great interest. However, creating labeled training data for these tasks can be time-consuming and subject to misjudgments. For this reason, we aim to create generated training data with controlled pith location and number of annual rings. We propose a two-step generator based on generative adversarial networks in which we can completely avoid manual labeling, not only when generating training data but also during training of the generator itself. This opens up the possibility of training the generator on other types of log end data without the need to manually label new training data. The same method is used to create two generated training datasets: one of entire log ends and one of patches of log ends. To evaluate how the generated data compares to real data, we train two deep learning models to perform pith location estimation and ring counting, respectively. The models are trained separately on real and generated data and evaluated on real data only. The results show that the performance of both pith location estimation and ring counting can be improved by replacing real training data with larger sets of generated training data.