Ship draft reading is an essential link in the draft survey. At present, manual observation is primarily used to determine a ship's draft. However, manual observation is easily affected by complex situations such as large waves on the water, water obstacles, water traces, tilted draft characters, and rusted draft characters. Traditional image-based methods of ship draft reading struggle to adapt to these complex situations, and existing deep learning-based methods suffer from poor robustness across them. In this paper, we propose a method that combines image processing and deep learning and is capable of adapting to a variety of complex situations, particularly in the presence of large waves and water obstacles. We also propose a small U-2-NetP neural network for semantic segmentation that incorporates coordinate attention, enhancing the capture of spatial location information; its segmentation accuracy reaches 96.47%, improving on the original network. In addition, considering the combination of lightweight design and multitasking, we use the lightweight YOLOv5n architecture to detect the ship draft characters, which achieves 98% mAP@0.5 and makes the draft-reading pipeline more lightweight. Experimental results on a real dataset covering many difficult situations show state-of-the-art performance compared with other existing deep learning methods. The average error of the draft reading is less than +/- 0.005 m, and millimeter-level precision is achievable. The method can serve as a valuable aid for manual reading. In addition, our work lays the groundwork for future research on deployment to edge devices.
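As a rough illustration of the coordinate attention idea mentioned in this abstract, the sketch below shows a generic coordinate attention block in PyTorch. The reduction ratio, layer widths, and normalization choices are assumptions for illustration, not necessarily the exact design added to U-2-NetP.

```python
# A minimal sketch of a coordinate attention block (assumed configuration).
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    def __init__(self, channels, reduction=32):
        super().__init__()
        mid = max(8, channels // reduction)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x):
        n, c, h, w = x.shape
        # Pool along each spatial axis separately so positional information is preserved.
        x_h = x.mean(dim=3, keepdim=True)                        # (n, c, h, 1)
        x_w = x.mean(dim=2, keepdim=True).permute(0, 1, 3, 2)    # (n, c, w, 1)
        y = self.act(self.bn1(self.conv1(torch.cat([x_h, x_w], dim=2))))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                            # (n, c, h, 1)
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))        # (n, c, 1, w)
        return x * a_h * a_w                                             # gate features by position
```

Attention weights are produced per row and per column, which is what helps a segmentation network attend to the horizontal waterline and the vertical column of draft marks.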
Compressive learning (CL) has proven highly successful at joint signal sampling and inference for intricate vision tasks on resource-limited Internet of Things (IoT) devices. Recent studies have turned to deep neural network (DNN)-based approaches, known as DeepCL, to enhance performance on unimodal vision tasks; this approach incorporates learnable compressed sensing in an end-to-end manner. Current DeepCL techniques typically use an initial signal reconstruction as the input to subsequent DNNs for inference. However, this practice presents potential risks, such as privacy breaches and reduced performance due to information processing inequality. To address these issues, this article introduces the first cross-modal CL (CMCL) approach, which enables image captioning directly on compressed measurements. Compared with previous DeepCL strategies, the proposed CMCL offers significant improvements in computational efficiency and privacy protection. Extensive experiments demonstrate that CMCL performs nearly on par with leading image captioning methods, with a metric score merely 2.75% lower than the uncompressed method when the data is compressed eightfold.
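The sketch below illustrates the general idea of inferring directly from compressed measurements rather than from a reconstruction: a learnable block-wise sensing operator produces an eightfold-compressed measurement tensor that a downstream captioning encoder could consume. The block size and the strided-convolution formulation are assumptions for illustration, not the authors' architecture.

```python
# A minimal sketch of learnable block-wise compressed sensing (assumed design).
import torch
import torch.nn as nn

class CompressedSensingLayer(nn.Module):
    """Each 8x8 image block is mapped to 8 measurements (eightfold compression)."""
    def __init__(self, block=8, ratio=8):
        super().__init__()
        n = block * block
        m = n // ratio
        # A convolution with kernel = stride = block acts as a learnable sensing matrix per block.
        self.sense = nn.Conv2d(1, m, kernel_size=block, stride=block, bias=False)

    def forward(self, x):                 # x: (N, 1, H, W)
        return self.sense(x)              # (N, m, H/block, W/block) -- compressed measurements

measurements = CompressedSensingLayer()(torch.randn(2, 1, 224, 224))
print(measurements.shape)                 # torch.Size([2, 8, 28, 28]); no image reconstruction is needed
```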
To tackle the formidable challenges that adverse weather conditions pose for image object detection, this paper presents an innovative approach grounded in the Image-Adaptive YOLO (IA-YOLO) framework. The framework ha...
Hot-rolled strip steel is an extremely important industrial foundational material. The rapid and precise identification of surface defects in hot-rolled strip steel helps enhance the quality of steel materials and reduce economic losses. Current research primarily focuses on using convolutional neural networks (CNNs) for strip steel surface defect identification. Although identification accuracy has remarkably improved in comparison with traditional machine learning methods, this research has overlooked issues related to dataset preprocessing and the problem of non-lightweight CNN models with large parameter counts and high computational complexity. To address these issues, this study proposes a hot-rolled steel strip surface defect identification method based on random data balancing and the lightweight CNN MobileNet-Pro. Random data balancing employs image augmentation to eliminate the imbalance in the number of samples per defect category in the hot-rolled strip steel surface defect data, providing diverse images to alleviate overfitting during model training. MobileNet-Pro is used to increase the model's effective receptive field: building upon MobileNetV1, it introduces large convolutional kernels and improves depth-wise separable convolution. Experiments show that the new MobileNet-Pro, after random data balancing on the X-SDD dataset, achieves an accuracy of 96.47%, surpassing RepVGG + SA (95.10% accuracy, non-lightweight) and ResNet50 (93.86% accuracy, non-lightweight). Additionally, MobileNet-Pro outperforms mainstream lightweight networks from the MobileNet series, ShuffleNetV2, and GhostNetV2 on the CIFAR-100 and PASCAL VOC 2007 datasets, demonstrating excellent generalization capabilities. All our code and models are available on GitHub: https://***/OnlyForWW/MobileNet-Pro.
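To make the "large kernels plus improved depth-wise separable convolution" idea concrete, the sketch below shows a generic large-kernel depth-wise separable block. The 7x7 kernel size, BatchNorm/ReLU placement, and channel handling are assumptions for illustration only and are not claimed to match MobileNet-Pro exactly.

```python
# A minimal sketch of a large-kernel depth-wise separable block (assumed layout).
import torch.nn as nn

def dw_separable_large_kernel(in_ch, out_ch, kernel=7, stride=1):
    return nn.Sequential(
        # Depth-wise convolution with an enlarged kernel to grow the effective receptive field.
        nn.Conv2d(in_ch, in_ch, kernel, stride, padding=kernel // 2, groups=in_ch, bias=False),
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        # Point-wise (1x1) convolution to mix information across channels.
        nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )
```

Splitting the spatial and channel mixing keeps the parameter count and FLOPs far below a standard convolution with the same kernel size, which is why such blocks suit lightweight defect-identification models.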
With the development of cloud computing, people usually outsource encrypted images to save storage and protect privacy. However, traditional image encryption methods not only hinder the availability of images, such as similarity retrieval, but also degrade compression performance. To address this issue, we propose a retrievable image compression and encryption method (RICE). RICE accounts for the tension among image compression, availability, and security by proposing a cascaded information bottleneck model, which includes a compression information bottleneck and a security-and-availability information bottleneck. The former is converted into a rate-distortion problem whose optimal solution is sought by a convolutional neural network (CNN)-based compression network that includes a channel-spatial attention module and a discrete wavelet transform (DWT) module. To solve the latter, we propose a feature partition method to find a retrieval subset that balances the contradiction between security and availability, and design a DNA-based deterministic encryption method for this subset to support ciphertext retrieval. The ciphertext of the retrieval subset is sent to the proposed similarity search fully connected network (SimFcNet) to improve retrieval accuracy. The remaining subset is encrypted by non-deterministic encryption to further improve security. Overall, RICE supports similarity retrieval over compressed-domain ciphertext and achieves excellent performance. Experimental results show that our method exceeds JPEG2000 by 36.56% in MS-SSIM at a compression ratio of 60:1, the accuracy of ciphertext retrieval reaches 0.828, and the security of the ciphertext is close to that of traditional encryption methods.
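Since the compression information bottleneck is stated to be converted into a rate-distortion problem, a generic training objective of the form rate + lambda * distortion is sketched below. The entropy-model likelihoods, the use of MSE as distortion, and the lambda value are placeholders, not RICE's actual network or loss.

```python
# A minimal sketch of a rate-distortion training loss (assumed form).
import torch

def rate_distortion_loss(x, x_hat, likelihoods, lam=0.01):
    """x, x_hat: (N, C, H, W) original and reconstruction; likelihoods: entropy-model outputs."""
    num_pixels = x.shape[0] * x.shape[2] * x.shape[3]
    # Rate: estimated bits per pixel from the entropy model's likelihoods.
    rate = -torch.log2(likelihoods).sum() / num_pixels
    # Distortion: mean squared reconstruction error.
    distortion = torch.mean((x - x_hat) ** 2)
    return rate + lam * distortion
```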
ISBN (print): 9781728198354
Although context-based monocular depth estimation has shown remarkable improvement, adaptation to unseen contexts is still a major challenge. On the other hand, the use of physical depth cues, such as defocus associated with lens aberration, allows context-independent depth estimation. However, explicitly supervising physical depth cues would have a significant impact on cost and versatility because of the need for expensive equipment to obtain ground truth. Therefore, we propose a novel self-supervised learning method for single-shot neural depth from defocus (DfD) that utilizes structure-from-motion (SfM) images taken with the target lens. Since the scale of SfM depth is ambiguous, we use a rank loss to train the network. To demonstrate the versatility of our method, we conducted validation experiments using not only DSLR cameras but also smartphones with small image sensors. We confirmed that our method outperforms state-of-the-art methods, including the physically calibrated neural single-shot DfD and context-based methods, by a large margin.
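Because SfM depth is only defined up to scale, a rank loss supervises the ordering of pixel pairs rather than absolute depth values. The sketch below shows one common pairwise formulation; the random pair sampling and hinge margin are assumptions for illustration, not the paper's exact loss.

```python
# A minimal sketch of a pairwise rank loss on depth orderings (assumed formulation).
import torch

def pairwise_rank_loss(pred_depth, sfm_depth, num_pairs=1024, margin=0.0):
    flat_pred = pred_depth.flatten()
    flat_sfm = sfm_depth.flatten()
    i = torch.randint(0, flat_pred.numel(), (num_pairs,))
    j = torch.randint(0, flat_pred.numel(), (num_pairs,))
    # Target ordering from SfM: +1 if pixel i is farther than pixel j, -1 otherwise (0 if equal).
    target = torch.sign(flat_sfm[i] - flat_sfm[j])
    # Hinge-style penalty whenever the predicted ordering disagrees with the SfM ordering.
    return torch.clamp(margin - target * (flat_pred[i] - flat_pred[j]), min=0).mean()
```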
Accurate segmentation of tissues and lesions is crucial for disease diagnosis, treatment planning, and surgical navigation. Yet the complexity of medical images presents significant challenges for traditional convolutional neural networks and Transformer models due to their limited receptive fields or high computational complexity. State space models (SSMs) have recently shown notable vision performance, particularly Mamba and its variants. However, their feature extraction methods may not be sufficiently effective, and they retain some redundant structures, leaving room for parameter reduction. In response to these challenges, we introduce a methodology called Rotational Mamba-UNet, characterized by a Residual Visual State Space (ResVSS) block and a Rotational SSM module. The ResVSS block is devised to mitigate network degradation caused by the diminishing efficacy of information transfer from shallower to deeper layers. Meanwhile, the Rotational SSM module is devised to tackle the challenges associated with channel feature extraction within state space models. Finally, we propose a weighted multi-level loss function, which fully leverages the outputs of the decoder's three stages for supervision. We conducted experiments on the ISIC17, ISIC18, CVC-300, Kvasir-SEG, CVC-ColonDB, and Kvasir-Instrument datasets, as well as the Low-grade Squamous Intraepithelial Lesion datasets provided by The Third Affiliated Hospital of Sun Yat-sen University, demonstrating the superior segmentation performance of our proposed RM-UNet. Additionally, compared to the previous VM-UNet, our model achieves a one-third reduction in parameters. Our code is available at https://***/Halo2Tang/RM-UNet.
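The weighted multi-level loss described above can be sketched as a weighted sum of per-stage segmentation losses over the three decoder outputs. The stage weights, the use of binary cross-entropy, and the bilinear upsampling are assumptions for illustration, not the authors' exact choices.

```python
# A minimal sketch of a weighted multi-level (deep supervision) loss over three decoder stages.
import torch.nn.functional as F

def weighted_multilevel_loss(stage_outputs, target, weights=(0.2, 0.3, 0.5)):
    """stage_outputs: list of three logit maps, shallow to deep; target: (N, 1, H, W) mask."""
    loss = 0.0
    for out, w in zip(stage_outputs, weights):
        # Upsample each stage's prediction to the target resolution before supervision.
        out = F.interpolate(out, size=target.shape[-2:], mode='bilinear', align_corners=False)
        loss = loss + w * F.binary_cross_entropy_with_logits(out, target)
    return loss
```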
ISBN (print): 9798350329537; 9798350329520
As routine pathology moves into the digital age, the spread of high-efficiency and high-resolution tissue scanners opens up the possibility of routinely analyzing three-dimensional samples containing fluorescent genetic signals. One cornerstone of this workflow is confocal microscopy, with the help of which cell nuclei and their signals can be imaged in three dimensions. This article presents a novel deep learning-based algorithm for detecting signals within three-dimensional confocal microscopy images. By leveraging the power of convolutional neural networks, our approach significantly improves the accuracy and efficiency of signal detection compared to traditional image processing methods, especially in the case of thick sections. We demonstrate the algorithm's effectiveness through validation on various samples, highlighting its potential to advance research in biology and medicine.
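As an illustration of detecting signals in a three-dimensional stack with a convolutional network, the sketch below shows a toy voxel-wise 3D CNN that outputs a per-voxel "signal present" probability. The two-layer depth, channel widths, and input size are illustrative assumptions, not the authors' network.

```python
# A minimal sketch of a voxel-wise 3D CNN detector for fluorescent signals (assumed architecture).
import torch
import torch.nn as nn

signal_detector = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
    nn.Conv3d(16, 1, kernel_size=3, padding=1),   # per-voxel logit for "signal present"
)

volume = torch.randn(1, 1, 32, 128, 128)          # (batch, channel, depth, height, width) confocal stack
probability_map = torch.sigmoid(signal_detector(volume))   # same spatial shape as the input volume
```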
Steady-state visual evoked potential (SSVEP) is widely used in brain-computer interfaces (BCIs), medical detection, and neuroscience, so there is significant interest in enhancing SSVEP features via signal processing for better performance. In this study, an image processing method was combined with brain signal analysis, and a sharpening filter was used to extract details and features for the enhancement of SSVEP features. The results demonstrated that the sharpening filter could eliminate the SSVEP signal trend term and suppress its low-frequency components. Meanwhile, the sharpening filter effectively enhanced the signal-to-noise ratios (SNRs) of the single-channel and multi-channel fused signals. The image sharpening filter also significantly improved the recognition accuracy of canonical correlation analysis (CCA), filter bank canonical correlation analysis (FBCCA), and task-related component analysis (TRCA). The tools developed here effectively enhanced SSVEP signal features, suggesting that image processing methods can be considered for improved brain signal analysis.
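For intuition, the sketch below applies a one-dimensional analogue of an image sharpening filter to an EEG channel. The kernel [-1, 3, -1] (identity minus a discrete Laplacian) is an assumed example, not necessarily the filter used in the study.

```python
# A minimal sketch of a 1-D sharpening filter applied to an EEG channel (assumed kernel).
import numpy as np

def sharpen_signal(eeg_channel):
    kernel = np.array([-1.0, 3.0, -1.0])           # amplifies fast oscillations relative to slow components
    return np.convolve(eeg_channel, kernel, mode='same')

t = np.arange(0, 2, 1 / 250)                        # 2 s of samples at 250 Hz
raw = np.sin(2 * np.pi * 12 * t) + 0.5 * t          # a 12 Hz SSVEP-like component plus a slow drift
enhanced = sharpen_signal(raw)                      # the 12 Hz component is boosted relative to the drift
```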
ISBN (print): 9798350351439; 9798350351422
Deep learning methods can now generate high-quality synthetic speech that is perceptually indistinguishable from real speech. Because synthetic speech can be used for nefarious purposes, speech forensics methods have been developed to detect fully synthetic speech. Speech editing tools can also create partially synthetic speech in which only a part of the speech signal is synthetic. Detecting these short synthetic segments within a speech signal requires specialized methods to determine the temporal location of the synthetic speech. In this paper, we propose the Synthetic Speech Localization Convolutional Transformer (SSLCT), a convolutional transformer network for synthetic speech localization. SSLCT can temporally localize synthetic speech segments as short as 20 milliseconds. We demonstrate that SSLCT achieves less than 10% Equal Error Rate (EER), an improvement over several existing methods.
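Localization at a 20-millisecond resolution implies scoring the waveform frame by frame. The sketch below shows the generic framing step that such a model would label; the 16 kHz sample rate and non-overlapping frames are assumptions for illustration, not SSLCT's actual front end.

```python
# A minimal sketch of splitting a waveform into 20 ms frames for per-frame real/synthetic labeling.
import numpy as np

def frame_signal(waveform, sample_rate=16000, frame_ms=20):
    frame_len = int(sample_rate * frame_ms / 1000)             # 320 samples per frame at 16 kHz
    num_frames = len(waveform) // frame_len
    # Each row is one 20 ms frame; a localization model assigns each frame a real/synthetic label.
    return waveform[:num_frames * frame_len].reshape(num_frames, frame_len)

frames = frame_signal(np.random.randn(16000))                   # 1 s of audio -> 50 frames of 20 ms
```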