Search Results - Inner Mongolia University Library

CLIP-Flow: Decoding images encoded in CLIP space

Computational Visual Media, 2024, Vol. 10, No. 6, pp. 1157-1168

Authors: Hao Ma, Ming Li, Jingyuan Yang, Or Patashnik, Dani Lischinski, Daniel Cohen-Or, Hui Huang

Affiliations: Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China; Department of Computer Science, Tel Aviv University, Tel Aviv 6997801, Israel; School of Computer Science and Engineering, the Hebrew University of Jerusalem, Jerusalem 91904, Israel

This study introduces CLIP-Flow, a novel network for generating images from a given image or *** effectively utilize the rich semantics contained in both modalities, we designed a semantics-guided methodology for image- and text-to-image *** particular, we adopted Contrastive Language-Image Pretraining (CLIP) as an encoder to extract semantics and StyleGAN as a decoder to generate images from such ***, to bridge the embedding space of CLIP and latent space of StyleGAN, real NVP is employed and modified with activation normalization and invertible *** the images and text in CLIP share the same representation space, text prompts can be fed directly into CLIP-Flow to achieve text-to-image *** conducted extensive experiments on several datasets to validate the effectiveness of the proposed image-to-image synthesis *** addition, we tested on the public dataset Multi-Modal CelebA-HQ, for text-to-image *** validated that our approach can generate high-quality text-matching images, and is comparable with state-of-the-art methods, both qualitatively and quantitatively.

Keywords: image-to-image; text-to-image; contrastive language-image pretraining (CLIP); flow; StyleGAN
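
The abstract above names real NVP as the bridge between the CLIP embedding space and the StyleGAN latent space. As a rough illustration of why such a flow can be inverted exactly, the sketch below implements one affine coupling layer in plain Python. The functions `toy_log_scale` and `toy_shift` are hypothetical stand-ins for learned networks, and the sketch does not reproduce the authors' architecture, activation normalization, or invertible convolution.

```python
import math


def toy_log_scale(x_a, n):
    # Hypothetical stand-in for a learned log-scale network (assumption).
    # tanh keeps the value bounded so exp() stays well behaved.
    return [math.tanh(sum(x_a))] * n


def toy_shift(x_a, n):
    # Hypothetical stand-in for a learned shift network (assumption).
    return [0.5 * sum(x_a)] * n


def coupling_forward(x):
    """Pass the first half through unchanged; affinely transform the second half."""
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    log_s = toy_log_scale(x_a, len(x_b))
    t = toy_shift(x_a, len(x_b))
    y_b = [b * math.exp(s) + shift for b, s, shift in zip(x_b, log_s, t)]
    return x_a + y_b


def coupling_inverse(y):
    """Undo coupling_forward: the untouched half lets us recompute scale and shift."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    log_s = toy_log_scale(y_a, len(y_b))
    t = toy_shift(y_a, len(y_b))
    x_b = [(b - shift) * math.exp(-s) for b, s, shift in zip(y_b, log_s, t)]
    return y_a + x_b


embedding = [0.3, -1.2, 0.7, 2.0]  # made-up values, not a real CLIP embedding
latent = coupling_forward(embedding)
recovered = coupling_inverse(latent)
```

Because the first half of the vector is left untouched, the inverse can recompute the same scale and shift and recover the input up to floating-point error.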

Enhanced GRU-BiLSTM Technique for Crop Yield Prediction

Multimedia Tools and Applications, 2024, Vol. 83, No. 41, pp. 89003-89028

Authors: Vashisht, Swati; Kumar, Praveen; Trivedi, Munesh Chandra

Affiliations: Computer Science & Engineering, Amity University Uttar Pradesh, Noida 201301, India; Department of Computer Engineering, Astana IT University, Astana, Kazakhstan; Computer Science & Engineering, NIT Agartala, Tripura, Agartala 799046, India

Agriculture is the major source of food and contributes significantly to employment in India, and the economy is closely tied to the outcomes of crop management, which are determined by the final yield and market prices. Real-time observation emerges as a critical determinant of overall crop production success. Recognizing the importance of insightful analysis and precise crop yield prediction for effective farming practices, this study proposes an enhanced model for accurate yield forecasting. The pre-processing steps of the proposed model include Min-Max normalization, deletion of irrelevant data, and filling in of missing values. The pre-processed data is then subjected to feature extraction using an Improved Shearlet transform (IST). After feature extraction, feature selection is done using an Enhanced multi-objective Grey Wolf optimization (EMGWO) technique. Finally, the prediction is made using an enhanced Gated Recurrent Unit with a Bidirectional LSTM (GRU-BiLSTM) model. This enhanced the accuracy (97%), precision (93%), recall (97.25%) and F-measure (95.14%) of agricultural yield predictions. Various error measures, such as RMSE, MSE, MAE, MedAE, R2 and MSLE, are compared for the proposed model and other existing techniques. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Feature extraction
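
The pre-processing steps named in the abstract above can be illustrated with a minimal sketch. Min-Max normalization rescales a feature to the range [0, 1]. The mean-based filling of missing values shown here is an assumption chosen for illustration, since the abstract does not say how missing values are handled.

```python
def fill_missing(values):
    """Replace None entries with the mean of the observed values.

    Mean imputation is an assumption; the abstract does not specify the method.
    """
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("no observed values to impute from")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map it to zeros by convention.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


rainfall = [10.0, None, 30.0]  # made-up sample column
normalized = min_max_normalize(fill_missing(rainfall))
```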

Enhancing energy efficiency: a protocol assessment in multi-hop mesh-based IOUT networks

Multimedia Tools and Applications, 2024, Vol. 83, No. 37, pp. 84999-85026

Authors: Sisodia, Ankur; Vishnoi, Swati; Upadhyay, Sachin; Goswami, Jayati Krishna; Khenwar, Medha; Yadav, Ajay Kumar

Affiliations: Department of Computer Engineering and Applications, GLA University, Mathura, India; Department of Computer Science and Engineering, GL Bajaj Group of Institutions, Mathura, India; Department of Computer Science, Banasthali Vidyapith, Rajasthan, Jaipur, India

The Internet of Underwater Things (IoUT) relies on underwater acoustic sensors, which have limited resources such as battery power and bandwidth. The exchange of data among these sensors faces challenges like propagation delay, node displacement, and environmental errors, making network maintenance difficult. The objective of this study is to address the energy efficiency and performance issues in IoUT networks by proposing and evaluating an energy-efficient routing protocol called Efficient Cost Wakeup Routing Protocol (ECWRP). To achieve the objective, the study focuses on two key parameters: Cost and Duty Cycle. The Duty Cycle parameter helps in reducing undesirable impacts during underwater communications, improving the performance of the routing protocol. The Cost parameter is utilized to select the most efficient path for data transmission, considering factors such as transmitting power levels. The protocol is applied to a multi-hop mesh-based network. The proposed ECWRP routing protocol is assessed through simulations, demonstrating its superior efficiency compared to the Ride algorithm. By eliminating unnecessary handshaking and optimizing route selection, ECWRP significantly enhances energy efficiency and overall performance within the IoUT network. The study's findings on the enhanced energy efficiency and performance improvements achieved by the ECWRP protocol hold promising implications for the design and optimization of IoUT networks, paving the way for more sustainable and effective communication systems in underwater environments. In conclusion, the study demonstrates the effectiveness of the Efficient Cost Wakeup Routing Protocol (ECWRP) in enhancing energy efficiency and performance in multi-hop mesh-based IoUT networks. The protocol's utilization of the Duty Cycle parameter reduces undesirable impacts, while the Cost parameter enables the selection of the most efficient path for data transmission. The results confirm the superiority of the ECWRP protoc

Keywords: Energy efficiency
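
The abstract above says a Cost parameter selects the most efficient path in a multi-hop mesh. The sketch below shows the general idea using Dijkstra's shortest-path algorithm over per-link costs. This is a swapped-in textbook technique, not ECWRP itself: the abstract does not give the cost formula, and the node names and cost values are made up for illustration.

```python
import heapq


def cheapest_path(graph, source, target):
    """Return (total_cost, path) for the lowest-cost route, via Dijkstra.

    graph maps each node to a dict of {neighbor: link_cost}. Link costs are
    assumed non-negative, e.g. a hypothetical transmit-power figure per hop.
    """
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, link_cost in graph.get(node, {}).items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return float("inf"), []  # target unreachable


# Made-up mesh: three sensor nodes relaying toward a surface sink.
mesh = {
    "A": {"B": 2.0, "C": 5.0},
    "B": {"C": 1.0, "sink": 6.0},
    "C": {"sink": 2.0},
    "sink": {},
}
total_cost, route = cheapest_path(mesh, "A", "sink")
```

In this toy mesh the three-hop route through B and C is cheaper than either two-hop route, which shows that a cost-driven choice need not be the route with the fewest hops.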

Agricultural supply chain management using hyperledger and AIOT

Journal of Ambient Intelligence and Humanized Computing, 2025, Vol. 16, No. 4, pp. 471-485

Authors: Jha, Anurag Kumar; Raj, Aparna; Jha, Ashish Kumar; Shetty, Sujala D.

Affiliations: Department of Computer Science, Bits-Pilani Dubai Campus, Dubai, United Arab Emirates; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, United States

Supply chain management and Hyperledger are two interconnected domains. They leverage blockchain technology to enhance efficiency, transparency, and security in supply chain operations. Together, they provide a decentralized, traceable, and real-time platform for recording and managing transactions. This combination is particularly valuable for industries dealing with sensitive goods, as it provides accurate traceability and real-time information. This paper explores the integration of supply chain management with Hyperledger blockchain technology to enhance efficiency, transparency, and security in supply chain operations. We propose a decentralized Hyperledger Fabric blockchain network to improve traceability, security, and efficiency by monitoring environmental conditions. This approach is particularly beneficial for transporting sensitive goods, such as medical supplies and perishable items, by ensuring optimal conditions and real-time data accessibility. The integration of Artificial Intelligence (AI) further enhances insights, reduces waste, and improves overall efficiency. By utilizing a distributed network free from third-party intermediaries, the system ensures immutability and remote accessibility, addressing challenges related to transporting heat and humidity sensitive products. Our experimental assessment demonstrates the benefits of private blockchain technologies, including enhanced security, regulatory compliance, compatibility, flexibility, and scalability. This study presents a detailed methodology for developing a traceable, efficient, and sustainable agricultural supply chain. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.

Keywords: Efficiency

Text-driven clothed human image synthesis with 3D human model estimation for assistance in shopping

Multimedia Tools and Applications, 2025, Vol. 84, No. 1, pp. 167-200

Authors: Karkuzhali, S.; Aasim, A. Syed; StalinRaj, A.

Affiliations: Department of Computer Science and Engineering, Mepco Schlenk Engineering College, Tamil Nadu, Sivakasi, India

Online shopping has become an integral part of modern consumer culture. Yet, it is plagued by challenges in visualizing clothing items based on textual descriptions and estimating their fit on individual body types. In this work, we present an innovative solution to address these challenges through text-driven clothed human image synthesis with 3D human model estimation, leveraging the power of Vector Quantized Variational AutoEncoder (VQ-VAE). Creating diverse and high-quality human images is a crucial yet difficult undertaking in vision and graphics. With the wide variety of clothing designs and textures, existing generative models are often not sufficient for the end user. In this proposed work, we introduce a solution in which various datasets are passed through several models, so that an optimized solution can be provided along with high-quality images in a range of postures. We use two distinct procedures to create full-body 2D human photographs starting from a predetermined human posture. 1) The provided human pose is first converted to a human parsing map with some sentences that describe the shapes of clothing. 2) The model developed is then given further information about the textures of clothing as an input to produce the final human image. The model is split into two sections: a coarse-level codebook that deals with overall results and a fine-level codebook that deals with minute detailing. The fine-level codebook concentrates on the minutiae of textures, whereas the coarse-level codebook covers the depictions of textures in structures. The decoder, trained together with the hierarchical codebooks, converts the anticipated indices at various levels to human images. The created image can be conditioned on the fine-grained text input thanks to the use of a blend of experts. The quality of clothing textures is refined by the prediction of finer-level indices. Implementing these strategies can result

Keywords: Vector quantization
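
The codebooks mentioned in the abstract above rest on vector quantization, in which a continuous feature vector is replaced by the index of its nearest codebook entry. The following minimal sketch shows that lookup for a single codebook. The codebook values are made up, and the hierarchical coarse and fine arrangement described by the authors is not reproduced here.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return (index, entry) of the codebook entry closest to vector."""
    index = min(range(len(codebook)),
                key=lambda i: squared_distance(vector, codebook[i]))
    return index, codebook[index]


# Made-up two-dimensional codebook with three entries.
toy_codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
code_index, code_vector = quantize([0.9, 1.2], toy_codebook)
```

A decoder then works from the sequence of indices rather than from the raw features, which is what allows indices to be predicted from text and decoded into an image.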

Low-light image enhancement using the illumination boost algorithm along with the SKWGIF method

Multimedia Tools and Applications, 2025, Vol. 84, No. 17, pp. 18651-18685

Authors: Radmand, Elnaz; Saberi, Erfan; Sorkhi, Ali Ghanbari; Pirgazi, Jamshid

Affiliations: Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran

Low-light image enhancement is highly desirable for outdoor image processing and computer vision applications. Research conducted in recent years has shown that images taken in low-light conditions often pose two main problems. The first is low visibility (i.e., small pixel intensities). The second is that, due to the low signal-to-noise ratio, noise becomes prominent and obscures the image content. For this reason, images with low noise are usually employed in this application, a practice not possible in the real world. In this regard, a hybrid method is proposed which is based on the illumination boost algorithm (IBA) and weighted guided image filtering with steering kernel (SKWGIF). IBA is used to raise the values of low and medium intensity pixels while preventing excessive increases in high-value pixels. SKWGIF is employed to denoise and augment image details by adjusting its parameters based on the output of the previous step. The proposed method utilizes the LOL v1, LOL v2 and ExDark datasets. Our hybrid method’s comprehensive approach to low-light image enhancement produced better outcomes than previous methods. Quantitative results have shown that the proposed model outperforms state-of-the-art (SOTA) models on the LOL v1 dataset, achieving a PSNR value of 18.80. Through the successful resolution of visibility and noise reduction issues, our approach led to notable gains in image quality measures, including SSIM, PSNR, BRISQUE, and NIQE. IBA’s ability to selectively boost intensity improved image visibility without producing overexposure artifacts, and SKWGIF effectively reduced noise while enhancing image details. The combined effect produced improved image quality that was superior to that of single techniques or already-used hybrid approaches. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Image enhancement
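
The abstract above reports a PSNR value as its headline metric. For readers unfamiliar with it, the sketch below computes the standard peak signal-to-noise ratio between a reference image and an enhanced image, both given as nested lists of grayscale pixel values. The 8-bit peak of 255 and the tiny sample images are assumptions for illustration and are unrelated to the datasets named in the abstract.

```python
import math


def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized grayscale images."""
    ref = [p for row in reference for p in row]
    enh = [p for row in enhanced for p in row]
    if len(ref) != len(enh):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(ref, enh)) / len(ref)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Made-up 2x2 images: two pixels differ by 10 intensity levels.
ground_truth = [[100, 100], [100, 100]]
output = [[110, 90], [100, 100]]
score = psnr(ground_truth, output)
```

Higher values indicate that the enhanced image is closer to the reference, and identical images give an infinite score.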

Deep learning approaches to address cold start and long tail challenges in recommendation systems: a systematic review

Multimedia Tools and Applications, 2025, Vol. 84, No. 5, pp. 2293-2325

Authors: Jangid, Manisha; Kumar, Rakesh

Affiliations: Department of Computer Science & Engineering, Central University of Haryana, Haryana, India

Recommendation systems (RS) have become prevalent across different domains including music, e-commerce, e-learning, entertainment, and social media to address the issue of information overload. While traditional RS approaches have achieved significant success in delivering recommendations, they still face issues including sparse data, diversity, cold start, and the long tail problem. The emergence of deep learning as a prominent and extensively studied topic has shown significant potential in addressing these challenges in RS. Deep learning captures intricate patterns of interaction and precisely reflects user preferences, allowing for encoding complex abstractions in data representation and enhancing information processing capabilities. This paper provides an extensive survey of the existing literature on recommendation systems. First, we provide a foundational understanding of the core concepts and terminology of recommendation systems and the significance of deep learning. Second, we discuss the original studies conducted on deep learning methods and solutions to address "cold start and long tail" challenges in recommendation. Third, we examine the potential future directions of research pertaining to deep learning-based recommender systems (DLRS). Our review provides valuable insights for both researchers and practitioners in using deep learning to address challenges in recommendation and in developing effective and efficient recommendation systems. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Contrastive Learning

A Communication-Based Solution to Detect Islanding using Correlation Element in Distributed Generation Environment

Distributed Generation and Alternative Energy Journal, 2025, Vol. 40, No. 2, pp. 427-456

Authors: Ranjan, Sanjeev; Kumar, Munna; Kumar, Jitendra; Mahanty, R.N.; Sood, Vijay K.

Affiliations: Department of Electrical Engineering, NIT Jamshedpur, Jharkhand 831014, India; Department of Electrical Engineering, ARMIET, Maharashtra, Thane 421601, India; Department of Electrical, Computer and Software Engineering, Ontario Tech University, ON, Canada

Issues regarding safety, circuit breaker reclosing, power quality, and regulatory compliance are identified when islanding is to be detected in a microgrid. In this paper, a novel communication-based, passive islanding detection method (IDM) is proposed to identify islanding in a microgrid to address these issues. This proposed method is based on correlation using the impedance measurement at the point of common coupling (PCC) and distributed generation (DG). The methodology is validated on a modified IEEE-13 bus system through a Phasor Measurement Unit (PMU) with a set threshold to discriminate between islanding and non-islanding events. The benefits of this proposed method are fast and accurate islanding detection. This IDM can tackle all the concerns regarding islanding detection in the cases of active power mismatch (APM), reactive power mismatch (RPM), DG disconnection with the presence of noise, unbalanced loads, irradiance change, weak and/or strong grid without providing any false signal as per IEEE UL1741 and IEEE STD. 929-2000. The authentication of the proposed scheme is also carried out for non-islanding events such as altered faults, non-linear loads, load switching, capacitor and inductor switching, feeder disconnection, and motor swapping, where all tests endorse the applicability of the proposed technique. The proposed methodology is validated both with simulation and Opal-RT laboratory results. © 2025 River Publishers.

Keywords: Electric impedance measurement
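
The abstract above describes a detection method based on correlating impedance measurements at the PCC and the DG and comparing the result against a set threshold. A minimal sketch of that general pattern follows, using the Pearson correlation coefficient. The choice of Pearson correlation, the threshold of 0.9, the rule that low correlation signals islanding, and the sample values are all assumptions made for illustration, since the abstract does not give these details.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no defined correlation; treat as none
    return cov / math.sqrt(var_x * var_y)


def is_islanding(z_pcc, z_dg, threshold=0.9):
    # Assumption: the two impedance series track each other while grid-connected,
    # so a correlation below the threshold is flagged as an islanding event.
    return pearson(z_pcc, z_dg) < threshold


# Made-up impedance magnitudes over a short measurement window.
tracking = is_islanding([1.0, 2.0, 3.0, 4.0], [2.1, 4.0, 6.2, 8.0])
diverging = is_islanding([1.0, 2.0, 3.0, 4.0], [8.0, 6.0, 4.5, 2.0])
```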

MindScore: quantifying human preference for text-to-image generation through multi-view lens

Science China (Information Sciences), 2025, Vol. 68, No. 6, pp. 72-85

Authors: Yiqi TONG, Jiarui ZHANG, Shaohang WEI, Wei GUO, Fuzhen ZHUANG, Deqing WANG, Xi YANG, Richeng XUAN

Affiliations: School of Artificial Intelligence, Beihang University; School of Computer Science and Engineering, Beihang University; Department of Computer Science and Engineering, Shanghai Jiao Tong University; School of Computer Science, Peking University; State Key Laboratory of Complex & Critical Software Environment, Beihang University; Beijing Academy of Artificial Intelligence

Understanding and quantifying the capabilities of foundation models, particularly in text-to-image (T2I) generation, is crucial for verifying their alignment with human expectations and practical requirements. However, evaluating T2I foundation models presents significant challenges due to the complex, multi-dimensional psychological factors that influence human preferences for generated images. In this work, we propose MindScore, a multi-view framework for assessing the generation capacity of T2I models through the lens of human preference. Specifically, MindScore decomposes the evaluation into four complementary modules that align with human cognitive processing of images: matching, faithfulness, quality, and realness. The matching module quantifies the semantic alignment between generated images and prompt text, while the faithfulness module measures how accurately the images reflect specific prompt details. Furthermore, we incorporate quality and realness modules to capture deeper psychological preferences, recognizing that unpleasant or distorted images often trigger adverse human responses. Extensive experiments on three T2I datasets with human preference annotations clearly validate the superiority of our proposed MindScore over various state-of-the-art baselines. Our case studies further reveal that MindScore offers valuable insights into T2I generation from a human-centric perspective.

Keywords: text-to-image generation; foundation models; human preference evaluation; multi-view assessment; language and vision

An ensemble based approach for violence detection in videos using deep transfer learning

Multimedia Tools and Applications, 2025, Vol. 84, No. 12, pp. 11001-11025

Authors: Kaur, Gurmeet; Singh, Sarbjeet

Affiliations: Department of Computer Science and Engineering, UIET, Panjab University, Chandigarh, India

The detection of violence in videos has become an extremely valuable application in real-life situations, as it aims to maintain and protect people’s safety. Despite the complexities inherent in videos and the abrupt nature of violent actions, the field has seen several approaches, yet consistent performance remains elusive, especially on advanced real-life datasets. As a solution, the paper proposes a Bagging ensemble based approach comprising three pretrained models integrated with stacked Long Short-Term Memory (LSTM) to enhance individual model performance. This ensemble approach is rigorously analyzed on two publicly accessible datasets, RLVS and RWF-2000, providing remarkable accuracy (96.6%, 92.7%) and F1-scores (96.6%, 93.0%). Additionally, a cross-dataset analysis demonstrates the model’s ability to generalize across diverse datasets. Furthermore, an ablation study highlights the efficacy and optimal selection of components in augmenting the proposed ensemble’s efficiency. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Long short-term memory
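
The abstract above combines three pretrained models in a Bagging ensemble. One common way to aggregate such an ensemble is majority voting over the per-video labels predicted by each model, sketched below. Majority voting is an assumption here, as the abstract does not state how the authors combine predictions, and the label lists are made up.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one list by majority vote per sample.

    predictions_per_model holds one list of labels per model, all of equal
    length. With an odd number of models and binary labels there are no ties.
    """
    voted = []
    for labels in zip(*predictions_per_model):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Made-up outputs of three hypothetical models on three clips
# (1 = violent, 0 = non-violent).
model_outputs = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
]
ensemble_labels = majority_vote(model_outputs)
```

Each clip receives the label that at least two of the three models agree on, so a single model's mistake on a clip is outvoted.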
