Search Results - Inner Mongolia University Library

MMInstruct: a high-quality multi-modal instruction tuning dataset with extensive diversity

Science China (Information Sciences), 2024, Vol. 67, No. 12, pp. 36-51

Authors: Yangzhou LIU, Yue CAO, Zhangwei GAO, Weiyun WANG, Zhe CHEN, Wenhai WANG, Hao TIAN, Lewei LU, Xizhou ZHU, Tong LU, Yu QIAO, Jifeng DAI. Affiliations: School of Computer Science, Nanjing University; School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University; Shanghai AI Laboratory; School of Computer Science, Fudan University; Department of Information Engineering, The Chinese University of Hong Kong; SenseTime Research; Department of Electronic Engineering, Tsinghua University

Despite the effectiveness of vision-language supervised fine-tuning in enhancing the performance of vision large language models (VLLMs), existing visual instruction tuning datasets have the following limitations. (1) Instruction annotation quality: although existing VLLMs exhibit strong performance, instructions generated by those advanced VLLMs may still suffer from inaccuracies, such as hallucinations. (2) Instruction and image diversity: the limited range of instruction types and the lack of diversity in image data may impair the model's ability to generate diversified outputs that are closer to real-world scenarios. To address these challenges, we construct a high-quality, diverse visual instruction tuning dataset, MMInstruct, which consists of 973k instructions from 24 domains. There are four instruction types: judgment, multiple-choice, long visual question answering, and short visual question answering. To construct MMInstruct, we propose an instruction generation data engine that leverages GPT-4V, GPT-3.5, and manual correction. Our instruction generation engine enables semi-automatic, low-cost, and multi-domain instruction generation at 1/6 the cost of manual construction. Through extensive experimental validation and ablation experiments, we demonstrate that MMInstruct can significantly improve the performance of VLLMs, e.g., the model fine-tuned on MMInstruct achieves new state-of-the-art performance on 10 out of 12 benchmarks. The code and data shall be available at https://***/yuecao0119/MMInstruct.

Keywords: instruction tuning multi-modal multi-domain dataset vision large language model

Representing a Model for the Anonymization of Big Data Stream Using In-Memory Processing

Annals of Data Science, 2025, Vol. 12, No. 1, pp. 223-252

Authors: Shamsinejad, Elham; Banirostam, Touraj; Pedram, Mir Mohsen; Rahmani, Amir Masoud. Affiliations: Department of Computer Engineering, Central Tehran Branch, Islamic Azad University, Tehran, Iran; Department of Electrical and Computer Engineering, Faculty of Engineering, Kharazmi University, Tehran, Iran; Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran

In light of the escalating privacy risks in the big data era, this paper introduces an innovative model for the anonymization of big data streams, leveraging in-memory processing within the Spark framework. The approach is founded on the principle of K-anonymity and propels the field forward by critically evaluating various anonymization methods and algorithms, benchmarking their performance with respect to time and space complexities. A distinctive formula for optimized cluster determination in the K-means algorithm is presented, along with a novel tuple expiration time strategy for the efficient purging of clusters. The integration of these components into Spark’s RDD and MLlib modules results in a significant decrease in execution time and data loss rates, even with increasing data volumes. The paper’s notable contributions are its methodological advancements that offer a robust, scalable solution for data anonymization, safeguarding user privacy without sacrificing data utility or processing efficiency. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.

Keywords: Big data
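
The k-anonymity principle that the abstract above builds on can be illustrated with a minimal sketch. This is not the paper's Spark-based streaming method: it only checks whether a small in-memory table is k-anonymous, meaning every combination of quasi-identifier values is shared by at least k records. The function name `is_k_anonymous`, the field names, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every quasi-identifier combination occurs at least k times.

    records: list of dicts (hypothetical, already generalized data)
    quasi_identifiers: list of keys treated as quasi-identifiers (an assumption)
    k: minimum allowed size of each equivalence class
    """
    # Count how many records share each combination of quasi-identifier values.
    groups = Counter(tuple(rec[q] for q in quasi_identifiers) for rec in records)
    # The table is k-anonymous only if no equivalence class is smaller than k.
    return all(count >= k for count in groups.values())


# Hypothetical generalized records: ages are bucketed, ZIP codes are masked.
records = [
    {"age": "30-39", "zip": "100**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "100**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "200**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "200**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], 2))  # True
print(is_k_anonymous(records, ["age", "zip"], 3))  # False
```

In a clustering-based scheme such as the one the abstract describes, a check of this kind would typically be applied per cluster before the generalized tuples are released.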

Embeddings Between State and Action Based Probabilistic Logics

Formal Aspects of Computing, 2025, Vol. 37, No. 2, pp. 1-58

Authors: Das, Susmoy; Sharma, Arpit. Affiliation: Department of Electrical Engineering and Computer Science, Indian Institute of Science Education and Research Bhopal, Madhya Pradesh, Bhopal, India

This article defines embeddings between state-based and action-based probabilistic logics which can be used to support probabilistic model checking. First, we slightly modify the model embeddings proposed in the literature to allow invisible computation steps and the preservation of forward and backward bisimulation relations. Next, we propose the syntax and semantics of an action-based Probabilistic Computation Tree Logic (APCTL) and an action-based PCTL∗ (APCTL∗) interpreted over action-labeled discrete-time Markov chains (ADTMCs). We show that both these logics are strictly more expressive than the probabilistic variant of Hennessy-Milner logic (prHML). We define an embedding aldl which can be used to construct APCTL∗ formulae from PCTL∗ formulae and an embedding sldl from APCTL∗ formulae to PCTL∗ formulae. Similarly, we define the embeddings and from PCTL to APCTL and APCTL to PCTL, respectively. We also define the reward-based variant of APCTL (APRCTL) interpreted over action-based Markov Reward Models (AMRM), and accordingly modify the logical embeddings and which allows us to take into account the notion of rewards. Additionally, we also show that the idea of rewards can be used to reason about the bounded until operator in PCTL and APCTL. Finally, we prove that our logical embeddings combined with the model embeddings enable one to minimize, analyze, and verify probabilistic models in one domain using state-of-the-art tools and techniques developed for the other domain. In order to validate the efficacy of our theoretical framework, we apply it to two case studies using the probabilistic symbolic model checker (PRISM). © 2025 Copyright held by the owner/author(s). Publication rights licensed to ACM.

Keywords: Probabilistic Markov chain equivalence bisimulation logic embeddings rewards model checking

PS-CoT-Adapter: adapting plan-and-solve chain-of-thought for ScienceQA

Science China (Information Sciences), 2025, Vol. 68, No. 1, pp. 392-393

Authors: Qun LI, Haixin SUN, Fu XIAO, Yiming WANG, Xinping GAO, Bir BHANU. Affiliations: School of Computer Science, Nanjing University of Posts and Telecommunications; Purple Mountain Laboratories; Department of Electrical and Computer Engineering, University of California at Riverside

Large language models (LLMs) have recently shown remarkable performance in a variety of natural language processing (NLP) *** further explore LLMs' reasoning abilities in solving complex problems, recent research [1-3] has investigated chain-of-thought (CoT) reasoning in complex multimodal scenarios, such as science question answering (ScienceQA) tasks [4], by fine-tuning multimodal models through human-annotated CoT ***, collected CoT rationales often miss the necessary reasoning steps and specific expertise.

Keywords: chain adapting plan-and-solve ps-cot-adapter scienceqa thought

A Review of Anonymization Algorithms and Methods in Big Data

Annals of Data Science, 2025, Vol. 12, No. 1, pp. 253-279

In the era of big data, with the increase in volume and complexity of data, the main challenge is how to use big data while preserving the privacy of users. This study was conducted with the aim of finding a solution to this challenge. In this study, we examined various data anonymization methods, including differential privacy, advanced encryption, and strong access controls. In addition, the operation, advantages, disadvantages, and use of these methods, the challenges of adapting these methods to big data, and possible solutions for them were also examined. Our results show that traditional data anonymization methods lack scalability, leading to privacy breaches and data loss. When faced with large volumes of data, these methods may not be able to fully process the data. Also, these methods may be ineffective against re-identification attacks, linkage attacks, and inference attacks. We introduced emerging methods that are capable of providing improved privacy with minimal data loss. These methods scale to big data. Finally, we examined future research directions and raised important questions that can help improve existing algorithms or develop new methods, and better manage the complexity and scale of unstructured data. © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2024.

Keywords: Big data

Dynamic Modeling and Quantum-Enhanced Forecasting of Multi-Seasonal Energy Prices in Simulated Microgrid Environments

IEEE Access, 2025, Vol. 13, pp. 90362-90388

Authors: Dash, Ritesh; Sinha, Anupa; Jyotheeswara Reddy, K.; Dhanamjayulu, C.; Kamwa, Innocent. Affiliations: Kalinga University, Research Scholar, Computer Science Engineering, Raipur, India; Kalinga University, Computer Science Engineering, Raipur, India; REVA University, School of Electrical and Electronics Engineering, Bangalore, India; Vellore Institute of Technology, School of Electrical Engineering, Vellore 632014, India; Laval University, Department of Electrical Engineering and Computer Engineering, Quebec G1V 0A6, Canada

The study addresses the challenge of forecasting per-unit energy prices in a microgrid environment consisting of solar and hydro power resources under multi-seasonal *** deep learning techniques such as LSTM, GRU and ESN often struggle with non-linear dependencies and volatility in the energy market. To overcome these, we propose a hybrid framework incorporating Adiabatic Quantum Computing (AQC) for electricity price forecasting. The proposed AQC model encodes 32 system and market related variables into quantum states and applies adiabatic evolution to derive optimized price predictions. Simulation results using a real microgrid data set-up based on HIL show that AQC reduces forecasting error by 17.03% compared to LSTM, 14.29% to GRU and 13.88% to ESN over 24-hour and 48-hour horizons. The enhanced accuracy and robustness of the quantum-assisted model demonstrate its potential for next-generation energy market forecasting and decision making *** entire framework is tested using a synthetic microgrid dataset designed to emulate real-world seasonal and operational dynamics. While this enables controlled validation of the models, the generalizability of the results to real-world deployment requires further empirical evaluation on physical microgrid data sets. © 2013 IEEE.

Keywords: Qubits

Spectral analysis of bone-conducted speech using modified linear prediction

International Journal of Speech Technology, 2024, Vol. 27, No. 4, pp. 1039-1053

Authors: Ohidujjaman; Hasan, Mahmudul; Zhang, Shiming; Huda, Mohammad Nurul; Uddin, Mohammad Shorif. Affiliations: Computer Science and Engineering, Daffodil International University, Dhaka 1216, Bangladesh; Computer Science and Engineering, Comilla University, Comilla 3506, Bangladesh; School of Electrical and Information, Northeast Agricultural University, Harbin 150030, China; Computer Science and Engineering, United International University, Dhaka 1212, Bangladesh; Computer Science and Engineering, Green University of Bangladesh, Kanchon 1460, Bangladesh; Computer Science and Engineering, Jahangirnagar University, Savar 1342, Bangladesh

This paper improves the performance of linear prediction (LP) in precise spectral estimation of bone-conducted (BC) speech. Inherently, BC speech contains a wide spectral dynamic range that causes ill conditioning in the autocorrelation (ACR) method and its variants, where the Levinson–Durbin (L–D) algorithm is commonly implemented. Instead of the conventional LP-based spectral estimation methods, we utilize the covariance-based method, specifically the modified covariance (MC) method, where the orthogonal decomposition algorithm is deployed. In this paper, we derive the MC method from the least squares (LS) technique for BC speech analysis. The MC method reduces the eigenvalue expansion that compresses the spectral dynamic range of the BC speech signal. This spectral dynamic range compression reduces the ill conditioning of LP. Through the proposed method using synthetic BC speech, the resulting power spectrum provides more accurate peaks than the conventional methods. The validity of the proposed method is also analyzed by inspecting real BC speech. This study reveals the utmost use of BC speech in speech processing systems. The experimental results demonstrate that the proposed method provides more accurate spectral estimation for synthetic and real BC speech compared with conventional spectral estimation methods. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Autocorrelation
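
For orientation, the conventional baseline named in the abstract above (the autocorrelation method solved with the Levinson–Durbin recursion) can be sketched as follows. This is not the paper's modified covariance method; it is only the textbook recursion, written under the assumed convention that the prediction-error filter is A(z) = 1 + a_1 z^-1 + ... + a_p z^-p. The function name `levinson_durbin` and the sample autocorrelation values are illustrative assumptions.

```python
def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation values r[0..order].

    Returns (a, err), where a[0] == 1.0 and a[1..order] are the coefficients
    of the prediction-error filter, and err is the final prediction error power.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current prediction error with the next lag.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        # Error power shrinks by (1 - k^2) at each stage.
        err *= 1.0 - k * k
    return a, err


# Hypothetical autocorrelation of a first-order process with r[k] = 0.5 ** k.
coeffs, error_power = levinson_durbin([1.0, 0.5, 0.25], 2)
print(coeffs, error_power)
```

The division by `err` hints at why a wide spectral dynamic range is a problem for this approach: as the error power becomes very small relative to r[0], the recursion becomes numerically sensitive.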

Enhancing Renewable Energy Integration: A Gaussian-Bare-Bones Levy Cheetah Optimization Approach to Optimal Power Flow in Electrical Networks

Computer Modeling in Engineering & Sciences, 2024, Vol. 140, No. 8, pp. 1339-1370

Authors: Ali S. Alghamdi, Mohamed A. Zohdy, Saad Aldoihi. Affiliations: Department of Electrical Engineering, College of Engineering, Majmaah University, Al-Majmaah 11952, Saudi Arabia; Electrical and Computer Engineering Department, Oakland University, Rochester, MI 48309, USA; Institute of Earth and Space Science, King Abdulaziz City for Science and Technology (KACST), Riyadh 11442, Saudi Arabia; Computer Science and Systems Engineering, Institute Polytechnique de Paris, Palaiseau Cedex, France

In the contemporary era, the global expansion of electrical grids is propelled by various renewable energy sources (RESs). Efficient integration of stochastic RESs and optimal power flow (OPF) management are critical for network *** study introduces an innovative solution, the Gaussian Bare-Bones Levy Cheetah Optimizer (GBBLCO), addressing OPF challenges in power generation systems with stochastic *** primary objective is to minimize the total operating costs of RESs, considering four functions: overall operating costs, voltage deviation management, emissions reduction, voltage stability index (VSI) and power loss ***, a carbon tax is included in the objective function to reduce carbon *** scrutiny, using modified IEEE 30-bus and IEEE 118-bus systems, validates GBBLCO’s superior performance in achieving optimal *** results demonstrate GBBLCO’s efficacy in six optimization scenarios: total cost with valve point effects, total cost with emission and carbon tax, total cost with prohibited operating zones, active power loss optimization, voltage deviation optimization and enhancing voltage stability index (VSI). GBBLCO outperforms conventional techniques in each scenario, showcasing rapid convergence and superior solution ***, GBBLCO navigates complexities introduced by valve point effects, adapts to environmental constraints, optimizes costs while considering prohibited operating zones, minimizes active power losses, and optimizes voltage deviation by enhancing the voltage stability index (VSI) *** research significantly contributes to advancing OPF, emphasizing GBBLCO’s improved global search capabilities and ability to address challenges related to local *** emerges as a versatile and robust optimization tool for diverse challenges in power systems, offering a promising solution for the evolving needs of renewable energy-integrated power grids.

Keywords: Renewable energy integration optimal power flow stochastic renewable energy sources gaussian-bare-bones levy cheetah optimizer electrical network optimization carbon tax optimization

Detecting Inaudible Voice Commands via Acoustic Attenuation by Multi-channel Microphones

IEEE Transactions on Dependable and Secure Computing, 2024, pp. 1-17

Authors: Ji, Xiaoyu; Zhang, Guoming; Li, Xinfeng; Qu, Gang; Cheng, Xiuzhen; Xu, Wenyuan. Affiliations: Electrical Engineering of Zhejiang University, Hangzhou, China; School of Computer Science and Technology of Shandong University, Qingdao, China; Department of Electrical and Computer Engineering at the University of Maryland, College Park, MD, USA

DolphinAttacks (i.e., inaudible voice commands) modulate audible voices over ultrasounds to inject malicious commands silently into voice assistants and manipulate controlled systems (e.g., doors or smart speakers). Eliminating DolphinAttacks is challenging, if at all possible, since it requires modifying the microphone hardware. In this paper, we design EarArray, a lightweight method that can not only detect such attacks but also identify the direction of attackers without requiring any extra hardware or hardware modification. Essentially, inaudible voice commands are modulated on ultrasounds, which inherently attenuate faster than audible sounds. By inspecting the command sound signals via the built-in multiple microphones on smart devices, EarArray is able to estimate the attenuation rate and thus detect the attacks. We propose a model of the propagation of audible sounds and ultrasounds from the sound source to a voice assistant, e.g., a smart speaker, and illustrate the underlying principle and its feasibility. We implemented EarArray using two specially-designed microphone arrays, and our experiments show that EarArray can detect inaudible voice commands with an accuracy of above 99%, recognize the direction of the attackers with an accuracy of 97.89%, and also detect the laser-based attack with an accuracy of 100%. IEEE

Keywords: Microphones

Efficient Saliency Map Detection for Low-Light Images Based on Image Gradient

IEEE Transactions on Circuits and Systems for Video Technology, 2024, Vol. 34, No. 2, pp. 852-865

Authors: Lin, Chun-Yi; Haq, Muhamad Amirul; Chen, Jiun-Han; Ruan, Shanq-Jang; Naroska, Edwin. Affiliations: National Taiwan University of Science and Technology, Department of Electronic and Computer Engineering, Taipei 10607, Taiwan; Hochschule Niederrhein University of Applied Science, Faculty of Electrical Engineering and Computer Science, Krefeld 47805, Germany

Recently, deep learning has been widely employed across various domains. The Convolutional Neural Network (CNN), a popular deep learning algorithm, has been successfully utilized in object recognition tasks, such as face recognition, vehicle recognition, and license plate recognition. However, conventional methods for object recognition may not be appropriate for low-light image recognition due to information loss in the dark regions and unexpected noise that can impair object quality. Therefore, the development of specialized techniques for low-light image enhancement has become a major research focus for object detection. This paper proposes a gradient-based saliency map detection method with an improved ResNet architecture that outperforms previous works in detecting multiple or large objects. Additionally, the proposed method enhances images with the object as the center and emphasizes foreground-background differences. Compared with previous works, this paper achieves 1.28× improvements in the parameters and 1.32× faster inference speed than the original ResNet architecture. © 1991-2012 IEEE.

Keywords: Neural networks
