Within the vast expanse of computerized language processing, a revolutionary entity known as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to comprehend intricate linguistic patterns and conjure coherent and contextually fitting responses. LLMs are a type of artificial intelligence (AI) that has emerged as a powerful tool for a wide range of tasks, including natural language processing (NLP), machine translation, vision applications, and question answering. This survey provides a comprehensive overview of LLMs, including their history, architecture, datasets, training methods, applications, challenges, and future prospects. We begin by discussing the fundamental concepts of generative AI and the architecture of generative pre-trained transformers (GPT). We then provide an overview of the history of LLMs, their evolution over time, and the different training methods. We also present benchmark datasets for training, fine-tuning, and evaluating LLMs. We then discuss the wide range of tasks in which LLMs are used, along with their applications in different domains, including medicine, education, finance, engineering, agriculture, media, entertainment, politics, and law. We also discuss how LLMs are shaping the future of AI, their increasing role in scientific discovery, and how they can be used to solve real-world problems. Next, we explore the challenges associated with deploying LLMs in real-world scenarios, including ethical considerations, model biases, interpretability, privacy concerns, and computational resource requirements. This survey also highlights techniques for enhancing the robustness and controllability of LLMs and addressing bias, fairness, and quality issues in generative AI. Finally, we conclude by highlighting the future of LLM research and the challenges that need to be addressed to make this technology more reliable and useful. This survey is intended to provide researchers, practitioners, and ent...
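To make the GPT architecture mentioned in this abstract concrete, the following is a minimal NumPy sketch of scaled dot-product attention, the core operation inside transformer blocks. The shapes and random inputs are illustrative assumptions, not details taken from the survey itself.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V -- the core attention operation
    used in GPT-style transformer layers."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

# Toy example: 3 tokens with 4-dimensional embeddings.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4)
```

In a full GPT block this operation is applied per head with learned projections of the input, followed by a feed-forward network; the sketch omits those parts for brevity.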
In this paper, we address zero-shot learning (ZSL), the problem of recognizing categories for which no labeled visual data are available during training. We focus on the transductive setting, in which unlabelled visua...
Robotic grasping in the open world is a critical component of manufacturing and automation processes. While numerous existing approaches depend on 2D segmentation output to facilitate the grasping procedure, accura...
Cross-Site Scripting (XSS) attacks continue to pose a significant threat to web applications, compromising the security and integrity of user data. XSS is a web application vulnerability where malicious scripts are in...
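The injection this abstract describes is commonly mitigated by escaping user-supplied text before embedding it in HTML. A minimal sketch using Python's standard-library `html.escape`; the `render_comment` function and the payload are illustrative assumptions, not taken from the paper:

```python
import html

def render_comment(user_input: str) -> str:
    """Escape user-supplied text before embedding it in HTML,
    a standard defense against reflected and stored XSS."""
    return "<p>" + html.escape(user_input) + "</p>"

payload = "<script>alert('xss')</script>"
print(render_comment(payload))
# <p>&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;</p>
```

Because the angle brackets and quotes are converted to HTML entities, the browser renders the payload as inert text rather than executing it as a script.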
Large Language Models (LLMs) have transformed the natural language processing landscape and brought to life diverse applications. Pretraining on vast web-scale data has laid the foundation for these models, yet the re...
In this work, we present a comprehensive survey on applications of the most recent transformer architecture based on attention in information security. Our review reveals three primary areas of application: Intrusion detection, Anomaly Detection and Malware Detection. We have presented an overview of attention-based mechanisms and their application in each cybersecurity use case, and discussed open grounds for future trends in Artificial Intelligence enabled information security.
Incentive mechanism is crucial for federated learning (FL) when rational clients do not have the same interests in the global model as the server. However, due to system heterogeneity and limited budget, it is generally impractical for the server to incentivize all clients to participate in all training rounds (known as full participation). The existing FL incentive mechanisms are typically designed by stimulating a fixed subset of clients based on their data quantity or system resources. Hence, FL is performed only using this subset of clients throughout the entire training process, leading to a biased model because of data heterogeneity. This paper proposes a game-theoretic incentive mechanism for FL with randomized client participation, where the server adopts a customized pricing strategy that motivates different clients to join with different participation levels (probabilities) for obtaining an unbiased and high-performance model. Each client responds to the server's monetary incentive by choosing its best participation level, to maximize its profit based on not only the incurred local cost but also its intrinsic value for the global model. To effectively evaluate clients' contribution to the model performance, we derive a new convergence bound which analytically predicts how clients' arbitrary participation levels and their heterogeneous data affect the model performance. By solving a non-convex optimization problem, our analysis reveals that the intrinsic value leads to the interesting possibility of bi-directional payment between the server and clients. Experimental results using real datasets on a hardware prototype demonstrate the superiority of our mechanism in achieving higher model performance for the server as well as higher profits for the clients.
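The client best response described in this abstract can be illustrated with a toy utility model: the client picks a participation level (probability) that maximizes its payment plus intrinsic model value minus local cost. The concave square-root value term is an assumption for illustration only; the paper's actual cost, value, and convergence-bound analysis are more elaborate.

```python
import numpy as np

def best_response(price, cost, value, grid=1001):
    """Toy best response: participation level q in [0, 1] maximizing
    payment (price * q) + intrinsic value (value * sqrt(q), concave)
    - local training cost (cost * q). Illustrative model only."""
    q = np.linspace(0.0, 1.0, grid)
    utility = price * q + value * np.sqrt(q) - cost * q
    return q[np.argmax(utility)]

# A client with high local cost rationally joins less often
# than one with low cost, given the same server price.
q_expensive = best_response(price=1.0, cost=3.0, value=1.0)
q_cheap = best_response(price=1.0, cost=1.5, value=1.0)
print(q_expensive, q_cheap)
```

In this toy model the server could tune `price` per client to steer each one toward the participation level needed for an unbiased global update, which is the flavor of the customized pricing strategy the paper proposes.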
Deep Learning (DL) is the most widely used tool in the contemporary field of computer vision. Its ability to accurately solve complex problems is employed in vision research to learn deep neural models for a variety...
Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize holistic information. Generalis...
Deep Reinforcement Learning (DRL) is emerging as a promising approach to generate adaptive behaviors for robotic platforms. However, a major drawback of using DRL is the data-hungry training regime that requires milli...