Search Results - Inner Mongolia University Library

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites

Science China (Information Sciences), 2024, Vol. 67, No. 12, pp. 5-22

Authors: Zhe CHEN, Weiyun WANG, Hao TIAN, Shenglong YE, Zhangwei GAO, Erfei CUI, Wenwen TONG, Kongzhi HU, Jiapeng LUO, Zheng MA, Ji MA, Jiaqi WANG, Xiaoyi DONG, Hang YAN, Hewei GUO, Conghui HE, Botian SHI, Zhenjiang JIN, Chao XU, Bin WANG, Xingjian WEI, Wei LI, Wenjian ZHANG, Bo ZHANG, Pinlong CAI, Licheng WEN, Xiangchao YAN, Min DOU, Lewei LU, Xizhou ZHU, Tong LU, Dahua LIN, Yu QIAO, Jifeng DAI, Wenhai WANG

Affiliations: State Key Laboratory for Novel Software Technology, Nanjing University; Shanghai AI Laboratory; School of Computer Science, Fudan University; SenseTime Research; Department of Information Engineering, The Chinese University of Hong Kong; Department of Electronic Engineering, Tsinghua University

In this paper, we introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple improvements. (1) Strong vision encoder: we explored a continuous learning strategy for the large-scale vision foundation model InternViT-6B, boosting its visual understanding capabilities and allowing it to be transferred and reused in different LLMs. (2) Dynamic high-resolution: we divide images into 1 to 40 tiles of 448×448 pixels according to the aspect ratio and resolution of the input images, which supports up to 4K resolution input. (3) High-quality bilingual dataset: we carefully collected a high-quality bilingual dataset that covers common scenes and document images, and annotated them with English and Chinese question-answer pairs, significantly enhancing performance in optical character recognition (OCR) and Chinese-related tasks. We evaluate InternVL 1.5 through a series of benchmarks and comparative studies. Compared to both open-source and proprietary commercial models, InternVL 1.5 shows competitive performance, achieving state-of-the-art results in 8 of 18 multimodal benchmarks. Code and models are available at https://***/OpenGVLab/InternVL.

Keywords: multimodal model; open-source; vision encoder; dynamic resolution; bilingual dataset
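
The dynamic high-resolution step in the abstract above (1 to 40 tiles of 448×448 pixels, chosen from the input's aspect ratio and resolution) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `choose_grid` and `tile_boxes` are hypothetical, the selection rule uses only the aspect ratio and ignores resolution, and the tie-break is an assumption.

```python
TILE = 448       # tile side in pixels (from the abstract)
MAX_TILES = 40   # upper bound on the number of tiles (from the abstract)


def choose_grid(width, height, max_tiles=MAX_TILES):
    """Pick (cols, rows) with cols * rows <= max_tiles whose aspect ratio
    is closest to width / height. Ties keep the first grid found, which is
    the one with the fewest columns; this tie-break is an assumption."""
    target = width / height
    best, best_diff = (1, 1), float("inf")
    for cols in range(1, max_tiles + 1):
        for rows in range(1, max_tiles // cols + 1):
            diff = abs(cols / rows - target)
            if diff < best_diff:
                best, best_diff = (cols, rows), diff
    return best


def tile_boxes(cols, rows, tile=TILE):
    """Crop boxes (left, top, right, bottom), row by row, for an image that
    has already been resized to (cols * tile, rows * tile) pixels."""
    return [
        (c * tile, r * tile, (c + 1) * tile, (r + 1) * tile)
        for r in range(rows)
        for c in range(cols)
    ]
```

Under these assumptions, a 3840×2160 input maps to a 7×4 grid, that is, 28 tiles cut from an image resized to 3136×1792 pixels.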

GliomaCNN: An Effective Lightweight CNN Model in Assessment of Classifying Brain Tumor from Magnetic Resonance Images Using Explainable AI

Computer Modeling in Engineering & Sciences, 2024, Vol. 140, No. 9, pp. 2425-2448

Authors: Md. Atiqur Rahman, Mustavi Ibne Masum, Khan Md Hasib, M. F. Mridha, Sultan Alfarhood, Mejdl Safran, Dunren Che

Affiliations: Department of Computer Science and Engineering, Ahsanullah University of Science and Technology, Dhaka 1208, Bangladesh; Department of Computer Science and Software Engineering, The University of Western Australia, Perth, WA 6009, Australia; Department of Computer Science, American International University-Bangladesh, Dhaka 1229, Bangladesh; Department of Computer Science, College of Computer and Information Sciences, King Saud University, P.O. Box 51178, Riyadh 11543, Saudi Arabia; School of Computing, Southern Illinois University, Carbondale 62901, USA

Brain tumors pose a significant threat to human lives and have gained increasing attention as the tenth leading cause of global *** study addresses the pressing issue of brain tumor classification using Magnetic resonance imaging (MRI). It focuses on distinguishing between Low-Grade Gliomas (LGG) and High-Grade Gliomas (HGG). LGGs are benign and typically manageable with surgical resection, while HGGs are malignant and more *** research introduces an innovative custom convolutional neural network (CNN) model, *** stands out as a lightweight CNN model compared to its *** research utilized the BraTS 2020 dataset for its *** with the gradient-boosting algorithm, GliomaCNN has achieved an impressive accuracy of 99.1569%. The model’s interpretability is ensured through SHapley Additive exPlanations (SHAP) and Gradient-weighted Class Activation Mapping (Grad-CAM++). They provide insights into critical decision-making regions for classification *** challenges in identifying tumors in images without visible signs, the model demonstrates remarkable performance in this critical medical application, offering a promising tool for accurate brain tumor diagnosis, which paves the way for enhanced early detection and treatment of brain tumors.

Keywords: Deep learning; magnetic resonance imaging; convolutional neural networks; explainable AI; boosting algorithm; ablation

Deadline-aware Load Allocation for Coded Computation over Heterogeneous Clusters

5th International Conference on Big Data and Artificial Intelligence and Software Engineering, ICBASE 2024

Authors: Cheng, Shiying; Lin, Yuxuan; Tang, Bin

Affiliation: College of Computer Science and Software Engineering, Hohai University, Nanjing, China

ISBN: (print) 9798331506612

In large-scale distributed systems, the performance of computation tasks is often significantly degraded by straggling nodes. Recently, coded computation has emerged as a promising approach to mitigate the effect of stragglers. However, the performance of coded computation is still significantly affected by node workloads, particularly in heterogeneous clusters. Some load allocation schemes have been proposed to minimize the expected computation latency, but for time-sensitive tasks with strict deadlines, they often result in high failure probabilities. In this paper, we examine the fundamental matrix-vector multiplication tasks and focus on the deadline-aware load allocation problem under two typical runtime models, aiming to minimize the task failure probability. We first propose a simple yet effective normal approximation method based on the central limit theorem to approximate the failure probability, transforming the problem into a non-convex multivariate optimization, and then present an efficient iterative load allocation algorithm. Extensive simulations demonstrate the effectiveness of our proposed scheme. © 2024 IEEE.

Keywords: Convex optimization
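
To make the normal-approximation idea in the abstract above concrete, here is a minimal sketch under assumptions the abstract does not state. It assumes a shifted-exponential runtime model (a worker with load l finishes by deadline t with probability 1 - exp(-rate * (t / l - shift)) when t / l exceeds the shift), all-or-nothing returns per worker, and a task that fails when fewer than `rows_needed` coded rows arrive in time. The names `finish_prob` and `failure_prob` and all parameter values are hypothetical, and with only a few workers the normal approximation is crude.

```python
from math import erf, exp, sqrt


def finish_prob(load, deadline, shift, rate):
    """P(worker finishes `load` rows by `deadline`) under an assumed
    shifted-exponential runtime model."""
    slack = deadline / load - shift
    return 0.0 if slack <= 0 else 1.0 - exp(-rate * slack)


def failure_prob(loads, deadline, shifts, rates, rows_needed):
    """Normal approximation of P(rows returned by the deadline < rows_needed).
    Each worker contributes load * Bernoulli(p), so the total has mean
    sum(l * p) and variance sum(l^2 * p * (1 - p))."""
    mean = var = 0.0
    for load, shift, rate in zip(loads, shifts, rates):
        if load <= 0:
            continue  # an idle worker returns nothing
        p = finish_prob(load, deadline, shift, rate)
        mean += load * p
        var += load * load * p * (1.0 - p)
    if var == 0.0:
        return 0.0 if mean >= rows_needed else 1.0
    z = (rows_needed - mean) / sqrt(var)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF at z
```

A load allocation scheme in this spirit would search over `loads` to minimize `failure_prob` for a fixed deadline; the search itself is not sketched here.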

Software approaches for resilience of high performance computing systems: a survey

Frontiers of Computer Science, 2023, Vol. 17, No. 4, pp. 43-56

Authors: Jie JIA, Yi LIU, Guozhen ZHANG, Yulin GAO, Depei QIAN

Affiliations: School of Computer Science and Engineering, Beihang University, Beijing 100191, China; Sino-German Joint Software Institute, Beihang University, Beijing 100191, China

With the scaling up of high-performance computing systems in recent years, their reliability has been descending ***, system resilience has been regarded as one of the critical challenges for large-scale HPC *** techniques and systems have been proposed to ensure the correct execution and completion of parallel *** paper provides a comprehensive survey of existing software resilience ***, a classification of software resilience approaches is presented; then we introduce major approaches and techniques, including checkpointing, replication, soft error resilience, algorithm-based fault tolerance, fault detection and *** addition, challenges exposed by system-scale and heterogeneous architecture are also discussed.

Keywords: resilience; high-performance computing; fault tolerance; challenge

Reducing Age of Collection with Dynamic-Frame Time Division Multiple Access

100th IEEE Vehicular Technology Conference, VTC 2024-Fall

Authors: Lai, Yurong; Han, Xinhui; Wang, Xueer; Pan, Haoyuan

Affiliation: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China

ISBN: (print) 9798331517786

This paper introduces a dynamic-frame time division multiple access (DF-TDMA) scheme aimed at decreasing the age of collection (AoC) in collaborative monitoring scenarios. Unlike the conventional age of information (AoI) metric, AoC decreases only when partial information from multiple sources is aggregated to form a complete observation. Previous studies on AoC predominantly assumed that complete observations are derived from aggregating information from all sources. However, this assumption does not hold in scenarios where sources are correlated, and a complete observation can be achieved with partial information from only a subset of sources. To address this, new channel access protocols are necessary to achieve a low network-wide average peak AoC. DF-TDMA is proposed as a solution, featuring dynamically adjustable TDMA frame sizes to minimize time wastage and thereby reduce AoC. While the dynamic frame size adjustment complicates AoC analysis, we theoretically derive the average peak AoC of DF-TDMA. Simulations show that DF-TDMA maintains a stable average peak AoC across varying numbers of sources N, and notably reduces the average peak AoC compared to a fixed-frame TDMA scheme, particularly under conditions with a large N. © 2024 IEEE.

Keywords: Time division multiple access

A Semi-Supervised image classification algorithm inspired by the Primacy Effect

4th International Conference on Digital Signal and Computer Communications, DSCC 2024

Authors: Zhao, Dianqing; Li, Chaofan; Zhu, Anmin

Affiliation: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China

ISBN: (print) 9781510681538

Semi-supervised learning (SSL) provides a solution to leverage vast amounts of unlabeled data. In cognitive psychology, the Primacy effect refers to the phenomenon where the initial information encountered tends to leave a deeper impression in human cognitive processes, serving as the basis for subsequent judgments. Inspired by the Primacy effect, this paper proposes a novel semi-supervised image classification algorithm. The core idea of this algorithm is to mimic the beneficial effects of the Primacy effect on human cognitive processes and to simulate a similar phenomenon in artificial neural networks. In this paper, the initially labeled data is referred to as exemplar. The algorithm includes an exemplar prediction module, whose main function is to accurately identify examples, ensuring that the model forms a "deep impression" of them. We found that, due to the scarcity of examples, the model easily overfits. Therefore, we propose the Weighted-Gradient-Chain technique. Additionally, a pseudo-labeling technique was employed, but during model training we found that erroneous pseudo-labels could introduce errors. To enhance the quality of pseudo-label generation, this paper proposes a Pre-Pseudo-labeling method. A series of experiments were conducted on multiple datasets. The results indicate that the proposed model performed well. © 2024 SPIE.

Keywords: Image classification

Detecting and Mitigating the Weakest Cybersecurity Link in an Information System

23rd International Conference on New Trends in Intelligent Software Methodologies, Tools and Techniques, SoMeT 2024

Authors: Sadeghian, Ali; Mejri, Mohamed

Affiliation: Department of Computer Science and Software Engineering, Laval University, Quebec, Canada

ISBN: (print) 9781643685380

In today's digital landscape, the prevention of cyber attacks has become exceptionally crucial. This is especially true for safety-critical systems, where safeguarding against these threats is of paramount importance. To address this concern, the MITRE Corporation has developed ATT&CK, an extensive framework comprising data matrices. This framework serves the purpose of assessing a company's security preparedness and pinpointing vulnerabilities that may exist within its infrastructure. By leveraging the capabilities of MITRE ATT&CK, including its tactics and techniques, in conjunction with the LDA4CPS tool, we have devised a novel approach to identify the most critical vulnerabilities in a susceptible system. Furthermore, incorporating MITRE ATT&CK mitigations tailored to the discovered vulnerability empowers the blue team (defensive side) with tangible, practical measures to fortify their security posture. This approach enhances their capability to effectively counter cyber threats, bolstering their overall defensive capabilities. © 2024 IOS Press. All rights reserved.

Keywords: Cyber attacks

Surtify: A Smart Surface Identification System Based on Multi-dimensional Acoustic Dispersion

2024 International Conference on Artificial Intelligence of Things and Systems, AIoTSys 2024

Authors: Wang, Yunshu; Yuan, Baojie; Dong, Feihang; Hong, Shicong; Zou, Yongpan

Affiliation: College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China

ISBN: (print) 9798331507886

With the continuous advancement of the smart home market, household items have become more intelligent. By identifying the different material attributes of various household tabletops, we can obtain contextual information and engage in further interactions. This paper explores a phenomenon, acoustic dispersion, and applies it to surface identification. Integrating mobile deep learning, we propose a method to recognize solid surface materials through tapping and utilize tapping events as triggers for interaction by identifying surface material categories. To achieve the goal of zero training effort and tiny hardware cost for unseen users, we analyze the propagation patterns of different frequency components in tapping sound signals from both temporal and spatial dimensions. Based on this, we propose a time-frequency data augmentation strategy that combines multiple event points to improve the recognition accuracy. The results show that, using leave-one-user-out cross validation for 24 volunteers trained with simple guidance on tapping intensity and position, our system achieves a recognition accuracy of 98.6% for 11 common material types. For 11 newly recruited volunteers without specific control over force and position, the accuracy remains at 93.4%. We also conduct a series of cross-environment experiments and a user study, all of which indicate the high potential of our system. © 2024 IEEE.

Keywords: Smart homes

Systematic analysis of on-premise and cloud services

International Journal of Cloud Computing, 2024, Vol. 13, No. 3, pp. 214-242

Authors: Ali, Asif; Laghari, Asif Ali; Kandhro, Irfan Ahmed; Kumar, Kamlesh; Younus, Salman

Affiliations: Department of Computer Science, Sindh Madressatul Islam University, Karachi, Pakistan; Department of Software Engineering, Sindh Madressatul Islam University, Karachi, Pakistan

There are two key distinctions between cloud and on-premise (OP) software: the cost of each and the level of control. As organisations look to reduce costs, much data and many rules are migrating to multiple clouds such as GCP, AWS, and Azure. Cloud service providers offer the extensibility, resilience, and quickness that traditional OP deployments usually lack. This paper presents a comparative analysis of how on-premise and cloud computing differ in outline, arrangement, administration, and devices for associations and clients. The comparison shows that cloud computing provides more flexible infrastructure and better data processing service than on-premise. In today’s enterprise IT world, there are many factors that a business must consider when deciding whether a cloud infrastructure is the right choice. Many mid-level organisations known to the author fail to adopt cloud solutions and instead rely on their proven legacy on-premise applications and software to do business. It is therefore very challenging for a business to decide whether a cloud infrastructure is the right choice. This article covers many areas and analyses of on-premise and cloud, which will help mid-level organisations decide whether to choose cloud solutions for their running business applications. Copyright © 2024 Inderscience Enterprises Ltd.

Keywords: Cloud computing

A Ginkgo Detection Algorithm in Complex Environments Based on Improved YOLOv7

9th International Conference on Intelligent Computing and Signal Processing, ICSP 2024

Authors: Ma, Chi; Guo, Qiang

Affiliations: School of Computer Science and Engineering, Huizhou University, Huizhou, China; School of Computer Science and Software Engineering, University of Science and Technology LiaoNing, Anshan, China

ISBN: (print) 9798350376548

Detections of Ginkgoes are prerequisites for later counting and harvesting. Due to the uneven distribution of samples, the detection speed and accuracy of existing algorithms cannot adapt to the impact of complex environments. Therefore, an enhanced model YOLOv7-DC based on YOLOv7 is proposed in this paper, which redesigns the detection network and introduces a new feature fusion method. DCNv2 is embedded in the efficient layer aggregation network (ELAN), while PConv is utilized instead of conventional convolution to reduce the parameter impact of DCNv2. Moreover, the attention mechanism CBAM is introduced during training to enhance spatial and channel information, and the ConvMixer architecture is employed to capture spatial and channel relationships within the features, which are transmitted to the detection head through attention mechanism, improving the model's detection accuracy for each specific classification sample. Experimental results show that our YOLOv7-DC achieves both excellent detection speed and recognition rate in various classification tasks. The improved model's average detection accuracy is increased by 6.2% compared to previous algorithms, and the model parameters are reduced by 13%. It is proved that YOLOv7-DC is more suitable for scenarios with imbalanced samples and complex environments. © 2024 IEEE.

Keywords: Classification (of information)
