Search Results - Inner Mongolia University Library

Bridging the language gap: an empirical study of bindings for open source machine learning libraries across software package ecosystems

Empirical Software Engineering, 2025, Vol. 30, No. 1, pp. 1-31

Authors: Li, Hao; Bezemer, Cor-Paul. Affiliation: Univ Alberta, Analyt Software Games & Repository Data ASGAARD La, Edmonton, AB, Canada

Open source machine learning (ML) libraries enable developers to integrate advanced ML functionality into their own applications. However, popular ML libraries, such as TensorFlow, are not available natively in all programming languages and software package ecosystems. Hence, developers who wish to use an ML library which is not available in their programming language or ecosystem of choice, may need to resort to using a so-called binding library (or binding). Bindings provide support across programming languages and package ecosystems for reusing a host library. For example, the Keras .NET binding provides support for the Keras library in the NuGet (.NET) ecosystem even though the Keras library was written in Python. In this paper, we collect 2,436 cross-ecosystem bindings for 546 ML libraries across 13 software package ecosystems by using an approach called BindFind, which can automatically identify bindings and link them to their host libraries. Furthermore, we conduct an in-depth study of 133 cross-ecosystem bindings and their development for 40 popular open source ML libraries. Our findings reveal that the majority of ML library bindings are maintained by the community, with npm being the most popular ecosystem for these bindings. Our study also indicates that most bindings cover only a limited range of the host library's releases, often experience considerable delays in supporting new releases, and have widespread technical lag. Our findings highlight key factors to consider for developers integrating bindings for ML libraries and open avenues for researchers to further investigate bindings in software package ecosystems.

Keywords: software engineering for machine learning; machine learning for software engineering; software package ecosystems; cross-ecosystem library usage

Studying the Impact of TensorFlow and PyTorch Bindings on Machine Learning Software Quality

ACM Transactions on Software Engineering and Methodology, 2025, Vol. 34, No. 1, pp. 1-31

Authors: Li, Hao; Rajbahadur, Gopi Krishnan; Bezemer, Cor-Paul. Affiliations: Univ Alberta, Analyt Software Games & Repository Data ASGAARD La, Edmonton, AB, Canada; Huawei Canada, Ctr Software Excellence, Kingston, ON, Canada

Bindings for machine learning frameworks (such as TensorFlow and PyTorch) allow developers to integrate a framework's functionality using a programming language different from the framework's default language (usually Python). In this article, we study the impact of using TensorFlow and PyTorch bindings in C#, Rust, Python and JavaScript on the software quality in terms of correctness (training and test accuracy) and time cost (training and inference time) when training and performing inference on five widely used deep learning models. Our experiments show that a model can be trained in one binding and used for inference in another binding for the same framework without losing accuracy. Our study is the first to show that using a non-default binding can help improve machine learning software quality from the time cost perspective compared to the default Python binding while still achieving the same level of correctness.

Keywords: software engineering for machine learning; software quality; deep learning; binding; TensorFlow; PyTorch

What kinds of contracts do ML APIs need?

Empirical Software Engineering, 2023, Vol. 28, No. 6, pp. 1-37

Authors: Khairunnesa, Samantha Syeda; Ahmed, Shibbir; Imtiaz, Sayem Mohammad; Rajan, Hridesh; Leavens, Gary T. Affiliations: Bradley Univ, Dept Comp Sci & Informat Syst, Peoria, IL 61625, USA; Iowa State Univ, Dept Comp Sci, Ames, IA, USA; Univ Cent Florida, Dept Comp Sci, Orlando, FL, USA

Recent work has shown that machine learning (ML) programs are error-prone and called for contracts for ML code. Contracts, as in the design by contract methodology, help document APIs and aid API users in writing correct code. The question is: what kinds of contracts would provide the most help to API users? We are especially interested in what kinds of contracts help API users catch errors at earlier stages in the ML pipeline. We describe an empirical study of posts on Stack Overflow of the four most often-discussed ML libraries: TensorFlow, Scikit-learn, Keras, and PyTorch. For these libraries, our study extracted 413 informal (English) API specifications. We used these specifications to understand the following questions. What are the root causes and effects behind ML contract violations? Are there common patterns of ML contract violations? When does understanding ML contracts require an advanced level of ML software expertise? Could checking contracts at the API level help detect the violations in early ML pipeline stages? Our key findings are that the most commonly needed contracts for ML APIs are either checking constraints on single arguments of an API or on the order of API calls. The software engineering community could employ existing contract mining approaches to mine these contracts to promote an increased understanding of ML APIs. We also noted a need to combine behavioral and temporal contract mining approaches. We report on categories of required ML contracts, which may help designers of contract languages.

Keywords: machine learning; API contracts; empirical software engineering; software engineering for machine learning

Enhancing Collaboration and Agility in Data-Centric AI Projects

18th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE)

Authors: Stieler, Fabian; Bauer, Bernhard. Affiliation: Univ Augsburg, Software Methodol Distributed Syst, Augsburg, Germany

ISBN (digital): 9783031641824

ISBN (print): 9783031641817; 9783031641824

Usually, mature Artificial Intelligence (AI) projects are developed by a team of various members, such as data engineers, data scientists, software engineers and machine learning (ML) engineers. They often pursue highly heterogeneous approaches, leading to new challenges in collaboration, particularly regarding software quality, data versioning and the traceability of model metrics and other resulting artifacts. These challenges are further intensified when AI projects rely on dynamic datasets, introducing an entirely new dimension that teams must deal with. Adopting principles from the machine learning operations (MLOps) paradigm becomes essential in this context. To go beyond existing process models and develop actionable guidelines, our work introduces a Git workflow for AI projects. We present basic instructions for data and code while outlining a minimal infrastructure setup. Building upon abstract concepts, we delve into concrete, actionable steps by examining the proposed branching workflow. Through a case study, we apply the development methodology to two use cases and demonstrate that the principles and approaches positively impact project outcomes.

Keywords: software engineering for machine learning; Agile development; MLOps; AI development; Data-centric AI

A Meta-Summary of Challenges in Building Products with ML Components - Collecting Experiences from 4758+ Practitioners

IEEE/ACM 2nd International Conference on AI Engineering - Software Engineering for AI (CAIN)

Authors: Nahar, Nadia; Zhang, Haoran; Lewis, Grace; Zhou, Shurui; Kastner, Christian. Affiliations: Carnegie Mellon Univ, Pittsburgh, PA 15213, USA; Carnegie Mellon Software Engn Inst, Pittsburgh, PA 15213, USA; Univ Toronto, Toronto, ON, Canada

ISBN (print): 9798350301137

Incorporating machine learning (ML) components into software products raises new software-engineering challenges and exacerbates existing ones. Many researchers have invested significant effort in understanding the challenges of industry practitioners working on building products with ML components, through interviews and surveys with practitioners. With the intention to aggregate and present their collective findings, we conduct a meta-summary study: We collect 50 relevant papers that together interacted with over 4758 practitioners using guidelines for systematic literature reviews. We then collected, grouped, and organized the over 500 mentions of challenges within those papers. We highlight the most commonly reported challenges and hope this meta-summary will be a useful resource for the research community to prioritize research and education in this field.

Keywords: Meta Summary; ML in Production; SE4ML; SLR; software engineering for machine learning

GitWorkflow for Active Learning: A Development Methodology Proposal for Data-Centric AI Projects

18th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE)

Authors: Stieler, Fabian; Bauer, Bernhard. Affiliation: Univ Augsburg, Inst Comp Sci, Augsburg, Germany

ISBN (print): 9789897586477

As soon as Artificial Intelligence (AI) projects grow from small feasibility studies to mature projects, developers and data scientists face new challenges, such as collaboration with other developers, versioning data, or traceability of model metrics and other resulting artifacts. This paper suggests a data-centric AI project with an Active learning (AL) loop from a developer perspective and presents "Git Workflow for AL": A methodology proposal to guide teams on how to structure a project and solve implementation challenges. We introduce principles for data, code, as well as automation, and present a new branching workflow. The evaluation shows that the proposed method is an enabler for fulfilling established best practices.

Keywords: Active learning; software engineering for machine learning; machine learning operations

Challenges in Machine Learning Application Development: An Industrial Experience Report

1st IEEE/ACM International Workshop on Software Engineering for Responsible Artificial Intelligence (SE4RAI)

Authors: Rahman, Md Saidur; Khomh, Foutse; Rivera, Emilio; Gueheneuc, Yann-Gael; Lehnert, Bernd. Affiliations: Polytech Montreal, Montreal, PQ, Canada; Concordia Univ, Montreal, PQ, Canada; SAP Montreal, Montreal, PQ, Canada

ISBN (print): 9781450393195

SAP is the market leader in enterprise application software offering an end-to-end suite of applications and services to enable their customers worldwide to operate their business. Especially, retail customers of SAP deal with millions of sales transactions for their day-to-day business. Transactions are created during retail sales at the point of sale (POS) terminals and those transactions are then sent to some central servers for validations and other business operations. A considerable proportion of the retail transactions may have inconsistencies or anomalies due to many technical and human errors. SAP provides an automated process for error detection but still requires a manual process by dedicated employees using workbench software for correction. However, manual corrections of these errors are time-consuming, labor-intensive, and might be prone to further errors due to incorrect modifications. Thus, automated detection and correction of transaction errors are very important regarding their potential business values and the improvement in the business workflow. In this paper, we report on our experience from a project where we develop an AI-based system to automatically detect transaction errors and propose corrections. We identify and discuss the challenges that we faced during this collaborative research and development project, from two distinct perspectives: software engineering and machine learning. We report on our experience and insights from the project with guidelines for the identified challenges. We collect developers' feedback for qualitative analysis of our findings. We believe that our findings and recommendations can help other researchers and practitioners embarking into similar endeavours.

Keywords: software engineering for machine learning; Error Detection and Correction; Challenges and Best Practices

Preliminary Literature Review of Machine Learning System Development Practices

45th Annual International IEEE-Computer-Society Computers, Software, and Applications Conference (COMPSAC)

Authors: Watanabe, Yasuhiro; Washizaki, Hironori; Sakamoto, Kazunori; Saito, Daisuke; Honda, Kiyoshi; Tsuda, Naohiko; Fukazawa, Yoshiaki; Yoshioka, Nobukazu. Affiliations: Waseda Univ, Tokyo, Japan; Osaka Inst Technol, Osaka, Japan

ISBN (print): 9781665424639

To guide practitioners and researchers to design and research machine learning (ML) system development processes, we conduct a preliminary literature review on ML system development practices. We identified seven papers and two other papers determined in an ad-hoc review. Our findings include emphasized phases in ML system developments, frequently described ML-specific practices, and tailored traditional practices.

Keywords: software engineering for machine learning
