Although machine learning (ML) has brought new insights into geochemistry research, its implementation is laborious and time-consuming. Here, we announce Geochemistry pi, an open-source automated ML Python framework. Geochemists only need to provide tabulated data and select the desired options to clean the data and run ML algorithms. The process operates in a question-and-answer format and thus does not require users to have coding experience. After either automatic or manual parameter tuning, the framework provides users with performance and prediction results for the trained ML model. Built on the scikit-learn library, Geochemistry pi establishes a customized automated process for implementing classification, regression, dimensionality reduction, and clustering algorithms. The framework achieves extensibility and portability through a hierarchical pipeline architecture that separates data transmission from algorithm application. The AutoML module is built on the Cost-Frugal Optimization and Blended Search Strategy hyperparameter search methods from FLAML (A Fast and Lightweight AutoML Library), and model parameter optimization is accelerated by the Ray distributed computing framework. The MLflow library is integrated for ML lifecycle management, allowing users to compare multiple trained models at different scales and to manage the generated data and diagrams. In addition, the front-end and back-end frameworks are separated to build the web portal, which presents the ML model and data science workflow through a user-friendly web interface. In summary, Geochemistry pi provides a Python framework for users and developers to accelerate their data mining, with both online and offline operation options. Geochemistry pi is a helpful tool for scientists who work with geochemical data. One of its standout features is its simplicity: scientists can use the tool to perform machine learning (ML) on the tabulated data they provide.
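As a rough illustration of the kind of scikit-learn/FLAML workflow the framework automates, the sketch below tunes a regressor on tabular data under a time budget. The CSV file and column names are placeholders; this is not Geochemistry pi's actual interface.

```python
# Minimal sketch of the AutoML step Geochemistry pi automates (not its real API).
# Assumes a tabular dataset with numeric feature columns and a "target" column.
import pandas as pd
from flaml import AutoML  # A Fast and Lightweight AutoML Library (FLAML)
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("geochem_samples.csv")  # placeholder file name
X, y = df.drop(columns=["target"]), df["target"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

automl = AutoML()
# FLAML searches hyperparameters with CFO/BlendSearch within the time budget.
automl.fit(X_train, y_train, task="regression", time_budget=60, metric="r2")

print("best estimator:", automl.best_estimator)
print("best config:", automl.best_config)
print("test R2:", r2_score(y_test, automl.predict(X_test)))
```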
The analysis of time series data, which represents dynamic phenomena through sequences of observations, is greatly influenced by Big Data. Both the sheer volume and the advanced capabilities of Big Data significantly affect how these analyses are conducted, enabling more comprehensive and detailed insights. Recent studies have promoted the use of data summarization techniques, for instance through incremental clustering, to address the challenges of Big Data volume. These techniques quickly capture data evolution, thereby helping domain experts make informed and proactive decisions by leveraging a concise representation of the time series. However, although incremental clustering efficiently reduces data volume and retains key statistical information, it is important to evaluate the accuracy of the summarized version against the original time series data. This assessment is critical when the summarized data is used as the basis for complex analytical pipelines, such as those for pattern recognition and anomaly detection. Moved by these premises, and starting from empirical experience in defining a metric to assess the adherence of summarized time series to the original data stream, in this paper: (i) we propose a variant of a renowned quality metric for incremental clustering, based on an abstract model of clustering data structures, to assess the extent to which the time series summary accurately captures the dynamics of the original data; (ii) we present PICTURE (Python-based Incremental Clustering for Time series Representation and Evaluation), a framework featuring four widely used incremental clustering algorithms from the literature, equipped with modules for the execution, representation, and evaluation of clustering results applied to time series according to the abstract model; and (iii) we conduct an extensive qualitative and quantitative analysis of incremental clustering results on one synthetic and two real-world datasets using the PICTURE framework.
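PICTURE's own algorithms are not reproduced here; as a stand-in illustration of summarizing a stream by incremental clustering, the sketch below clusters fixed-length windows of a synthetic series with scikit-learn's MiniBatchKMeans and keeps only centroids and occurrence counts as the summary. The windowing scheme and cluster count are illustrative assumptions.

```python
# Illustrative stand-in for incremental time-series summarization (not PICTURE):
# cluster fixed-length windows of a stream incrementally, retain centroids + counts.
import numpy as np
from sklearn.cluster import MiniBatchKMeans

rng = np.random.default_rng(0)
stream = np.sin(np.linspace(0, 60, 6000)) + 0.1 * rng.standard_normal(6000)

window = 20  # assumed window length
segments = stream[: len(stream) // window * window].reshape(-1, window)

model = MiniBatchKMeans(n_clusters=4, random_state=0)
counts = np.zeros(4, dtype=int)
for batch in np.array_split(segments, 10):  # data arrives chunk by chunk
    model.partial_fit(batch)                # incremental update, no full re-fit
    counts += np.bincount(model.predict(batch), minlength=4)

# The summary: 4 centroid shapes plus how often each pattern occurred.
print(model.cluster_centers_.shape, counts)
```

Evaluating such a summary against the original stream is exactly where a quality metric like the one the paper proposes comes in.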
Web scraping is the process of extracting data from a website in an efficient and fast way. In such a scenario, Python programming offers a useful set of methods that help web editors improve the quality of the provided service. The scraper works in three steps: 1) understand the structure of the web page, 2) design a regular-expression pattern, and 3) use that pattern to extract the target data. In this paper, we also used the Flask, Requests, and jsonify libraries to fetch the data; after processing, the data is transformed into JSON form and made ready for CSV export through an API. After generating all the required regex patterns, the system uses them as a set of rules; with these, the designed scraper tool works efficiently and achieved strong results, with supporting libraries handling the storage and extraction of news and other web-based information. The proposed web scraping tool eliminates the time and effort of manually collecting or copying data by automating the process. We find that the designed scraper is an easy and direct approach to extracting data from newspapers, websites, blogs, and images.
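A minimal sketch of the three steps described above follows; the URL and the headline regex are illustrative only, not the paper's actual patterns.

```python
# Sketch of the described pipeline: fetch a page, apply a regex, serve as JSON.
import re
import requests
from flask import Flask, jsonify

app = Flask(__name__)
# Step 2: a regular-expression pattern designed after inspecting page structure.
HEADLINE_RE = re.compile(r"<h2[^>]*>(.*?)</h2>", re.S)

@app.route("/headlines")
def headlines():
    # Steps 1 and 3: fetch the page and apply the pattern to pull out the data.
    html = requests.get("https://example.com/news", timeout=10).text
    items = [h.strip() for h in HEADLINE_RE.findall(html)]
    return jsonify(headlines=items)  # JSON form, ready for export to CSV

if __name__ == "__main__":
    app.run()
```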
Artificial neural networks, deep learning, and machine learning are versatile data-driven tools widely applied in disciplines such as finance, image and voice recognition, and earth science. For scientists and enthusiasts (including those not very experienced with programming), there is a need for easy-to-use and fast-to-set-up tools that let users prototype quickly and focus on the research, rather than spending time on preparing data, extracting features, and setting up multiple experiments for training and validating models. In this paper, we introduce Kit4DL, a Python package that speeds up machine- and deep-learning experimentation by using just a single TOML configuration file in which a user sets up all aspects of training and validation. Though simple to use in its default mode, the package offers extensive customisation for more experienced users. Kit4DL streamlines deep-learning development by simplifying the creation of the entire training, validation, and testing loop: users only need to implement a few core methods referenced in the configuration file, significantly reducing development time compared to traditional approaches that require users to implement all procedures themselves. Additionally, Kit4DL facilitates code reusability by allowing researchers to leverage the same codebase across multiple experiments, reducing redundancy and streamlining the experimentation process.
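To illustrate the single-configuration-file idea, the sketch below parses a small TOML document with Python's standard tomllib (Python 3.11+). The section and key names here are hypothetical and do NOT reflect Kit4DL's actual schema, which is documented in its repository.

```python
# Hypothetical illustration of driving an experiment from one TOML file;
# the keys below are NOT Kit4DL's actual configuration schema.
import tomllib

CONFIG = """
[model]
name = "simple_cnn"
num_classes = 10

[training]
epochs = 20
lr = 1e-3
batch_size = 64
"""

cfg = tomllib.loads(CONFIG)
print(cfg["model"]["name"], cfg["training"]["epochs"])
# A Kit4DL-style runner would consume such values to assemble the full
# training/validation/testing loop without further user code.
```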
Authors: Lin, Quanyi; Lu, Shilei; Yue, Lu; Guo, Tong
Affiliations: Tianjin Univ, Sch Environm Sci & Engn, Tianjin 300072, Peoples R China; Tianjin Univ, Tianjin Key Lab Built Environm & Energy Applicat, Tianjin, Peoples R China; TU Berlin, Hermann Rietschel Inst, Str 17 Juni 135, D-10623 Berlin, Germany; Tianjin Univ, Sch Environm Sci & Engn, 92 Weijin Rd, Tianjin 300072, Peoples R China
Decarbonization of district energy systems is essential for China to meet its carbon-neutrality goal by 2060. Most existing district energy systems lack historical load data and have incomplete information, resulting in a lack of data support for the low-carbon transition. Moreover, demand-side load flexibility has not been fully exploited at the planning stage. In this paper, we developed a two-stage computational approach to optimize district loads. We first established an integrated Python framework, incorporating the TEASER simulation tool and the AixLib model library, to efficiently calculate baseline loads through bottom-up modeling and simulation of district buildings. Then, a price-based integrated demand response strategy was introduced: a mixed-integer nonlinear programming model was formulated to optimize the energy pricing strategy with the objective of minimizing load fluctuations. Finally, a case study was employed to illustrate the feasibility of the calculation method, showing a normalized mean bias error of 7.17%. The results further demonstrated that the strategy could reduce peak electric and heat loads by 3.55% and 9.57%, and increase load rates by 3.85% and 9.48%, respectively. The strategy could assist district energy service providers in optimizing equipment capacity configuration and enhancing the low-carbon planning potential of energy systems from the demand side.
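The paper's model is a mixed-integer nonlinear program over pricing; as a much-simplified continuous sketch of the load-leveling objective alone, the code below shifts a bounded fraction of a baseline load, energy-neutrally, to minimize its variance. All numbers (the hourly profile, the 10% flexibility) are illustrative assumptions, not the paper's data.

```python
# Simplified, continuous stand-in for the paper's MINLP: shift a bounded
# fraction of hourly load (keeping total energy fixed) to flatten the profile.
import numpy as np
from scipy.optimize import minimize

base = np.array([40, 38, 37, 36, 38, 45, 60, 75, 80, 78, 74, 70,
                 68, 66, 65, 67, 72, 85, 90, 82, 70, 58, 50, 44.0])  # illustrative MW
flex = 0.10 * base  # assume 10% of each hour's load is shiftable

objective = lambda s: np.var(base + s)              # minimize load fluctuation
cons = ({"type": "eq", "fun": lambda s: s.sum()},)  # total energy unchanged
bounds = [(-f, f) for f in flex]

res = minimize(objective, np.zeros_like(base), bounds=bounds, constraints=cons)
leveled = base + res.x
print(f"peak load: {base.max():.1f} -> {leveled.max():.1f} MW")
print(f"load rate: {base.mean() / base.max():.3f} -> "
      f"{leveled.mean() / leveled.max():.3f}")
```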
ISBN (print): 9781665436564
The face is one of the simplest ways to distinguish one person from another. Face recognition is a personal identification system that uses a person's facial features to recognize the individual's identity. Human facial identification is essentially a two-phase procedure: the first phase is face detection, a process that humans carry out very rapidly; the second is recognition, which classifies the detected face as a particular person when viewed at close range. Face recognition has become one of the most researched biometric strategies and has been extended by experts to facial expression recognition. In this study, we implemented face detection and face recognition with the MTCNN image-processing technique while utilizing the VGG face model dataset. The project is implemented in a Python framework.
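The detection half of such a pipeline can be sketched with the mtcnn Python package, as below. The image path is a placeholder, and the VGGFace recognition step is only indicated in a comment since the study's model setup is not detailed here.

```python
# Face detection with the mtcnn package; recognition (VGGFace embedding
# comparison) is only sketched in comments. "person.jpg" is a placeholder.
import cv2
from mtcnn import MTCNN

img = cv2.cvtColor(cv2.imread("person.jpg"), cv2.COLOR_BGR2RGB)
detector = MTCNN()

for face in detector.detect_faces(img):  # returns boxes, landmarks, confidence
    x, y, w, h = face["box"]
    crop = img[y:y + h, x:x + w]
    # Recognition step: resize `crop` (e.g. to 224x224), feed it through a
    # pretrained VGGFace model, and compare embeddings against known identities.
    print(face["confidence"], face["box"])
```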
Deploying Deep Neural Networks (DNNs) for IoT Edge applications requires strong skills in both hardware and software. In this paper, a novel, fully automated design framework for Edge applications is proposed to perform such deployments on Systems-on-Chip. Based on a high-level Python interface that mimics the leading deep learning software frameworks, it offers an easy way to implement a hardware-accelerated DNN on an FPGA. To do this, our design methodology covers three main phases: (a) customization, where the user specifies the optimizations needed for each DNN layer; (b) generation, where the framework generates in the Cloud the necessary binaries for both the FPGA and software parts; and (c) deployment, where the SoC on the Edge receives the resulting files, which serve to program the FPGA, together with the related Python libraries for user applications. Among the case studies, an optimized DNN for the MNIST database runs more than 60x faster than a software version on the ZYNQ 7020 SoC while consuming less than 0.43 W. A comparison with state-of-the-art frameworks demonstrates that our methodology offers the best trade-off between throughput, power consumption, and system cost.
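The abstract does not name the framework's API, so the sketch below is purely hypothetical: invented class and method names that only mirror the three phases (customization, generation, deployment) described above.

```python
# Purely hypothetical sketch of the three-phase flow described in the paper;
# NONE of these class or method names come from the actual framework.
class EdgeDNN:
    def __init__(self):
        self.layers = []

    def add(self, layer, bitwidth=8, parallelism=1):
        # (a) customization: per-layer optimizations chosen by the user.
        self.layers.append((layer, bitwidth, parallelism))

    def generate(self, target="zynq7020"):
        # (b) generation: the Cloud service would emit the FPGA bitstream
        # and matching Python runtime libraries at this step.
        return f"bitstream for {target} ({len(self.layers)} layers)"

    def deploy(self, bitstream):
        # (c) deployment: program the SoC's FPGA and load the runtime.
        print("deployed:", bitstream)


net = EdgeDNN()
net.add("conv3x3-32", bitwidth=4, parallelism=8)
net.add("dense-10", bitwidth=8)
net.deploy(net.generate())
```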
ISBN (print): 9781538630662
This study tests a number of open-source forensic carving tools to determine their viability when run across split raw forensic images (dd) and Expert Witness Compression Format (EWF) images. This is done by carving files from a raw dd file to establish a baseline before running each tool over the different image types and analysing the results. A framework is then written in Python to allow Scalpel to be run across any split dd image, while simultaneously concatenating the carved files and sorting them by file type. The study tests the framework on a number of scenarios and concludes that this is an effective method of carving files with Scalpel over split dd images.
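The sketch below illustrates the general approach (not the authors' code): invoke Scalpel's real CLI on each segment of a split dd image, then merge the carved output and sort it by file extension. The directory layout and config path are assumptions.

```python
# Sketch: run Scalpel per split-dd segment, then merge carved files by type.
# Paths are illustrative; "scalpel -c conf -o outdir image" is Scalpel's real CLI.
import shutil
import subprocess
from pathlib import Path

segments = sorted(Path("evidence").glob("image.dd.*"))  # image.dd.001, .002, ...
merged = Path("carved_sorted")
merged.mkdir(exist_ok=True)

for i, seg in enumerate(segments):
    outdir = Path(f"scalpel_out_{i}")
    subprocess.run(["scalpel", "-c", "scalpel.conf", "-o", str(outdir), str(seg)],
                   check=True)
    for carved in outdir.rglob("*.*"):
        if carved.name == "audit.txt":  # skip Scalpel's log file
            continue
        bytype = merged / (carved.suffix.lstrip(".") or "unknown")  # e.g. jpg/
        bytype.mkdir(exist_ok=True)
        shutil.copy2(carved, bytype / f"{i:03d}_{carved.name}")
```

Note that carving each segment independently, as this naive sketch does, would miss files spanning segment boundaries; handling that case is precisely what the paper's framework addresses.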
Nowadays everything around the globe, such as information, places, and events, is connected via networks, creating a tangle of connections. Social network analysis aims to make sense of these complex connections. This work presents a framework for analyzing Twitter social media tweets using NetworkX and the Twitter API. The Python tool IPython/Jupyter is used to examine the networks by applying visual-analytic techniques, such as degree centrality and betweenness centrality, to a dataset of Twitter hashtags, which provides an easier way to analyze the network connections. The framework describes a methodology to diagnose each tweet and identify patterns such as "who talks to whom about what" and the "most influential person" in the interconnected network.
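A minimal sketch of the NetworkX side of this analysis follows; the edge list of mentions is hard-coded here in place of data pulled through the Twitter API, and the usernames are fictional.

```python
# Sketch of the "who talks to whom" analysis with NetworkX; the mention
# edges below stand in for data collected via the Twitter API.
import networkx as nx

mentions = [("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
            ("dave", "alice"), ("erin", "alice"), ("erin", "bob")]
G = nx.DiGraph(mentions)  # edge u -> v: user u mentioned user v in a tweet

degree = nx.degree_centrality(G)             # how connected each user is
betweenness = nx.betweenness_centrality(G)   # who bridges conversations

print("degree centrality:", degree)
print("betweenness centrality:", betweenness)
print("most influential:", max(degree, key=degree.get))
```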