ISBN (digital): 9798350375589
ISBN (print): 9798350375596; 9798350375589
This research explores the use of synthetic data to enhance the accuracy of machine learning models predicting the energy production of photovoltaic (PV) systems integrated into buildings. We address the challenge of data scarcity in real-world scenarios by generating a large and diverse dataset using BIMSolar, encompassing a wide range of building types, PV panel models, and installation locations. The synthetic data approach allows us to directly estimate energy production for new scenarios without requiring historical data, enabling a shift from forecasting to prediction. We evaluate the performance of various machine learning models, including Random Forest, Gradient Boosting, and XGBoost, using metrics such as MSE, R², MAE, and MAPE. Notably, XGBoost and Decision Tree models emerge as top performers, demonstrating high accuracy and efficiency while consuming minimal energy during training. This research presents a promising approach to support building decarbonization efforts by providing reliable estimates of PV energy production, ultimately facilitating a transition towards a sustainable energy future.
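The four evaluation metrics named in this abstract have standard definitions, which the following minimal sketch computes in plain Python. The function name `regression_metrics` and the sample values are illustrative assumptions; the abstract does not describe the authors' evaluation code or data.

```python
from statistics import fmean


def regression_metrics(y_true, y_pred):
    """Compute MSE, MAE, MAPE (as a fraction) and R^2 for paired values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = fmean([e * e for e in errors])        # mean squared error
    mae = fmean([abs(e) for e in errors])       # mean absolute error
    mape = fmean([abs(e / t) for e, t in zip(errors, y_true)])

    # R^2 compares residual variation with variation around the mean.
    mean_true = fmean(y_true)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "mape": mape, "r2": r2}


# Hypothetical PV energy values in kWh (illustrative numbers only).
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
metrics = regression_metrics(measured, predicted)
```

MAPE is returned as a fraction rather than a percentage, so a value of 0.05 corresponds to a 5% average relative error.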
The study focuses on the critical role of credit risk management in the financial system, particularly as financial crises become more frequent and complex financial instruments spread. Traditional credit risk assessm...
The work developed aims to identify new applications of machine learning knowledge, such as computer science and engineering for industry, possible is by grouping bibliographic data where natural language texts must p...
Causal effect estimation under networked interference is an important but challenging problem. Available parametric methods are limited in their model space, while previous semiparametric methods, e.g., leveraging neural networks to fit only one single nuisance function, may still encounter misspecification problems under networked interference without appropriate assumptions on the data generation process. To mitigate bias stemming from misspecification, we propose a novel doubly robust causal effect estimator under networked interference, by adapting the targeted learning technique to the training of neural networks. Specifically, we generalize the targeted learning technique into the networked interference setting and establish the condition under which an estimator achieves double robustness. Based on the condition, we devise an end-to-end causal effect estimator by transforming the identified theoretical condition into a targeted loss. Moreover, we provide a theoretical analysis of our designed estimator, revealing a faster convergence rate compared to a single nuisance model. Extensive experimental results on two real-world networks with semisynthetic data demonstrate the effectiveness of our proposed estimators.
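To illustrate what double robustness means, the sketch below implements the classical augmented inverse probability weighting (AIPW) estimator for the simpler setting without interference. This is a swapped-in textbook technique, not the networked, targeted-learning estimator the abstract proposes. The function name `aipw_ate` and all inputs are hypothetical, and the nuisance predictions (`mu0`, `mu1`, `e`) are assumed to come from models fitted elsewhere.

```python
def aipw_ate(y, t, mu0, mu1, e):
    """Classical AIPW estimate of the average treatment effect.

    y   : observed outcomes
    t   : binary treatments (0 or 1)
    mu0 : predicted outcomes under control
    mu1 : predicted outcomes under treatment
    e   : predicted propensity scores P(T=1 | X), strictly in (0, 1)
    """
    n = len(y)
    if n == 0 or not (n == len(t) == len(mu0) == len(mu1) == len(e)):
        raise ValueError("inputs must be non-empty and of equal length")

    total = 0.0
    for yi, ti, m0, m1, ei in zip(y, t, mu0, mu1, e):
        if not 0.0 < ei < 1.0:
            raise ValueError("propensity scores must lie strictly in (0, 1)")
        # Outcome-model estimate plus an inverse-probability-weighted
        # correction built from the outcome-model residuals.
        total += (m1 - m0) + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
    return total / n


# Hypothetical toy data (illustrative numbers only).
y = [3.0, 1.0, 5.0, 2.0]
t = [1, 0, 1, 0]

# Case 1: outcome model fits the observed data, propensities are arbitrary.
ate_good_outcome = aipw_ate(
    y, t, mu0=[1.0, 1.0, 2.0, 2.0], mu1=[3.0, 3.0, 5.0, 5.0],
    e=[0.9, 0.2, 0.7, 0.4],
)

# Case 2: outcome model is uninformative (all zeros), propensities are 0.5.
ate_good_propensity = aipw_ate(
    y, t, mu0=[0.0] * 4, mu1=[0.0] * 4, e=[0.5] * 4,
)
```

In the first call the residual corrections vanish, so the propensity values do not affect the result; in the second the estimator reduces to plain inverse probability weighting. The estimator remains consistent if either nuisance model is correct, which is the property the abstract extends to networked interference.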
As datasets continue to expand, the significance of feature selection in identifying influential features for classification becomes increasingly apparent. Meanwhile, the performance of a classifier has a great impact...
ISBN (print): 9798350381771; 9798350381764
Machine learning models employ data for gathering insights, making decisions, and generating predictions. Because the inference data fed into a model may drift or shift over time, the model's performance may degrade, and the model would then require re-training. However, model evaluation and frequent re-training can be costly because ground truth labeling is expensive. Therefore, monitoring data characteristics before and after model deployment can help choose the appropriate time to re-train the model. This paper proposes a framework of end-to-end data characteristics monitoring within MLOps to provide a solution for smart retraining using a variety of tools for ease of use and cost-effectiveness.
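One common way to monitor a data characteristic without labels is to compare the distribution of a feature at training time with its distribution in production. The sketch below uses the two-sample Kolmogorov-Smirnov statistic for that comparison. The abstract does not say which drift measure or tools the framework uses, so the choice of statistic, the function names, and the `threshold` value are all assumptions made for illustration.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Largest gap between the empirical CDFs of two samples (0 to 1)."""
    if not reference or not current:
        raise ValueError("both samples must be non-empty")
    ref = sorted(reference)
    cur = sorted(current)
    # The maximum gap is attained at one of the observed values.
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(cur, x) / len(cur))
        for x in ref + cur
    )


def needs_retraining(reference, current, threshold=0.3):
    """Flag drift when the KS statistic exceeds a hypothetical threshold."""
    return ks_statistic(reference, current) > threshold


# Hypothetical feature values (illustrative numbers only).
training_values = [1.0, 2.0, 3.0, 4.0]
stable_values = [1.0, 2.0, 3.0, 4.0]
shifted_values = [3.0, 4.0, 5.0, 6.0]

stable_score = ks_statistic(training_values, stable_values)
shifted_score = ks_statistic(training_values, shifted_values)
```

Because the check needs only feature values, it can run before ground truth labels are available, which matches the cost argument made in the abstract. A production system would typically choose the threshold from a significance level or from historical variation rather than a fixed constant.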
ISBN (print): 9783031611391; 9783031611407
The Myers-Briggs Type Indicator (MBTI) typifies personality on the basis of four basic dichotomous traits. It has been used by psychologists with diverse applications in real life and clinical settings. Recently there have been attempts to carry out MBTI indexing with machine learning (ML) techniques applied to several kinds of signals, among them textual data extracted from interactions in social networks. In this paper we apply a battery of well-known ML approaches to the prediction of MBTI categories based on features extracted by natural language processing (NLP) techniques from textual data from a social network devoted to personality evaluation. The results are in agreement with the literature, showing that prediction of the MBTI personality indicator is highly reproducible.
Mental health issues significantly affect society, requiring long-term treatment that is both costly and impacts personal relationships. This research paper proposes a speech-based method to screen for depression risk...
The corrosion degradation of organic coatings in tropical marine atmospheric environments results in substantial economic losses across various [...]. The complexity of a dynamic environment, combined with high costs, extended experimental periods, and limited data, limits the comprehension of this [...]. This study addresses this challenge by investigating the corrosion degradation of damaged organic coatings in a tropical marine environment using an atmospheric corrosion monitoring sensor and a random forest (RF) [...]. For damage simulation, a polyurethane coating applied to a Fe/graphite corrosion sensor was intentionally scratched and exposed to the marine atmosphere for over one [...]. [...] correlation analysis was performed for the collection and filtering of environmental and corrosion current [...]. According to the RF model, the following specific conditions contributed to accelerated degradation: relative humidity (RH) above 80% and temperatures below 22.5 °C, with the risk increasing significantly when RH exceeded 90%. High RH and temperature exhibited a cumulative effect on coating degradation. A high risk of corrosion occurred in the [...]. The RF model was also used to predict the coating degradation process using environmental data as input parameters, with the accuracy showing improvement when the duration of influential environmental ranges was considered.
Adversaries minimally perturb deep learning input data to reduce a learning model's ability to produce domain-specific data-driven recommendations to solve specialized tasks. This vulnerability to adversarial pert...