ISBN (digital): 9798331533663
ISBN (print): 9798331533670
Optimizers are essential in deep learning to minimize error functions and ensure efficient learning. This paper evaluates the performance of prominent optimizers, including SGD, SGD with momentum, RMSprop, AdaGrad, Adam, and AdamW, across varied architectures and datasets. Tasks such as sentiment classification using LSTM, image classification using CNN, and reinforcement learning are examined based on convergence speed, accuracy, and hyperparameter robustness. The results reveal that while SGD exhibits consistent performance with momentum tuning, adaptive optimizers such as Adam achieve faster convergence but may reduce generalization on specific datasets. A hyperparameter sensitivity analysis is conducted, offering practical guidelines for optimizer selection tailored to task requirements and computational constraints.
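To make the contrast between a momentum-based and an adaptive optimizer concrete, here is a minimal sketch of the two update rules in plain Python. It is not the paper's experimental setup: the toy quadratic loss, the starting point, and every hyperparameter value are assumptions chosen only for illustration.

```python
import math

def loss(w):
    # Toy ill-conditioned quadratic (an assumption, not a dataset from the paper).
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)

def grad(w):
    # Analytic gradient of the toy loss above.
    return [10.0 * w[0], w[1]]

def sgd_momentum(w, steps=200, lr=0.05, beta=0.9):
    # Heavy-ball update: v <- beta * v + g, then w <- w - lr * v.
    v = [0.0] * len(w)
    for _ in range(steps):
        g = grad(w)
        v = [beta * vi + gi for vi, gi in zip(v, g)]
        w = [wi - lr * vi for wi, vi in zip(w, v)]
    return w

def adam(w, steps=200, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: bias-corrected running means of the gradient and its square.
    m = [0.0] * len(w)
    s = [0.0] * len(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = [b1 * mi + (1 - b1) * gi for mi, gi in zip(m, g)]
        s = [b2 * si + (1 - b2) * gi * gi for si, gi in zip(s, g)]
        m_hat = [mi / (1 - b1 ** t) for mi in m]
        s_hat = [si / (1 - b2 ** t) for si in s]
        w = [wi - lr * mh / (math.sqrt(sh) + eps)
             for wi, mh, sh in zip(w, m_hat, s_hat)]
    return w

if __name__ == "__main__":
    start = [1.0, 1.0]
    print("initial loss:", loss(start))
    print("SGD+momentum:", loss(sgd_momentum(start)))
    print("Adam:        ", loss(adam(start)))
```

Adam divides each coordinate's step by a running estimate of the gradient magnitude, so its steps are largely insensitive to gradient scale, whereas the momentum update depends directly on the learning rate and curvature. A toy problem like this says nothing about generalization, which is the trade-off the abstract highlights.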
Dysgraphia, a major learning disorder that primarily interferes with writing skills, can hinder the academic progress of children unless recognized at an early stage. The diversity in the symptoms, as well as the emergen...
ISBN (digital): 9798331529635
ISBN (print): 9798331529642
This paper presents a comprehensive analysis of big data, machine learning (ML), and deep learning (DL) methodologies in predictive healthcare analytics, with a focus on their comparative strengths, research gaps, and real-world applications. The study synthesizes findings from recent research papers with the same objective, highlighting the evolution of data-driven technologies in the healthcare sector and the challenges associated with data integration, privacy, and real-time processing. Big data analytics provides foundational insights from vast healthcare datasets, ML algorithms enhance predictive accuracy on structured data, and DL models achieve high precision even in complex tasks such as medical imaging and real-time monitoring with IoT embedded systems. This article introduces a novel hybrid approach that combines the strengths of all these approaches, employing common data for heterogeneous data integration, federated learning for privacy, and a feedback-driven adaptive system for continuous improvement. Quantitative comparisons illustrate the effectiveness of each approach: DL demonstrates high accuracy, while ML offers flexibility with a moderate rate. After discussing the research gaps of current methodologies, the paper discusses the scalability, adaptability, and privacy enhancements of the proposed conceptual model, which aims to make healthcare analytics more precise and proactive. Future investigations should enhance these adaptive learning solutions and extend federated learning approaches to different interdisciplinary domains.
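The abstract names federated learning as the privacy mechanism but does not say how client updates are combined. As a minimal sketch, assuming the widely used federated averaging rule (a swapped-in illustration, not confirmed as the paper's method), a server could merge locally trained parameters without seeing raw patient records. The function name and the flat-list parameter layout are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average per-client parameter vectors, weighted by local dataset size.

    client_weights: list of equal-length lists of floats (one per client).
    client_sizes:   number of local training examples at each client.
    Only parameters leave each site; raw records never do.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]

if __name__ == "__main__":
    # Two hypothetical hospitals: the second holds three times as much data.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))  # [2.5, 3.5]
```

Weighting by dataset size keeps a small site from pulling the shared model as strongly as a large one. Averaging alone does not guarantee privacy, so real deployments typically add further safeguards.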
ISBN (digital): 9798331534622
ISBN (print): 9798331534639
This study explores methods for improving the performance of large language models based on GPT-4 under a multi-task learning framework and conducts experiments on two tasks: text classification and automatic summary generation. Through the combined design of shared feature extractors and task-specific modules, we achieve knowledge sharing and optimization of multiple tasks in the same model. The experiment uses multiple subtasks of the GLUE dataset to compare the performance of the multi-task model with the single-task GPT-4, the multi-task version of GPT-3, the basic BERT model, and the classic Bi-LSTM with Attention model. The results show that the proposed multi-task learning model outperforms the comparison models in text classification accuracy and in the ROUGE score of summary generation, demonstrating the advantages of multi-task learning in improving model generalization and collaborative learning between tasks. The model maintains a stable loss convergence rate during training, showing good learning efficiency and adaptability to the test set. This study verifies the applicability of the multi-task learning framework to large language models, especially in improving the model's ability to balance different tasks. In the future, with the combination of large language models and multimodal data and the application of dynamic task adjustment technology, frameworks based on multi-task learning are expected to play a greater role in practical applications across fields and to provide new ideas for the development of general artificial intelligence.
ISBN (print): 9798350337440
In situ imageomics is a new approach to studying ecological, biological, and evolutionary systems in which large image and video data sets are captured in the wild and machine learning methods are used to infer biological traits of individual organisms, animal social groups, species, and even whole ecosystems. Monitoring biological traits over large spaces and long periods of time could enable new, data-driven approaches to wildlife conservation, biodiversity, and sustainable ecosystem management. However, to accurately infer biological traits, machine learning methods for images require voluminous, high-quality data. Adaptive, data-driven approaches are hamstrung by the speed at which data can be captured and processed. Camera traps and unmanned aerial vehicles (UAVs) produce voluminous data, but lose track of individuals over large areas, fail to capture social dynamics, and waste time and storage on images with poor lighting and view angles. In this vision paper, we make the case for a research agenda for in situ imageomics that depends on significant advances in autonomic and self-aware computing. Specifically, we seek autonomous data collection that manages camera angles, aircraft positioning, conflicting actions for multiple traits of interest, energy availability, and cost factors. Given the tools to detect objects and identify individuals, we propose a research challenge: Which optimization model should the data collection system employ to accurately identify, characterize, and draw inferences from biological traits while respecting a budget? Using zebra and giraffe behavioral data collected over three weeks at the Mpala Research Centre in Laikipia County, Kenya, we quantify the volume and quality of data collected using existing approaches. Our proposed autonomic navigation policy for in situ imageomics collection has an F1 score of 82% compared to an expert pilot, and provides greater safety and consistency, suggesting great potential for state-of-the-art autono
Rising electricity consumption increases the demand for energy and, with it, the complexity of energy systems every day. Complex electrical energy systems consist of many pieces of equipment spanning generation, transmission, and distribution; many power and generation systems do not grow at the rate that consumption requires and are vulnerable to damage. This paper focuses on modern methods for the detection, classification, and analysis of many kinds of faults in electrical power systems using machine learning algorithms. The work uses Python and the scikit-learn library, which provides algorithms for supervised machine learning, and processes data with the data science libraries NumPy, pandas, and matplotlib. Detection and classification are evaluated using nine machine learning algorithms: support vector machine (SVM), logistic regression, linear regression, polynomial regression, random forest (RF), k-nearest neighbors (KNN), multi-layer perceptron (MLP), naive Bayes, and decision tree. All models are compared to determine which algorithms produce the best models.
ISBN (digital): 9798350349900
ISBN (print): 9798350349917
Brain tumors are masses of cancer-causing cells that, when they develop inside a person's brain, can become life-threatening. The major objective of this study is to detect brain tumors in the early stages of the disease using a deep learning algorithm, YOLO v8, in the hope of increasing patient survival rates. The dataset, acquired from Kaggle, contains annotated MRI images of brain tumors that belong to one of four classes: meningioma, glioma, pituitary, or no tumor. To optimize the training process, diverse sizes and optimizers are used, and the findings of this extensive experimentation are recorded. The findings highlight the importance of optimization strategies in deep learning-based analysis of medical images and demonstrate the capability of YOLO v8 as a robust tool for brain tumor detection.
ISBN (print): 9789819770007; 9789819770014
Long-tailed classification on graphs is ubiquitous yet challenging in many real-world applications. Recently, oversampling approaches have shown promising performance on imbalanced classification tasks. However, most oversampling methods determine edges based on a similarity principle or an edge generator. This leads to generated nodes with poor topological diversity and high homogeneity, which seriously affects the performance of the classifier. To bridge this gap, this paper presents an oversampling method based on a Graph Topology Assisted Classifier GAN, named GT-ACGAN. Unlike existing methods, we present a latent variable-based graph topology generation method to learn the topology of the graph. This approach first pretrains a VGAE model on the original graph and obtains the original graph's latent variables from the VGAE encoder. Then, random noise is transformed into latent variables of the fake nodes by neural networks. Finally, the VGAE decoder reconstructs these latent variables into a new adjacency matrix. We then use the latent variable-based generation method as GT-ACGAN's generator to generate balanced graph data, and GT-ACGAN's discriminator is trained to distinguish the authenticity of each node and its category. This approach helps the model generate graph data with topological diversity and correct erroneous data generation. To validate the effectiveness of GT-ACGAN, extensive experiments were conducted on long-tailed graphs manually constructed from three classic citation network datasets (Cora, Citeseer, and PubMed), using three classic graph classifiers (GCN, GraphSAGE, and GAT). GT-ACGAN outperforms state-of-the-art oversampling algorithms on long-tailed node classification tasks.
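The step in which a VGAE decoder turns latent variables into an adjacency matrix can be sketched as follows. This assumes the inner-product decoder commonly paired with VGAE, where the edge probability between nodes i and j is sigmoid(z_i · z_j). The abstract does not specify the decoder, so the function names and the 0.5 threshold are illustrative assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def inner_product_decoder(z):
    """Edge probabilities P[i][j] = sigmoid(z_i . z_j) for latent vectors z."""
    n = len(z)
    return [
        [sigmoid(sum(a * b for a, b in zip(z[i], z[j]))) for j in range(n)]
        for i in range(n)
    ]

def threshold_adjacency(probs, tau=0.5):
    """Binarize edge probabilities; tau is an assumed cut-off, self-loops dropped."""
    return [
        [1 if (i != j and p > tau) else 0 for j, p in enumerate(row)]
        for i, row in enumerate(probs)
    ]

if __name__ == "__main__":
    # Three hypothetical nodes: the first two have aligned latent vectors.
    latents = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
    print(threshold_adjacency(inner_product_decoder(latents)))
```

Because edges are decoded from latent variables rather than from raw feature similarity, latent vectors for synthetic nodes produced from noise can yield neighborhoods that differ from those of existing minority-class nodes, which is the diversity argument the abstract makes.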
ISBN (digital): 9798350391183
ISBN (print): 9798350391190
The increasing demand for digital content is putting a strain on network infrastructures, prompting a shift towards edge server architecture, which aims to decentralize data processing and reduce content delivery latency. This study investigates edge server caching optimization using Deep Reinforcement Learning (DRL). The motivation stems from the need to alleviate the burden on origin servers and to expedite content delivery. An edge server is located closer to the users in a network, rather than relying on a centralized server. In DRL, an agent learns to make decisions through trial and error. Experiments conducted across different simulated user request patterns show that modifications can enhance cache hit rates.
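Cache hit rate, the metric reported above, can be measured by replaying a request trace through a cache policy. The sketch below uses a plain LRU policy as a baseline and a Zipf-like synthetic request pattern. Both are assumptions for illustration and are not the DRL agent or the simulated patterns from the study.

```python
import random
from collections import OrderedDict

def lru_hit_rate(requests, capacity):
    """Replay a request trace through an LRU cache and return the hit rate."""
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    cache = OrderedDict()
    hits = 0
    for item in requests:
        if item in cache:
            hits += 1
            cache.move_to_end(item)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used item
            cache[item] = True
    return hits / len(requests) if requests else 0.0

def skewed_requests(n_items, n_requests, seed=0):
    """Synthetic trace where item popularity falls off as 1/rank (assumed)."""
    rng = random.Random(seed)
    weights = [1.0 / rank for rank in range(1, n_items + 1)]
    return rng.choices(range(n_items), weights=weights, k=n_requests)

if __name__ == "__main__":
    trace = skewed_requests(n_items=100, n_requests=5000)
    for capacity in (5, 20, 50):
        print(capacity, round(lru_hit_rate(trace, capacity), 3))
```

A learned caching policy would replace the fixed eviction rule with an agent's decision and could be scored with the same replay loop, so a baseline like this gives a reference point for judging any improvement in hit rate.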
To obtain the maximum benefit when an Electrical Fused Magnesium Group Furnace (EFMGF) participates in the power system Frequency Regulation Auxiliary Service (FRAS), the operation control strategy for participation must be developed in line with the furnace group's own operating characteristics, and a model for optimizing the Day-ahead Reported Capacity (DRC) of the EFMGF in Primary Frequency Regulation (PFR) must account for its regulation characteristics and multiple uncertainties. Based on an analysis of the energy use, operating, and adjustment characteristics, a control mechanism for EFMGF participation in PFR is proposed, and the Frequency Regulation (FR) characteristics of the Electrical Fused Magnesium Furnace (EFMF) are derived accordingly. With the goal of maximizing the overall profitability of the Electrical Fused Magnesium Enterprise (EFME), and taking into account product quality, the limits on energy requirements, and the demand for FR, an optimization model for EFMGF participation in PFR is established to optimize the DRC. To address multiple uncertainties, such as the uncertainty of time and power during transitions between EFMF operating conditions and the randomness of the frequency regulation signals (FRSs), a two-dimensional scenario matrix is constructed, which makes it possible to solve the optimization model with its complex uncertainty factors. Simulation cases verify the effectiveness of the proposed control strategy and show that the proposed optimization model can obtain the optimal reported capacity.