Text classification for spam detection relies on the principles of natural language processing (NLP) to automate the identification and categorization of messages int...
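The snippet is truncated, but the technique it names (NLP-based text classification for spam filtering) has a well-known minimal form. As a hedged illustration only, not this paper's pipeline, a bag-of-words baseline might look like this:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data; a real study would use a labeled corpus of messages.
messages = [
    "Win a FREE prize now!!!",
    "Meeting moved to 3pm",
    "Free entry, claim your cash reward",
    "Lunch tomorrow?",
]
labels = ["spam", "ham", "spam", "ham"]

# TF-IDF features feeding a Naive Bayes classifier, a common spam baseline.
clf = make_pipeline(TfidfVectorizer(), MultinomialNB())
clf.fit(messages, labels)
print(clf.predict(["Claim your free prize"]))  # likely ['spam']
```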
A long-standing goal in deep learning has been to characterize the learning behavior of black-box models in a more interpretable manner. For graph neural networks (GNNs), considerable advances have been made in formalizing what functions they can represent, but whether GNNs will learn desired functions during the optimization process remains less clear. To fill this gap, we study their training dynamics in function space. In particular, we find that the gradient descent optimization of GNNs implicitly leverages the graph structure to update the learned function, as quantified by a phenomenon we call kernel-graph alignment. We provide theoretical explanations for the emergence of this phenomenon in the overparameterized regime and empirically validate it on real-world GNNs. This finding offers new interpretable insights into when and why the learned GNN functions generalize, highlighting their limitations on heterophilic graphs. Practically, we propose a parameter-free algorithm that directly uses a sparse matrix (i.e., the graph adjacency matrix) to update the learned function. We demonstrate that this embarrassingly simple approach can be as effective as GNNs while being orders of magnitude faster.
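The abstract does not spell out its parameter-free algorithm, so the sketch below only illustrates the general idea of updating a predicted function directly with the sparse adjacency matrix, in the spirit of classic label propagation; the function and argument names are hypothetical, and this is not claimed to be the paper's exact method.

```python
import numpy as np
import scipy.sparse as sp

def propagate_labels(adj, labels, train_mask, steps=10, alpha=0.9):
    """Spread training labels over the graph using only a sparse adjacency
    matrix, with no learned parameters (a label-propagation-style sketch).

    adj:        scipy.sparse (n, n) adjacency matrix
    labels:     (n, c) one-hot label matrix (rows outside train_mask unused)
    train_mask: (n,) boolean array marking labeled nodes
    """
    # Row-normalize the adjacency so each step averages over neighbors.
    deg = np.asarray(adj.sum(axis=1)).ravel()
    deg[deg == 0] = 1.0
    norm_adj = sp.diags(1.0 / deg) @ adj

    f = np.zeros(labels.shape, dtype=float)
    f[train_mask] = labels[train_mask]
    for _ in range(steps):
        f = alpha * (norm_adj @ f) + (1.0 - alpha) * f
        f[train_mask] = labels[train_mask]  # clamp the known labels
    return f.argmax(axis=1)  # predicted class per node
```

Because each update is a single sparse matrix product, every step costs time proportional to the number of edges, which is consistent with the abstract's claim that such an approach can be orders of magnitude faster than training a GNN.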
Corporations today are moving towards data-driven decision making, utilizing data to build applications that assist them with business intelligence. Data assists them with improving their e...
Dragon fruit is a popular fruit with a unique appearance and taste. It is an important fruit in export and domestic markets. However, its maturity detection is still a challenging task due to the complexity of its phy...
Airline delay prediction plays a crucial role in the aviation industry, enabling airlines to optimize operations and improve passenger satisfaction. In this research, we propose a comprehensive framework for airline d...
Solar energy is the most abundant natural resource on Earth and one of the best sources for producing electricity. Given humanity's growing needs, the demand for electricity is enormous, so this model is a step towards...
Innovations in technology over the last decade have led to the generation of colossal amounts of medical data at comparably low cost. Medical data should be collected with the utmost care. Sometimes the data have many features, but not all of the features play an important role in drawing the relations to the mining task. For the training of machine learning algorithms, not all attributes in the data set are relevant: some characteristics may be negligible, and some may not influence the outcome of the forecast. The load on machine learning algorithms can be reduced by ignoring or removing the irrelevant attributes. Reducing the attributes, however, carries the risk of information loss. In this research work, an Enhanced Principal Component Analysis (EPCA) is proposed, which reduces the dimensions of the medical dataset while taking paramount care not to lose important information, thereby achieving good and enhanced outcomes. Prominent dimensionality reduction techniques such as Principal Component Analysis (PCA), Singular Value Decomposition (SVD), Partial Least Squares (PLS), Random Forest, Logistic Regression, Decision Tree and the proposed EPCA are investigated on the following Machine Learning (ML) algorithms: Support Vector Machine (SVM), Artificial Neural Networks (ANN), Naïve Bayes (NB) and Ensemble ANN (EANN), using statistical metrics such as F1 score, precision, accuracy and recall. To optimize the distribution of the data in the low-dimensional representation, EPCA maps the data directly to a space with fewer dimensions. This is a result of feature correlation, which makes it easier to recognize patterns. Additionally, because the dataset under consideration was multicollinear, EPCA helped speed up computation by lowering the data's dimensionality, thereby enhancing the classification model's accuracy. For these reasons, the experimental results showed that the proposed EPCA dimensionality reduction technique per...
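EPCA itself is not specified in this snippet, so the sketch below only shows the shape of the evaluation the abstract describes, with standard scikit-learn PCA standing in for EPCA and a bundled dataset standing in for the medical data; it is an illustration under those assumptions, not the authors' implementation.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in for a multicollinear medical dataset (30 correlated features).
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Scale, project down to 10 principal components, then classify with an SVM.
model = make_pipeline(StandardScaler(), PCA(n_components=10), SVC())
model.fit(X_tr, y_tr)

# Precision, recall, F1 and accuracy, the metrics named in the abstract.
print(classification_report(y_te, model.predict(X_te)))
```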
The lockdown due to Covid-19 has resulted in us relying heavily on technology to communicate and see each other. That, in turn, has led to an explosion of applications that provide audio and video conferencing solutio...
This study addresses the significant public health concern of glaucoma, a collection of eye conditions resulting in optic nerve damage and progressive vision loss. To conduct our research, we employed a dataset obtain...
To examine customer churn in the telecom sector, we used four classification algorithms: Decision Tree (DT), Random Forest (RF), Logistic Regression (LR), and a Hybrid Algorithm (HA). The dataset contained information abo...
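As a rough sketch of the comparison this abstract sets up, the three standard classifiers could be evaluated side by side as below; the hybrid algorithm is not specified in the snippet and is therefore omitted, and the feature matrix X and labels y are assumed to be preprocessed churn data.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def compare_churn_models(X, y):
    """Print 5-fold cross-validated accuracy for DT, RF, and LR."""
    models = {
        "DT": DecisionTreeClassifier(random_state=0),
        "RF": RandomForestClassifier(random_state=0),
        "LR": LogisticRegression(max_iter=1000),
    }
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```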