In focusing tasks on moving targets, traditional methods that rely on maximizing contrast struggle to capture moving objects due to insufficient focusing speed. Deep learning-based methods have attempted to directly p...
Contrastive learning has been widely applied in sequential recommendation to improve recommendation performance. Existing contrastive learning methods focus on adjusting the number of views of positive and negative s...
Advances in electronic devices and technology over the past decade, along with a rise in security and surveillance incidents intruding on users' private lives, call into question the existing systems being...
We report on progress in modelling and solving Puzznic, a video game requiring the player to plan sequences of moves to clear a grid by matching blocks. We focus here on levels with no moving blocks. We compare a plan...
Traditional autofocus methods search for the optimal focal distance (FD) by evaluating image quality from focal stacks, resulting in time-consuming focusing processes. Recently, deep learning has been adopted for sin...
Common explanations for shortcut learning assume that the shortcut improves prediction under the training distribution but not in the test distribution. Thus, models trained via the typical gradient-based optimization of cross-entropy, which we call default-ERM, utilize the shortcut. However, even when the stable feature determines the label in the training distribution and the shortcut does not provide any additional information, like in perception tasks, default-ERM still exhibits shortcut learning. Why are such solutions preferred when the loss for default-ERM can be driven to zero using the stable feature alone? By studying a linear perception task, we show that default-ERM's preference for maximizing the margin leads to models that depend more on the shortcut than the stable feature, even without overparameterization. This insight suggests that default-ERM's implicit inductive bias towards max-margin is unsuitable for perception tasks. Instead, we develop an inductive bias toward uniform margins and show that this bias guarantees dependence only on the perfect stable feature in the linear perception task. We develop loss functions that encourage uniform-margin solutions, called margin control (MARG-CTRL). MARG-CTRL mitigates shortcut learning on a variety of vision and language tasks, showing that better inductive biases can remove the need for expensive two-stage shortcut-mitigating methods in perception tasks.
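As a rough illustration of the margin-control idea described above, the sketch below adds a penalty on the spread of per-example margins to a standard logistic loss, nudging the model toward uniform margins instead of maximal ones. This is an assumption-driven stand-in, not the paper's exact MARG-CTRL formulation; it assumes a binary classifier with labels in {-1, +1} and PyTorch.

import torch
import torch.nn.functional as F

def uniform_margin_loss(logits, labels, reg=0.1):
    # labels are expected in {-1, +1}; logits has shape (batch,).
    margins = labels * logits
    ce = F.softplus(-margins).mean()                    # logistic loss, log(1 + exp(-m))
    spread = ((margins - margins.mean()) ** 2).mean()   # discourage non-uniform margins
    return ce + reg * spread

# Usage: loss = uniform_margin_loss(model(x).squeeze(-1), y); loss.backward()

The penalty term is what carries the alternative inductive bias: driving every example toward a similar margin removes the incentive to trade stable-feature dependence for a larger margin on shortcut-aligned examples.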
Human Activity Recognition (HAR) is essential in various applications, including wellness tracking, automated residences, and fitness monitoring. In the past few decades, sensor-based HAR has become increasingly popul...
This research investigates the impact of missing data on the performance of machine learning algorithms, with a particular focus on the MIMIC-IV dataset. This project aims to investigate the extent to which missing da...
ISBN (digital): 9798331507695
ISBN (print): 9798331507701
While comparisons between Apache Hadoop and Apache Spark are well-documented, there has been limited research comparing Apache Spark with Apache Airflow, especially in terms of speed and memory usage. With Apache Airflow's recent introduction of dynamic task mapping, which performs similar functions to Apache Spark's map operation, a detailed comparison between the two tools has become increasingly relevant. A comparison in these areas would provide valuable insights for the big data science community, helping determine which methods are better suited for tasks requiring high speed and efficient memory usage. This study focuses on comparing the Apache Spark Map function and Apache Airflow Dynamic Task Mapping function on two key metrics: memory utilization and computation speed. Specifically, we evaluate their performance in sorting formatted electrocardiogram sensor data. We hypothesize that Apache Spark will demonstrate faster processing times due to its advanced in-memory processing and sorting algorithms. However, this speed advantage is expected to come with higher memory usage compared to Apache Airflow. Our findings provide actionable insights into the strengths and limitations of these tools, guiding data scientists and engineers in choosing the most suitable framework for specific big data processing tasks. These results are particularly relevant for large-scale data sorting and transformation operations, contributing to informed decision-making in the big data science community.
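For readers unfamiliar with the two APIs being compared, the sketch below contrasts a Spark map-and-sort over ECG records with an Airflow DAG that uses dynamic task mapping via .expand(). File names, the chunking scheme, and the sorting key are illustrative assumptions, not the study's actual pipeline; it assumes PySpark and Airflow 2.4 or later.

# --- PySpark: map and sort ECG records in memory ---
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ecg-sort").getOrCreate()
records = spark.sparkContext.textFile("ecg_data.csv") \
    .map(lambda line: line.strip().split(","))      # Spark's map transformation
sorted_records = records.sortBy(lambda r: r[0])     # distributed in-memory sort by first field
sorted_records.saveAsTextFile("ecg_sorted")

# --- Airflow: dynamic task mapping, one mapped task instance per chunk ---
import pendulum
from airflow.decorators import dag, task

@dag(schedule=None, start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def ecg_sort_dag():
    @task
    def list_chunks():
        return ["chunk_0.csv", "chunk_1.csv", "chunk_2.csv"]  # hypothetical chunk files

    @task
    def sort_chunk(path: str):
        with open(path) as f:
            rows = [line.strip().split(",") for line in f]
        return sorted(rows, key=lambda r: r[0])

    sort_chunk.expand(path=list_chunks())            # expands into one task per chunk

ecg_sort_dag()

The contrast mirrors the study's framing: Spark performs the sort inside its own distributed in-memory engine, while Airflow fans the work out across scheduler-managed task instances, which is why the speed versus memory trade-off is worth measuring.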
Today, more people use social media to express their opinions and emotions. Social media contains many types of text, including text that conveys a tendency toward depression or suicide. We use sentiment analysis to detect suicidal texts because, if detected, they could save many lives and many families. In this research, our objective is to explore a method that is both high-performing and less time-consuming. We design experiments covering the 30 combinations of five machine learning models and six feature engineering methods. All experiments use accuracy and total time for model generation as metrics. We use deep neural networks with GloVe embeddings as a comparator because this combination performed well on this dataset in a Kaggle competition. From the experimental results, we find that the combination that is fast to generate and has good accuracy is Random Forest with TF-IDF, achieving 0.897 accuracy in 145 seconds.
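Below is a minimal sketch of the best-performing combination reported above, TF-IDF features with a Random Forest classifier, using scikit-learn. The file name, column names, and hyperparameters are assumptions; the paper's exact preprocessing and dataset split are not given in the abstract.

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline

df = pd.read_csv("suicide_detection.csv")            # hypothetical file and column names
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42)

model = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=20000)),   # sparse TF-IDF features
    ("rf", RandomForestClassifier(n_estimators=100, n_jobs=-1, random_state=42)),
])
model.fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))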