How much should you charge someone to live in your house? Or how much would you pay to live in someone else's house? Would you pay more or less for a planned vacation or for a spur-of-the-moment getaway? Answering these questions isn't easy. And the struggle to do so, my colleagues and I discovered, was preventing potential rentals from getting listed on our site, Airbnb, the company that matches available rooms, apartments, and houses with people who want to book them. In focus groups, we watched people go through the process of listing their properties on our site and get stumped when they came to the price field. Many would take a look at what their neighbors were charging and pick a comparable price; this involved opening a lot of tabs in their browsers and figuring out which listings were similar to theirs. Some people had a goal in mind before they signed up, maybe to make a little extra money to help pay the mortgage or defray the costs of a vacation. So they set a price that would help them meet that goal without considering the real market value of their listing. And some people, unfortunately, just gave up. Clearly, Airbnb needed to offer people a better way: an automated source of pricing information to help hosts come to a decision. That's why we started building pricing tools in 2012 and have been working to make them better ever since. This June, we released our latest improvements. We started doing dynamic pricing, that is, offering new price tips daily based on changing market conditions. We tweaked our general pricing algorithms to consider some unusual, even surprising characteristics of listings. And we've added what we think is a unique approach to machine learning that lets our system not only learn from its own experience but also take advantage of a little human intuition when necessary.
Deep learning has emerged as the preferred modeling approach for automatic ECG analysis. In this study, we investigate three elements aimed at improving the quantitative accuracy of such systems. These components consistently enhance performance beyond the existing state-of-the-art, which is predominantly based on convolutional models. Firstly, we explore more expressive architectures by exploiting structured state space models (SSMs). These models have shown promise in capturing long-term dependencies in time series data. By incorporating SSMs into our approach, we not only achieve better performance, but also gain insights into long-standing questions in the field. Specifically, for standard diagnostic tasks, we find no advantage in using higher sampling rates such as 500 Hz compared to 100 Hz. Similarly, extending the input size of the model beyond 3 seconds does not lead to significant improvements. Secondly, we demonstrate that self-supervised learning using contrastive predictive coding can further improve the performance of SSMs. By leveraging self-supervision, we enable the model to learn more robust and representative features, leading to improved analysis accuracy. Lastly, we depart from synthetic benchmarking scenarios and incorporate basic demographic metadata alongside the ECG signal as input. This inclusion of patient metadata departs from the conventional practice of relying solely on the signal itself. Remarkably, this addition consistently yields positive effects on predictive performance. We firmly believe that all three components should be considered when developing next-generation ECG analysis algorithms.
Predicting solar energy data is crucial for the effective use of renewable energy that is freely and abundantly available in nature. Solar energy data are widely available, but they must be carefully prepared and arranged for modelling. In this work, typical meteorological year (TMY) data made available by the Korea Institute of Energy Research (KIER) and the National Renewable Energy Laboratory (NREL) are used for modelling in different phases. TMY data from KIER at a single-point location and at multiple locations are initially used to train machine learning (ML) algorithms. Later, the TMY data from NREL and KIER are combined and then modelled using radius nearest neighbour (RNN), decision tree regressor (DTR), random forest regressor (RFR), and X-gradient boosting (XGB) algorithms. The solar energy parameters modelled in this work are dew point temperature (DPT), dry bulb temperature (DBT), relative humidity (RH), surface pressure (SP), wind speed (WS), and solar insolation of horizontal plane (IHP). Quantitative analysis of the algorithms is also performed in each stage of the work. The modelling indicates that DBT, DPT, RH, and SP can be predicted with a minimum accuracy of over 90% in each stage. WS and IHP are predicted more accurately from single-source TMY data than from the combined data. RFR and XGB perform best overall, as they provide good accuracy for the WS and IHP data as well. RNN and DTR achieved 100% accuracy in training, while RFR and XGB showed slightly lower training accuracy because they avoid overfitting. With RNN/DTR, the training errors are 0% in all cases, but errors do appear in testing. In some cases, such as DPT, the RFR/XGB training error is up to 3%, while testing errors are up to 5% for RNN/DTR and up to 7.5% for RFR/XGB. For RH modelling, RFR/XGB training errors are at most 6%; testing errors are up to 11% for RNN/DTR but only up to 7.5% for RFR/XGB, which indicates the robustness of RFR/XGB. It is
Recent advances in single-cell technologies have enabled high-resolution characterization of tissue and cancer compositions. Although numerous tools for dimension reduction and clustering are available for single-cell data analyses, these methods often fail to simultaneously preserve local cluster structure and global data geometry. To address these challenges, we developed a novel analysis framework, Single-Cell Path Metrics Profiling (scPMP), using power-weighted path metrics, which measure distances between cells in a data-driven way. Unlike Euclidean distance and other commonly used distance metrics, path metrics are density sensitive and respect the underlying data geometry. By combining path metrics with multidimensional scaling, a low-dimensional embedding of the data is obtained which preserves both the global data geometry and cluster structure. We evaluate the method both for clustering quality and geometric fidelity, and it outperforms current scRNAseq clustering algorithms on a wide range of benchmarking data sets. Advancements in single-cell technologies with the ability to measure gene expression at the cellular level have provided an unprecedented opportunity to investigate the cell type (T cells, B cells, etc.) and cell state diversity (active T cells and exhausted T cells) within tissues and cancers. However, analyzing this complex high-dimensional data when the noise level is high requires sophisticated tools to effectively extract useful biological information and faithfully visualize the data in a low-dimensional space (2- or 3-D). Existing computational methods such as dimension reduction and clustering (grouping similar cells together) for single-cell data struggle to simultaneously preserve local group structure and global data geometry (developmental relationship between cell types). To tackle this problem, we've developed a new analysis framework called scPMP (Single-Cell Path Metrics Profiling) based on a unique approach to measure distances betwe
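To make the idea of a power-weighted path metric concrete, here is a minimal sketch. It is not the scPMP implementation: it assumes the common definition in which each hop between two points costs the Euclidean distance raised to a power p, the cheapest path is taken over all routes through the data, and the result is raised to 1/p. The function name and the all-pairs Floyd-Warshall search are illustrative choices, and the multidimensional scaling step is omitted.

```python
import math


def power_weighted_path_metric(points, p=2.0):
    """All-pairs power-weighted path distances (illustrative sketch).

    Each hop costs (Euclidean distance) ** p; the path metric is the
    cheapest total cost over all paths through the points, raised to 1/p.
    """
    n = len(points)
    # Direct hop costs between every pair of points.
    cost = [[math.dist(a, b) ** p for b in points] for a in points]
    # Floyd-Warshall: allow each point k in turn as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = cost[i][k] + cost[k][j]
                if via_k < cost[i][j]:
                    cost[i][j] = via_k
    return [[c ** (1.0 / p) for c in row] for row in cost]


# Three closely spaced points and one far away, all on a line.
cells = [(0.0,), (1.0,), (2.0,), (10.0,)]
dist = power_weighted_path_metric(cells, p=2.0)
print(dist[0][2], dist[0][3])
```

With p = 1 the result reduces to the ordinary Euclidean distance. With p > 1, paths made of many short hops become cheaper than one long hop, so points connected through a dense region end up closer than their straight-line distance suggests, which is the density sensitivity described above.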
Finding the strongly connected components (SCCs) and the diameter of a directed network plays a key role in a variety of machine learning and control theory problems. In this article, we provide for the first time a scalable distributed solution for these two problems by leveraging dynamical consensus-like protocols to find the SCCs. The proposed solution has a time complexity of $O(N D\, d_{\max}^{\text{in-degree}})$, where $N$ is the number of vertices in the network, $D$ is the (finite) diameter of the network, and $d_{\max}^{\text{in-degree}}$ is the maximum in-degree of the network. Additionally, we prove that our algorithm terminates in $D + 2$ iterations, which allows us to retrieve the finite diameter of the network. We perform exhaustive simulations showing that our algorithm outperforms the state of the art on several random networks, including Erdős-Rényi, Barabási-Albert, and Watts-Strogatz networks.
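The following is a small, centralized sketch of how iterative neighbor-to-neighbor propagation can expose both SCCs and a diameter-like quantity. It is not the algorithm from the article: it assumes each node simply merges the ancestor sets of its in-neighbors in synchronous rounds, and the function name `scc_by_propagation` is hypothetical.

```python
def scc_by_propagation(num_nodes, edges):
    """Find SCCs by synchronously propagating ancestor sets (sketch).

    Returns the components as sorted lists and the number of rounds run,
    including the final round in which nothing changed.
    """
    in_neighbors = [[] for _ in range(num_nodes)]
    for src, dst in edges:
        in_neighbors[dst].append(src)

    # ancestors[v] holds every node currently known to reach v.
    ancestors = [{v} for v in range(num_nodes)]
    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        # Snapshot so that all nodes update from the previous round's state.
        snapshot = [set(s) for s in ancestors]
        for v in range(num_nodes):
            for u in in_neighbors[v]:
                if not snapshot[u] <= ancestors[v]:
                    ancestors[v] |= snapshot[u]
                    changed = True

    # u and v share an SCC exactly when each one reaches the other.
    components = set()
    for v in range(num_nodes):
        components.add(frozenset(u for u in ancestors[v] if v in ancestors[u]))
    return sorted(sorted(c) for c in components), rounds


# A 3-cycle (0 -> 1 -> 2 -> 0) with a tail node 3 hanging off node 2.
components, rounds = scc_by_propagation(4, [(0, 1), (1, 2), (2, 0), (2, 3)])
print(components, rounds)
```

After t rounds each node knows every ancestor within t hops, so the loop stops one round after the longest finite shortest path has been covered. This is the sense in which the iteration count of a propagation scheme carries information about the network's diameter.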
Path finding is used to solve the problem of finding a traversable path through an environment with obstacles. This problem arises in many different fields of study, and these areas rely on fast and efficient path finding algorithms. This paper aims to describe and review state-of-the-art optimization techniques used for path finding and to compare their performance. Moreover, special attention is paid to how the proposed approaches are tested on different test cases: whether the test cases are automatically generated or are benchmark instances. The review highlights the importance of automatic test case generation for testing different path finding algorithms.
In many non-stationary environments, machine learning algorithms confront distribution shift. Previous domain adaptation methods have achieved great success. However, they lose robustness in noisy environments where the examples of the source domain are corrupted by label noise, feature noise, or open-set noise. In this paper, we report our attempt toward achieving noise-robust domain adaptation. We first give a theoretical analysis and find that different noises have disparate impacts on the expected target risk. To eliminate the effect of source noises, we propose offline curriculum learning that minimizes a newly defined empirical source risk. We suggest a proxy distribution-based margin discrepancy to gradually decrease the noisy distribution distance and so reduce the impact of source noises. We propose an energy estimator for assessing the outlier degree of open-set-noise examples to counter their harmful influence. We also suggest robust parameter learning to mitigate the negative effect further and learn domain-invariant feature representations. Finally, we seamlessly transform these components into an adversarial network that performs efficient joint optimization for them. A series of empirical studies on the benchmark datasets and the COVID-19 screening task show that our algorithm remarkably outperforms the state-of-the-art, with over 10% accuracy improvements in some transfer tasks.
Artificial neural networks are a powerful tool for approximating spatial and temporal functions. This study introduces a novel approach for modeling non-Newtonian fluid flows by minimizing a proposed power loss metric, which aligns with the variational formulation of boundary value problems in hydrodynamics and extends the classical Lagrange variational principle. The method is distinguished by its data-free nature, enabling problem-solving through 2D or 3D images of the flow domain. Validation was performed using both multi-layer perceptrons and U-Net architectures, with results compared against analytical and numerical benchmarks. The method achieved a relative error of 1.41% in comparison with the analytical solution for non-Newtonian fluids. The power loss formulation offers a clear advantage by simplifying the modeling process and enhancing interpretability. Notably, the proposed method improves on existing techniques by providing algorithmic simplicity and universality, with applications ranging from blood flow modeling in vessels and tissues to broader hydrodynamic scenarios.
Smart home systems have become more and more prevalent in recent years. On the one hand, they make our everyday life more convenient; on the other hand, they suffer from two notorious security problems, namely, the open-port problem and the overprivilege problem, which make their security situation extremely worrying. In this article, we propose HomeShield, a novel credential-less authentication framework that shields smart home systems by effectively defending against the attacks resulting from these two security problems without the need for sensitive credentials. We further detail an implementation of HomeShield based on the side channels that are publicly available in Android smartphones serving as controllers of smart home systems, and present its workflow in protecting against various attacks caused by the open-port and overprivilege problems. Finally, we test our HomeShield implementation on a real-world smart home system and consider four threat models that cover basically all practical attacks, including Mirai and its variants. We also consider the effectiveness of our HomeShield implementation on the SmartApps of the Samsung SmartThings platform, which also suffers from the open-port and overprivilege problems, even though its overprivilege issue has been extensively studied in recently proposed works such as ContexIoT and SmartAuth. The evaluation results indicate that our HomeShield implementation can successfully defend against over 90% of attack trials with an average latency of less than 1 s.
Employing Artificial Intelligence techniques to address challenges in requirements elicitation is gaining traction. Although nine systematic literature reviews have been published on AI-based solutions in the requirements elicitation domain, to our knowledge, these studies do not cover a broad spectrum of elicitation tasks, the data sources used for training, or the performance of these algorithms, nor do they pinpoint the strengths and limitations of the algorithms used. This study contributes to the field by presenting a systematic literature review that explores the use of machine learning and NLP techniques in the elicitation phase of requirements engineering. The following research questions are addressed: 1) What elicitation tasks are supported by AI and what AI algorithms were employed? 2) What data sources have been used to construct AI-based solutions? 3) What performance outcomes were achieved? 4) What are the strengths and limitations of the current AI methods? Initially, 665 papers were retrieved from six data sources, and ultimately, 122 articles were selected for the review. This literature review identifies fifteen elicitation tasks currently supported by artificial intelligence and presents twelve publicly available data sources used for training these approaches. Furthermore, the study uncovers common limitations in current studies and suggests potential research directions. Overall, this systematic literature review provides insights into future research prospects for applying AI techniques to problems in the requirements elicitation domain.