Estimating conversion rate (CVR) accurately has been one of the most central problems in online advertising. Existing methods in production focus on learning effective interactions among features to boost the model pe...
Object detection in complex environments presents significant challenges due to the variability in object scales, occlusions, and cluttered backgrounds. This paper proposes a Multi-Scale Learning Assisted Neural Netwo...
ISBN (digital): 9798350330991
ISBN (print): 9798350331004
We introduce a framework to design in-memory decision tree machine-learning (ML) circuits using memristor crossbars. Decision trees (DTs) offer many advantages over neural networks, such as enhanced energy efficiency, interpretability, safety, privacy, and speed, along with reduced dependence on extensive training data. We propose an adaptive multivariate decision tree (AMDT) training algorithm, which constructs decision trees that incorporate both univariate and multivariate features, enabling crossbar designs that are more accurate and more energy-efficient than the state of the art (SOTA). Our circuits are realized using pure memristor crossbars, requiring just one memristor per cell and no transistors, while employing sneak paths for flow-based in-memory computations. Compared with the SOTA, our approach produces designs that are, on average, 4% more accurate and require 12.6% less energy.
ISBN (digital): 9798350391213
ISBN (print): 9798350391220
This study analyzes air pollution in Asian cities using the Global Air Pollution Data, consisting of 6,196 entries from 31 countries. Our primary goal is to identify pollution patterns through multivariate analysis and evaluate the effectiveness of six clustering algorithms: K-Means, Hierarchical Clustering, DBSCAN, Gaussian Mixture Models (GMM), Agglomerative Clustering, and Spectral Clustering. Performance was assessed using Silhouette Score, Davies-Bouldin Index, Calinski-Harabasz Index, WCSS, Cohesion, and Separation. The novelty of this work lies in the comparative analysis of these clustering methods on air pollution data, providing new insights into pollution dynamics across Asian cities. The analysis identified four distinct clusters ('High Pollution', 'Moderate Pollution', 'Ozone-Dominated Pollution', and 'Low Pollution'), with K-Means proving to be the most effective. Significant disparities were found, particularly in South and East Asia, where countries like India, China, and Pakistan exhibited the highest pollution levels. Additionally, an examination of capital cities revealed specific pollution patterns and the primary pollutants (PM2.5, NO2, CO, and Ozone), aiding in identifying sources and affected regions. These findings underscore the need for targeted regional pollution control strategies.
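Two of the evaluation criteria named in this abstract can be illustrated with a short sketch. The abstract does not define its metrics, so the definitions below are assumptions: WCSS is taken as the sum of squared distances from each point to its cluster centroid, and separation as the between-cluster sum of squares (cluster size times squared distance from the cluster centroid to the overall centroid). The function names and toy points are hypothetical and are not drawn from the study's data.

```python
from collections import defaultdict


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def wcss_and_bss(points, labels):
    """Return (WCSS, BSS) for a labelled clustering.

    Assumed definitions (not taken from the paper):
      WCSS = sum of squared distances of points to their cluster centroid
      BSS  = sum over clusters of size * squared distance between the
             cluster centroid and the overall centroid
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    overall = centroid(points)
    wcss = 0.0
    bss = 0.0
    for members in clusters.values():
        c = centroid(members)
        wcss += sum(sq_dist(p, c) for p in members)
        bss += len(members) * sq_dist(c, overall)
    return wcss, bss


# Hypothetical toy data: two tight groups far apart along the first axis.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
tight = wcss_and_bss(points, [0, 0, 1, 1])  # groups nearby points together
loose = wcss_and_bss(points, [0, 1, 0, 1])  # groups distant points together
print(tight, loose)
```

Under these definitions a better partition has lower WCSS and higher BSS, and the two always sum to the same total for a fixed dataset, which is why the comparison between the two labelings is meaningful.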
Inverse design, where we seek to design input variables in order to optimize an underlying objective function, is an important problem that arises across fields ranging from mechanical engineering to aerospace engineering....
Synthetic Aperture Radar (SAR) imagery is used extensively for military applications in the modern era of technology. In this work, we have evaluated the performance of SAR images for radar systems. The SAR length, ...
Millimeter waves (mmWaves), which provide higher bandwidth, are used by 5G network technology to achieve higher network capacity and faster data transfer. However, the process of beam sweeping across multiple antenna arrays ...
In the landscape of rapidly expanding data streams, deriving meaningful insights, particularly frequent itemsets, from massive datasets at a swift pace poses a significant challenge. Various methodologies for Frequent Itemset Mining have been proposed to address this challenge, yet they struggle with low support counts and with efficiency when handling large datasets. In response, our research introduces Jagged Itemset Counting (JIC) methodologies, which aim to mine Frequent Itemsets effectively from extensive data. The core objective is to devise a robust algorithm capable of identifying all Frequent Itemsets, irrespective of database size or the nature of the itemset. Central to this approach is a straightforward label representation, GPLN (Geometric Progression Label Number), assigned to each frequent item. The CGPLN (Cumulative Geometric Progression Label Number), derived from the arithmetic sum of the GPLNs within a transaction subset, forms the CGPLN-Label representation for that transaction subset (itemset). Comparative analysis shows that JIC outperforms the Apriori and Eclat algorithms on small and medium-sized datasets and remains efficient even at minimal support thresholds. In the realm of Big Data, where FP-Growth and Eclat falter, the proposed technique delivers faster execution times, optimized main memory utilization, and efficient disk usage.
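The labeling idea described in this abstract can be sketched as follows. This is an illustrative sketch only and not the JIC algorithm itself: the abstract does not state the common ratio of the progression or how counting is organized, so the sketch assumes a ratio of 2 (labels 1, 2, 4, ...), which makes the sum of labels unique for every itemset, and it counts subsets by brute-force enumeration. The function name and the toy transactions are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_with_gp_labels(transactions, min_support):
    """Count itemsets via cumulative geometric-progression labels.

    Assumption (not from the source): the progression has ratio 2, so each
    itemset maps to a unique integer equal to the sum of its item labels.
    """
    # Pass 1: find frequent single items.
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent_items = sorted(
        item for item, count in item_counts.items() if count >= min_support
    )

    # Assign each frequent item a geometric-progression label: 1, 2, 4, ...
    label = {item: 2 ** k for k, item in enumerate(frequent_items)}

    # Pass 2: for every subset of each transaction, add up the item labels
    # and count occurrences under that single cumulative integer key.
    counts = Counter()
    for t in transactions:
        kept = sorted(item for item in set(t) if item in label)
        for size in range(1, len(kept) + 1):
            for subset in combinations(kept, size):
                counts[sum(label[item] for item in subset)] += 1

    # Decode each frequent cumulative label back into its itemset.
    result = {}
    for cumulative, count in counts.items():
        if count >= min_support:
            items = frozenset(
                item for item in frequent_items if label[item] & cumulative
            )
            result[items] = count
    return result


# Hypothetical toy transactions.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
frequent = mine_with_gp_labels(transactions, min_support=2)
print(frequent)
```

Enumerating every subset is exponential in transaction length, so this only shows how a single integer label can stand in for an itemset during counting; it makes no claim about how JIC achieves its reported efficiency.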
ISBN (digital): 9798350359299
ISBN (print): 9798350359305
Yelp has always been, and will remain, a data-driven application. It is one of the first companies to let local businesses receive reviews from their customers, and communities have always developed it in cooperation with each other. According to Yelp, the dataset contains information on businesses, reviews, users, and check-ins, and has been updated continuously since 2015. The project "Analyzing Yelp Open-Source Data-set in Azure Data Bricks" analyzes this openly available dataset using sentiment analysis and descriptive analysis. The study examines local business performance, business distribution, review ratings, and other factors, as well as check-in rates at American business locations over time. The analysis shows that Yelp is losing reviews, tips, elite users, and check-ins.
Smart Healthy Schools (SHS) are a new paradigm in building engineering and infection risk control in school buildings where the disciplines of Indoor Air Quality (IAQ), IoT (Internet of Things) and Artificial Intellig...