ISBN: (print) 9798350362770; 9798350362763
As data volumes continue to surge, the resources required to accelerate data processing for model training remain a challenge, leading to increased costs and extended processing times. This paper presents a novel method that combines Adaptive Sampling with an automated, self-adaptive comprehensive test suite module to address these challenges. This approach maintains model accuracy and ensures coverage of essential data required for business use cases. Experiments conducted on diverse datasets, some as large as several hundred terabytes, demonstrate that this method can reduce processing times by up to 75% by efficiently identifying and processing only representative samples. These results demonstrate the promising potential for improving data processing efficiency in model training for Artificial Intelligence (AI) applications across diverse sectors.
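The abstract does not say how the sampling adapts, so the following is only a minimal sketch of one common form of adaptive sampling: a random sample is grown until an estimate (here, the mean) stops changing by more than a tolerance. The function name `adaptive_sample`, its parameters, and the stopping rule are assumptions made for illustration and are not taken from the paper.

```python
import random
import statistics


def adaptive_sample(data, start=100, growth=2.0, tol=0.01, seed=0):
    """Grow a random sample until its mean changes by at most tol (relative).

    Hypothetical stopping rule for illustration only; assumes non-empty data.
    """
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)  # random permutation: every prefix is a uniform sample

    size = min(start, len(data))
    prev_mean = statistics.fmean(data[i] for i in order[:size])
    while size < len(data):
        new_size = min(int(size * growth), len(data))
        mean = statistics.fmean(data[i] for i in order[:new_size])
        size = new_size
        # Stop once the estimate is stable between two consecutive rounds.
        if abs(mean - prev_mean) <= tol * max(abs(prev_mean), 1e-12):
            break
        prev_mean = mean
    return [data[i] for i in order[:size]]
```

Calling `adaptive_sample(values)` on a large list returns a subset whose size depends on how quickly the mean settles; a real pipeline would likely track several statistics, or model accuracy itself, rather than a single mean.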
Analysis and utilization of massive meter data can help decision-makers make reasonable decisions. Therefore, multi-functional meter data processing has received considerable attention in recent years. Nevertheless, it might compromise users' privacy, for example by revealing users' lifestyles and habits. In this paper, we propose an efficient and privacy-preserving massive data processing protocol for smart grids. The presented protocol utilizes Paillier homomorphic encryption and Horner's Rule to achieve a privacy-preserving two-level random permutation method, so that large-scale meter data is permuted randomly and sufficiently in a privacy-preserving way. As a result, the analysis center can simultaneously implement various data processing functions (such as variance, comparison, and linear regression analysis) without learning the source of the data. The security analysis shows that our protocol achieves data confidentiality and data source anonymity. The detailed analyses demonstrate that our protocol is efficient in terms of computational and communication costs. Furthermore, it tolerates entity failures and offers flexible system scalability.
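To make the two building blocks named above concrete, here is a toy sketch of Paillier's additive homomorphism together with Horner's Rule. The abstract does not describe how the protocol combines them, so using Horner's Rule to pack several readings into one plaintext is an assumption, as are all function names and the tiny, insecure primes.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key pair from two small primes (insecure, illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid because the generator is fixed to g = n + 1
    return n, (lam, mu)


def encrypt(n, m, rng=random):
    """Encrypt 0 <= m < n under public key n with generator g = n + 1."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    # With g = n + 1, g^m mod n^2 equals (1 + m*n) mod n^2.
    return ((1 + m * n) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n  # L(u) = (u - 1) / n, then multiply by mu


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    return (c1 * c2) % (n * n)


def horner(coeffs, x):
    """Evaluate coeffs[0] + coeffs[1]*x + coeffs[2]*x^2 + ... by Horner's Rule."""
    acc = 0
    for a in reversed(coeffs):
        acc = acc * x + a
    return acc
```

For example, two meters could each pack three readings below 100 with `horner(readings, 100)`, encrypt the result, and send it on; the product of the two ciphertexts then decrypts to the packed element-wise sums, provided no slot overflows and the packed value stays below `n`.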
The Internet of Things (IoT) has seen a surge in mobile devices as its market and technology have expanded. IoT networks provide end-to-end connectivity while keeping latency minimal. To reduce delays, efficient data delivery schemes are required for dispersed fog-IoT network orchestrations. We use a Spark-based big data processing scheme (BDPS) to accelerate the resilient distributed dataset (RDD) delay-efficient technique in the fogs for a decentralized heterogeneous network architecture, reinforcing suitable data allocation across IoT devices. We propose BDPS based on Spark-RDD in a fog-IoT overlay architecture to address performance issues across the network orchestration. We evaluate data processing delays from fog-IoT integrated parts using a depth-first-search-based shortest-path node-finding configuration, which outperforms existing shortest-path algorithms in terms of algorithmic efficiency, including the Bellman-Ford (BF) algorithm, Floyd-Warshall (FW) algorithm, Dijkstra algorithm (DA), and Apache Hadoop (AH) algorithm. The BDPS exhibits lower packet-delivery latency and lower uplink network overhead than BF, DA, FW, and AH through a map-reduce resilient data distribution mechanism. Overall, the BDPS scheme supports efficient data delivery across the fog-IoT orchestration, achieving faster node execution and effective results compared to DA, BF, FW, and AH.
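For context on the baselines, the sketch below shows Dijkstra's algorithm, one of the shortest-path methods the abstract compares against. It is not the authors' depth-first-search-based scheme, which the abstract does not specify; the adjacency-list format and the node names in the usage note are assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Shortest-path distances from source, assuming non-negative edge weights.

    graph: dict mapping node -> list of (neighbor, weight) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry: a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

A call such as `dijkstra({"iot": [("fog1", 4), ("fog2", 1)], "fog2": [("fog1", 2)]}, "iot")` returns a dictionary of link-delay totals keyed by node, which is the quantity a delay-aware data allocation scheme would minimize.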
This study explores the ferromagnetic semiconducting nature of Na2ReX6 (X = Cl and Br) double perovskites using the WIEN2k code. The lattice constant increases from 9.88 to 10.52 Å, and the computed elastic constants, Poisson's ratio, and Pugh's ratio validate their mechanical stability. In addition, the electronic band structure demonstrates a ferromagnetic semiconducting nature with large band gaps of 3.8 and 3.0 eV, and the magnetic moment is found to be 3.0 μB, which makes the compounds suitable for spintronic applications. The ZT values of 0.75 and 0.76 illustrate their potential for thermoelectric applications.
Car accidents remain a leading cause of unintentional fatalities, with many incidents stemming from driver behaviors that impact vehicle control, such as steering, braking, accelerating, and gear shifting. Activities like searching for items, using mobile devices, or listening to the radio can distract drivers visually, audibly, and physically, posing significant risks to road safety. While various methods have been developed to detect such distractions, their effectiveness often falls short in real-world applications. This paper introduces a novel approach that combines machine learning (ML) and deep learning (DL) techniques to identify both safe and risky driving behaviors. Six ML classifiers were evaluated on real-world data to distinguish between driving behaviors such as aggressive, fatigued, and normal driving, with the Random Forest classifier demonstrating superior performance. Additionally, a specialized deep-learning baseline model was developed using ResNet50 and EfficientNetB6 to classify driving-related images into distinct categories. The hybrid model integrates ML for analyzing tabular data and DL for image recognition, achieving a classification accuracy of 99.3% on the UAH-Drive dataset. Deep learning experiments further revealed that the Base Model outperformed other models, achieving accuracies of 99.32% on the UAH-Drive dataset and 99.87% on the SFD3 dataset. This research presents a robust hybrid ML-DL framework for detecting abnormal driving behaviors, addressing shortcomings of existing techniques in real-world conditions, and offering valuable insights for improving road safety and reducing accidents.
ISBN: (print) 9798350383461; 9798350383454
Large-scale HPC simulations of plasma dynamics in fusion devices require efficient parallel I/O to avoid slowing down the simulation and to enable the post-processing of critical information. Such complex simulations lacking parallel I/O capabilities may encounter performance bottlenecks, hindering their effectiveness in data-intensive computing tasks. In this work, we focus on introducing and enhancing the efficiency of parallel I/O operations in Particle-in-Cell Monte Carlo simulations. We first evaluate the scalability of BIT1, a massively parallel electrostatic PIC MC code, determining its initial write throughput capabilities and performance bottlenecks using an HPC I/O performance monitoring tool, Darshan. We design and develop an adaptor to the openPMD I/O interface that allows us to stream PIC particle and field information to I/O using the BP4 backend, aggressively optimized for I/O efficiency, including the highly efficient ADIOS2 interface. Next, we explore advanced optimization techniques such as data compression, aggregation, and Lustre file striping, achieving write throughput improvements while enhancing data storage efficiency. Finally, we analyze the enhanced high-throughput parallel I/O and storage capabilities achieved through the integration of openPMD with rapid metadata extraction in BP4 format. Our study demonstrates that the integration of openPMD and advanced I/O optimizations significantly enhances BIT1's I/O performance and storage capabilities, successfully introducing high-throughput parallel I/O and surpassing the capabilities of traditional file I/O.
ISBN: (print) 9783031509582; 9783031509599
Detecting outliers in data is essential in various fields, such as finance, healthcare, and many other domains where anomalies occur. Among well-known outlier detection algorithms, Local Outlier Factor (LOF) is widely used for identifying unusual data points. However, the computational time of LOF increases significantly when dealing with large datasets containing numerical and categorical features. We propose an innovative approach using block size optimization to speed up the outlier detection process while maintaining high accuracy. By optimizing the block size, we achieve a significant improvement in LOF's performance without compromising its effectiveness. Experimental results on diverse datasets containing mixed categorical and numerical features demonstrate the effectiveness of our method in accelerating outlier detection while retaining high detection accuracy. This advancement in outlier detection has the potential to improve decision-making processes. It enables the timely identification of anomalous events, which is significant in critical applications, including cybersecurity.
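As background for the method above, the sketch below computes plain LOF scores by brute force. It does not implement the block size optimization, which the abstract does not detail, and it handles numeric features only; the function name `lof_scores`, the tie handling (exactly `k` neighbors are kept), and the assumption that points are distinct are simplifications introduced here.

```python
import math


def lof_scores(points, k=3):
    """Brute-force Local Outlier Factor for numeric points (O(n^2) distances).

    Assumes more than k points and no heavy duplication (to avoid zero distances).
    """
    n = len(points)
    neighbors, k_dist = [], []
    for i in range(n):
        # The k nearest neighbours of point i as (distance, index) pairs.
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        neighbors.append(ranked)
        k_dist.append(ranked[-1][0])  # distance to the k-th nearest neighbour

    # Local reachability density: inverse of the mean reachability distance,
    # where reach-dist(i, j) = max(k-distance(j), d(i, j)).
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]
```

Scores near 1 indicate a point about as dense as its neighbors, while scores well above 1 flag outliers. The pairwise distance loop is the quadratic cost that makes LOF slow on large datasets, which is the bottleneck the abstract targets.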