This work proposes a real-time sentiment analysis pipeline for customer feedback using Yelp and addresses the problem of processing high-volume, dynamic user-generated content. The proposal integrates state-of-the-art mac...
ISBN: 9798350383225 (print)
Modern materials science research problems present a challenge to data science and analytics, as experiments generate petabyte-scale spatiotemporal datasets that span a number of modalities and formats. Creating computing infrastructure and frameworks that support the scale and diversity of materials science data while remaining accessible for materials scientists to use is a non-trivial task. We have developed the Common Research Analytics and Data Lifecycle Environment (CRADLE) to solve the challenges of materials data science through a scalable research computing framework and cyberinfrastructure that can (1) handle large-scale, heterogeneous datasets; (2) provide a flexible toolbox for building machine learning pipelines that span from ingestion to model deployment; (3) be accessible to research scientists with limited to extensive computational backgrounds; and (4) utilize a range of computer systems, from low-performance to high-performance. CRADLE is a framework that integrates distributed systems like Hadoop and high-performance computing (HPC) infrastructure to handle materials data at scale. This enables the general materials data scientist to query petabytes of data and train thousands of models in a parallel, distributed environment. We demonstrate three use cases for CRADLE to benchmark its capability to ingest and analyze spatiotemporal materials data at scale. These tasks span three data modalities: transforming 2.6 billion photovoltaic time-series power measurements, training hundreds of deep learning models on atomic force microscopy images, and ingesting 27 billion geospatial data points. CRADLE exemplifies an overarching framework that accelerates time to science, extends to other domains with similar challenges, and expands the horizon of data science and research.
ISBN: 9798350304831 (print)
The fast growth of Internet of Things (IoT) devices and communication protocols creates opportunities for lifestyle-enhancing services and, equally, openings for cyber attacks. Typically, IoT network attackers gain access to a large number of IoT nodes (e.g., things and fog nodes) by exploiting their vulnerabilities to set up attack armies, then attack other devices/nodes in the IoT network. Distributed Denial of Service (DDoS) flooding attacks are prominent attacks on IoT. DDoS concerns security professionals because it can form sophisticated, bandwidth-exhausting attacks. DDoS can cause unplanned IoT service outages, and hence requires prompt and efficient mitigation. In this paper, we propose DDoS-FOCUS, a solution to mitigate DDoS attacks on fog nodes. The solution encompasses a machine learning model deployed at fog nodes to detect DDoS attackers. A hybrid deep learning model was developed using a Convolutional Neural Network and Bidirectional LSTM (CNN-BiLSTM) to mitigate future DDoS attacks. A preliminary test of the proposed model produced an accuracy of 99.8% in detecting DDoS attacks.
ISBN: 9798350371284 (print)
The proceedings contain 42 papers. The topics discussed include: hierarchical heterogeneous cluster systems for scalable distributed deep learning; streamlining CPS validation: using interoperable UML tools for seamless model exchange; multivariate LSTM for execution time prediction in HPC for distributed deep learning training; multi-criteria optimization of distributed real-time network topologies; securing real-time systems using schedule reconfiguration; an explainable method for cost-efficient multi-view fall detection; enhanced RF-based 3D UAV outdoor geolocation: from trilateration to machine learning approaches; real-time embedded monitoring technologies in modern healthcare systems: a survey; security assessment solutions for IoT devices; and intrusion detection schemes based on synthetic minority oversampling technique and machine learning models.
ISBN: 9781665488792 (digital)
ISBN: 9781665488792 (print)
The InterPlanetary File System (IPFS) is a popular decentralized peer-to-peer network for exchanging data. While there are many use cases for IPFS, the success of these use cases depends on the network. In this paper, we provide a passive measurement study of the IPFS network, investigating peer dynamics and curiosities of the network. With the help of our measurement, we estimate the network size and confirm the results of previous active measurement studies.
ISBN: 9798350348194; 9798350348187 (print)
Broadband cryogenic traveling-wave parametric amplifiers are vital components for the readout of superconducting qubits. This is due to their large gain and quantum-limited noise performance. In this paper, we present analytic solutions for the equivalent added input fluctuations of state-of-the-art traveling-wave parametric amplifiers based on superconducting nonlinear asymmetric elements derived from a quantum mechanical model, including losses and dispersion.
ISBN: 9798350382853; 9798350382846 (print)
Delegating large-scale computations to service providers is a common practice which raises privacy concerns. This paper studies information-theoretic privacy-preserving delegation of data to a service provider, who may further delegate the computation to auxiliary worker nodes, in order to compute a polynomial over that data at a later point in time. We study techniques which are compatible with robust management of distributed computation systems, an area known as coded computing. Privacy in coded computing, however, has traditionally addressed the problem of colluding workers, and assumed that the server that administers the computation is trusted. This viewpoint of privacy does not accurately reflect real-world privacy concerns, since normally, the service provider as a whole (i.e., the administrator and the worker nodes) forms one cohesive entity which itself poses a privacy risk. This paper aims to shift the focus of privacy in coded computing to safeguarding the privacy of the user against the service provider as a whole, instead of merely against colluding workers inside the service provider. To this end, we leverage the recently defined notion of perfect subset privacy, which guarantees zero information leakage from all subsets of the data up to a certain size. Using known techniques from Reed-Muller decoding, we provide a scheme which enables polynomial computation with perfect subset privacy in straggler-free systems. Furthermore, by studying information super-sets in Reed-Muller codes, which may be of independent interest, we extend the previous scheme to tolerate straggling worker nodes inside the service provider.
Tensor factorization plays a fundamental role in multiple areas of AI research. Nevertheless, it encounters significant challenges related to privacy breaches and operational efficiency. In this study, we propose a no...
ISBN: 9798350381993; 9798350382006 (print)
With the construction of new power systems and the Energy Internet, a large amount of power data is generated and stored on edge devices. This data may contain private customer information and can be challenging to use. In addition, the lack of computing power and the untrustworthy environment in the edge layer prevent further exploitation of the data. How to effectively and securely utilize the accumulated power data, and thereby facilitate the construction of new power systems and the transformation of the power business, is now a significant challenge for the State Grid Corporation of China (SGCC). This paper proposes a new data sharing architecture based on federated learning and blockchain technology, which exchanges the data model instead of the original data to ensure data security, with a privacy protection protocol and a lightweight consensus for edge devices. Meanwhile, an incentive mechanism is established to encourage high-quality data sharing and training. In conclusion, the architecture achieves accuracy comparable to traditional methods at an acceptable privacy cost, with high privacy preservation. It can be widely used in big data scenarios in smart distribution or metering areas with medium-sized networks.
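The idea of exchanging the model instead of the original data can be illustrated with a minimal federated averaging sketch. This is not the architecture from the paper: the blockchain layer, the privacy protection protocol, the lightweight consensus, and the incentive mechanism are all omitted, and the linear model, the function names (`local_update`, `federated_average`, `run_rounds`), and the toy data are assumptions made purely for illustration.

```python
from typing import List, Tuple

Weights = List[float]               # [slope, intercept] of a toy model y = w*x + b
Dataset = List[Tuple[float, float]]  # private (x, y) pairs held by one edge device


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Train on one device's private data; only the weights leave the device."""
    w, b = weights
    for _ in range(epochs):
        # Gradients of the mean squared error over the local dataset.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine client weights, weighting each client by its sample count."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[i] * n for w, n in updates) / total for i in range(dim)]


def run_rounds(clients: List[Dataset], rounds: int = 100) -> Weights:
    """Repeat: broadcast global weights, train locally, average the results."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [(local_update(global_weights, data), len(data))
                   for data in clients]
        global_weights = federated_average(updates)
    return global_weights


if __name__ == "__main__":
    # Two hypothetical edge devices whose data follow y = 2x + 1.
    clients = [
        [(0.0, 1.0), (1.0, 3.0)],
        [(2.0, 5.0), (3.0, 7.0), (-1.0, -1.0)],
    ]
    print(run_rounds(clients))
```

In this sketch the aggregator only ever sees weight vectors and sample counts, never the raw `(x, y)` pairs, which is the property the abstract relies on for data security.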
With the rapid development of artificial intelligence technology, its application in the optimization of complex computer systems is becoming more and more extensive. Edge computing is an efficient distributed computi...