ISBN (digital): 9798331521349
ISBN (print): 9798331521356
As the internet continues to play a significant part in our lives, it is crucial to prioritize inclusive and positive interactions on online platforms. With the rise in internet usage, platforms like Twitter have become a hub for individuals to freely express their thoughts and engage with others. However, this growing user base has also led to an increase in cyberbullying comments, which may include hate speech, harassment, or abusive language. Sentiment analysis uses natural language processing methods to determine the sentiment conveyed by a piece of text, such as a comment or a tweet. It can help identify whether a comment is positive, negative, or neutral, allowing platforms to take appropriate action. This paper focuses on machine learning models that apply sentiment analysis to classify cyberbullying comments and promote positive interactions on online platforms. By leveraging this technology, platforms like Twitter can take proactive measures to foster a safer and more inclusive online environment, ultimately enhancing the user experience for everyone involved.
ISBN (print): 9781450389280
As the amount of user data increases, the computing performance and I/O speed required for data processing and analysis keep rising. Distributed file systems have become the primary option for big data storage and query. Given the high dimensionality and sparseness of the data, this paper uses the distributed storage idea of CMD (coordinate modulo distribution) to store data in blocks. Only cheap storage devices are needed to form the distributed storage system, which alleviates the disk I/O read bottleneck for big data to a certain extent. We improve the range query function under the CMD storage method; at the same time, an optimized B+ tree index is used to solve the exact-match search problem for sparse data. Finally, to address the unbalanced distribution of data across sub-nodes, we propose a new data rebalancing method for the CMD storage method.
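The abstract does not spell out how CMD assigns blocks to nodes. The sketch below assumes the classic coordinate-modulo rule, in which a block's grid coordinates are summed and taken modulo the number of storage nodes. The names `cmd_node`, `block_of`, the block size, and the 4 x 4 grid are all hypothetical and serve only to illustrate the idea.

```python
from collections import Counter
from itertools import product


def block_of(point, block_size):
    """Return the grid coordinates of the block that contains a data point."""
    return tuple(x // block_size for x in point)


def cmd_node(block_coords, num_nodes):
    """Assumed CMD rule: node = (sum of block coordinates) mod num_nodes."""
    return sum(block_coords) % num_nodes


# Place a 4 x 4 grid of blocks on 4 storage nodes.
placement = {b: cmd_node(b, 4) for b in product(range(4), repeat=2)}

# Number of blocks held by each node.
load = Counter(placement.values())

# Nodes touched by a range query that covers one full row of blocks.
row_nodes = sorted(placement[(1, j)] for j in range(4))
```

Under this assumed rule, neighbouring blocks land on different nodes, so a range query over adjacent blocks can read from several disks in parallel. The rebalancing method mentioned in the abstract is not covered here.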
ISBN (print): 9789811955495
The proceedings contain 44 papers. The special focus in this conference is on Advances in Signal Processing and Communication Engineering. The topics include: Single-Precision Floating-Point Multiplier Design Using Quantum-Dot Cellular Automata with Power Dissipation Analysis; Compression Techniques for Low Power Hardware Accelerator Design: Case Studies; Sequence Set Design for Radar System; Design of an All Digital Phase-Locked Loop Using Cordic Algorithm; Analysis of Deep Learning Algorithms for Image Denoising; An Extensive Survey on Assessment of Multicore Processors for Embedded Systems; Image Segmentation Techniques and Optimization Algorithms for Lung Cancer Detection; Handwritten to Text Document Converter; An Efficient Energy Aware for Reliable Route Discovery Using Energy with Movement Detection Technique in MANET; Lumped Circuit Modeling at Nanoscale (Part-II: Coupling Between Two Nanospheres); Design and Implementation of Imprecise Adders for Low-Power Applications; Speech Processed Public Addressing System; An Approach Towards Data Privacy Issues in Distributed Cyber Physical System; Classification of LPI Radar Signals Using Multilayer Perceptron (MLP) Neural Networks; A Systematic Review on Screening of Diabetic Retinopathy and Maculopathy Using Artificial Intelligence; Area Efficient and High-Throughput Radix-4 1024-Point FFT Processor for DSP Applications; Air Quality Monitoring System Based on Artificial Intelligence; An Improved Technique for Identification of Forgery Image Detection Using Clustering Method; EEG Signal Analysis During Stroop Task for Checking the Effect of Sleep Deprivation; UWB Localization Procedures with Range Control Methods: A Review; Deep Learning Model for Multiclass Classification of Diabetic Retinal Fundus Images Using Gradient Descent Optimization; Multilevel Authentication to Wireless Sensor Networks Against Malicious Attacks Using Butterfly Method.
ISBN (print): 9781728190747
In computer vision deep learning (DL) tasks, most of the input image datasets are stored in the JPEG format. These JPEG datasets need to be decoded before DL tasks are performed on them. We observe two problems in the current JPEG decoding procedures for DL tasks: (1) the decoding of image entropy data in the decoder is performed sequentially, and this sequential decoding repeats with the DL iterations, which takes significant time; (2) current parallel decoding methods under-utilize the massive hardware threads on GPUs. To reduce the image decoding time, we introduce a pre-scan mechanism to avoid the repeated image scanning in DL tasks. Our pre-scan generates boundary markers for entropy data so that the decoding can be performed in parallel. To cooperate with the existing dataset storage and caching systems, we propose two modes of the pre-scan mechanism: a compatible mode and a fast mode. The compatible mode does not change the image file structure, so pre-scanned files can be stored back to disk for subsequent DL tasks. In comparison, the fast mode crafts a JPEG image into a binary format suitable for parallel decoding, which can be processed directly on the GPU. Since the GPU has thousands of hardware threads, we propose a fine-grained parallel decoding method on the pre-scanned dataset. The fine-grained parallelism utilizes the GPU effectively and achieves speedups of around 1.5x over existing GPU-assisted image decoding libraries on real-world DL tasks.
In Federated Learning (FL), two-way model exchanges are required between the server and the workers in every training round. Due to the large size of machine learning models, communication between them leads to high training delay and economic cost. At present, communication-efficient FL methods, for example top-k sparsification and quantization, take advantage of the sparseness of model gradients and the fact that gradient-based model updating can tolerate small deviations, and thereby effectively reduce the communication cost of a single training round. However, these gradient-based communication-efficient schemes cannot be applied to downlink communication. In addition, they cannot be used in conjunction with communication-frequency-suppressed methods, e.g., FedAvg, which hinders them from further improving training efficiency. In this paper, we propose FedCS, a compressive-sensing-based FL method, which can effectively compress and accurately reconstruct non-sparse model parameters (weights), both local and global, and can reduce the overall communication cost by up to $10\times$ compared to FedAvg without decreasing test accuracy. We introduce 1) a dictionary learning scheme with a quasi-validation set, which helps to project non-sparse parameters onto a sparse domain; 2) a joint reconstruction scheme, with which the server recovers global model parameters by executing the reconstruction algorithm only once per round, regardless of the number of compressed local models; 3) a compression ratio adjustment strategy, which balances the trade-off between total communication cost and model accuracy. We evaluate FedCS on three image classification tasks and compare it with FedAvg, FedPAQ and T-FedAvg (two improvements of FedAvg). Experimental results demonstrate that FedCS outperforms the comparison methods in all tasks and always maintains a test accuracy comparable to FedAvg, even when using a small quasi-validation set and on non-IID data.
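For context, the top-k sparsification that the abstract names as an existing gradient-based scheme can be sketched as follows. This is not FedCS: it only illustrates the baseline idea of sending the k largest-magnitude gradient entries as (index, value) pairs. The function names and the toy gradient are hypothetical.

```python
def top_k_sparsify(gradient, k):
    """Keep the k largest-magnitude entries; return their indices and values."""
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    kept = sorted(order[:k])  # restore index order for a stable encoding
    return kept, [gradient[i] for i in kept]


def densify(indices, values, length):
    """Rebuild a dense vector on the receiver, filling dropped entries with zero."""
    dense = [0.0] * length
    for i, v in zip(indices, values):
        dense[i] = v
    return dense


# Toy gradient: only two of the six entries are large.
grad = [0.01, -0.9, 0.05, 0.4, -0.02, 0.0]
idx, vals = top_k_sparsify(grad, 2)
restored = densify(idx, vals, len(grad))
```

The scheme pays off only when most entries are close to zero, which holds for gradients but, as the abstract notes, not for the model weights exchanged on the downlink.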
In urban areas, the geographic spread of spatial networks continually produces massive amounts of data. The analysis of these spatial networks is directly related to partitioning them into sub-networks in a balanced manner, using scalable analysis methods that aim to minimize processing time, namely parallel and distributed partitioning with Big Data and ML techniques. Our objective is to partition a large spatial network into several sub-networks in a distributed way. To this end, we conduct a comparative study of distributed partitioning algorithms, namely Clustering and Hierarchical Clustering methods, against the method we proposed in another paper [26], which decomposes the spatial network into sub-networks by taking into account the Euclidean proximity between the nodes and the different Voronoï generators. The comparative study of the three partitioning methods that we implemented (Clustering, Hierarchical Clustering, and our proposed method) allows us to conclude that our method is the most reliable and competitive of the three, both in terms of partition balance and in terms of partitioning time.
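The Euclidean-proximity idea behind a Voronoï-based decomposition can be illustrated with a minimal single-machine sketch: each network node joins the sub-network of its nearest generator. This is an assumption-laden illustration, not the method of [26]. The node names, coordinates, and generators are hypothetical, and the balancing and distribution steps are not shown.

```python
import math


def nearest_generator(point, generators):
    """Return the index of the generator closest to the point (Euclidean distance)."""
    return min(range(len(generators)), key=lambda g: math.dist(point, generators[g]))


def voronoi_partition(nodes, generators):
    """Group node names by the Voronoi cell (nearest generator) they fall into."""
    parts = {g: [] for g in range(len(generators))}
    for name, xy in nodes.items():
        parts[nearest_generator(xy, generators)].append(name)
    return parts


# Hypothetical road-network nodes and two generators.
nodes = {"a": (0, 0), "b": (1, 0), "c": (9, 9), "d": (10, 8)}
generators = [(0, 1), (10, 10)]
parts = voronoi_partition(nodes, generators)
```

Partition balance, one of the criteria in the comparison, could then be measured by comparing the sizes of the lists in `parts`.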
ISBN (print): 9789813369658
The proceedings contain 16 papers. The special focus in this conference is on Signal and Image Processing. The topics include: Deep Convolutional Neural Network-Based Diagnosis of Invasive Ductal Carcinoma; Speaker Identification in Spoken Language Mismatch Condition: An Experimental Study; Ultrasound Image Classification Using ACGAN with Small Training Dataset; Preface; Chaotic Ions Motion Optimization (CIMO) for Biological Sequences Local Alignment: COVID-19 as a Case Study; Assessment of Eyeball Movement and Head Movement Detection Based on Reading; Using Hadoop Ecosystem and Python to Explore Climate Change; A Brief Review of Intelligent Rule Extraction Techniques; The Effect of Different Feature Selection Methods for Classification of Melanoma; Intelligent Hybrid Technique to Secure Bluetooth Communications; Parallel Algorithm to Find Integer k Where a Given Well-Distributed Graph is k-Metric Dimensional; A Fog-Based Retrieval of Real-Time Data for Health Applications; Differential Evolution-Based Shot Boundary Detection Algorithm for Content-Based Video Retrieval; Qutrit-Based Genetic Algorithm for Hyperspectral Image Thresholding.
A lot of progress has been made since the first neural network models were trained for specific image restoration tasks, such as super-resolution and denoising. Recently multi-degradation models have been proposed, al...
This contribution describes a novel method for visualizing sound sources using a rotating linear array of a few digital microphones. The rotating array scans the incident sound field on a circular area. While the redu...
Mining cohesive subgraphs on bipartite graphs is an important task. The k-bitruss is one of many popular cohesive subgraph models; it is the maximal subgraph in which each edge is contained in at least k butterflies. The bitruss decomposition problem is to find all k-bitrusses for k >= 0. Dealing with large graphs is often beyond the capability of a single machine due to its limited memory and computational power, leading to a need for efficiently processing large graphs in a distributed environment. However, all current solutions are for a single machine and a centralized environment, where processors can access the graph or auxiliary indexes randomly and globally. It is difficult to directly deploy such algorithms on a shared-nothing model. In this paper, we propose distributed algorithms for bitruss decomposition. We first propose SC-HBD as the baseline, which uses the H-function to define bitruss numbers and computes them iteratively to a fixed point in parallel. We then introduce a subgraph-centric peeling method, SC-PBD, which peels edges in batches over different butterfly-complete subgraphs. We then introduce local indexes on each fragment, study the butterfly-aware edge partition problem including its hardness, and propose an effective partitioner. Finally, we present the bitruss butterfly-complete subgraph concept and the divide-and-conquer DC-BD method with optimization strategies. Extensive experiments show that the proposed methods solve graphs with 30 trillion butterflies in 2.5 hours, while existing parallel methods under the shared-memory model fail to scale to such large graphs.
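To make the definitions concrete, here is a naive, centralized sketch of butterfly support and edge peeling on a tiny bipartite graph. It is not SC-HBD, SC-PBD, or DC-BD: it recomputes all supports after every removal and is meant only to show what a bitruss number is. The function names and the example graph are hypothetical.

```python
def butterfly_support(edges):
    """For each edge (u, v), count the butterflies (2 x 2 bicliques) containing it."""
    adj_u, adj_v = {}, {}
    for u, v in edges:
        adj_u.setdefault(u, set()).add(v)
        adj_v.setdefault(v, set()).add(u)
    support = {}
    for u, v in edges:
        # Each other left vertex w adjacent to v shares v with u; every
        # additional common neighbour closes one butterfly through (u, v).
        support[(u, v)] = sum(
            len(adj_u[u] & adj_u[w]) - 1 for w in adj_v[v] if w != u
        )
    return support


def bitruss_numbers(edges):
    """Peel the minimum-support edge repeatedly; record the running maximum support."""
    remaining = set(edges)
    result, k = {}, 0
    while remaining:
        support = butterfly_support(remaining)
        edge = min(remaining, key=lambda e: (support[e], e))
        k = max(k, support[edge])
        result[edge] = k
        remaining.remove(edge)
    return result


# One butterfly on {u1, u2} x {v1, v2} plus a pendant edge (u3, v1).
graph = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v1")]
support = butterfly_support(graph)
numbers = bitruss_numbers(graph)
```

In this example the four edges of the butterfly have bitruss number 1 and the pendant edge has bitruss number 0, so the 1-bitruss is exactly the butterfly.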