Biomedical Named Entity (NE) recognition is a core technique for various tasks in the biomedical domain. In previous studies, machine learning algorithms have shown better performance than dictionary-based and rule ba...
ISBN (digital): 9781665490627
ISBN (print): 9781665490627
Federated learning (FL) allows training machine learning models in privacy-constrained scenarios by enabling the cooperation of edge devices without requiring local data sharing. This approach raises several challenges due to the differing statistical distributions of the local datasets and the clients' computational heterogeneity. In particular, the presence of highly non-i.i.d. data severely impairs both the performance of the trained neural network and its convergence rate, increasing the number of communication rounds required to reach a performance comparable to that of the centralized scenario. As a solution, we propose FedSeq, a novel framework leveraging the sequential training of subgroups of heterogeneous clients, i.e., superclients, to emulate the centralized paradigm in a privacy-compliant way. Given a fixed budget of communication rounds, we show that FedSeq outperforms or matches several state-of-the-art federated algorithms in terms of final performance and speed of convergence. Finally, our method can be easily integrated with other approaches available in the literature. Empirical results show that combining existing algorithms with FedSeq further improves their final performance and convergence speed. We test our method on CIFAR-10 and CIFAR-100 and demonstrate its effectiveness in both i.i.d. and non-i.i.d. scenarios.
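The idea of training sequentially inside a superclient and then aggregating across superclients can be illustrated with a toy sketch. Everything below is an assumption made for illustration, not the FedSeq implementation: the one-parameter model y = w * x, the hand-picked grouping of clients, the local SGD routine, and the size-weighted averaging step (FedAvg-style) are all hypothetical choices.

```python
import random

def local_sgd(w, data, lr=0.05, epochs=5):
    """Run a few SGD passes over one client's (x, y) pairs for the model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2.0 * (w * x - y) * x  # derivative of the squared error w.r.t. w
            w -= lr * grad
    return w

def train_superclient(w, clients):
    """Pass the model through the clients of one superclient, one after another."""
    for data in clients:
        w = local_sgd(w, data)
    return w

def fedseq_round(w_global, superclients):
    """One communication round: sequential training inside each superclient,
    then a size-weighted average of the resulting models (assumed aggregation)."""
    models, sizes = [], []
    for clients in superclients:
        models.append(train_superclient(w_global, clients))
        sizes.append(sum(len(data) for data in clients))
    total = sum(sizes)
    return sum(m * n for m, n in zip(models, sizes)) / total

def make_client(rng, lo, hi, n=20, true_w=3.0):
    """Hypothetical client whose inputs come from its own range (non-i.i.d. across clients)."""
    return [(x, true_w * x) for x in (rng.uniform(lo, hi) for _ in range(n))]

rng = random.Random(0)
clients = [make_client(rng, lo, lo + 0.5) for lo in (0.0, 0.5, 1.0, 1.5)]
# Hand-picked grouping: each superclient mixes clients with dissimilar input ranges.
superclients = [[clients[0], clients[3]], [clients[1], clients[2]]]

w = 0.0
for _ in range(10):  # fixed budget of communication rounds
    w = fedseq_round(w, superclients)
print(f"learned w = {w:.6f}")
```

In this sketch only model parameters move between clients and the server; the raw (x, y) pairs stay with the client that owns them.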
ISBN (print): 9781467330763
Data scientists in software engineering seek insight into data collected from software projects to improve software development. The demand for data scientists with domain knowledge in software development is growing rapidly, and there is already a shortage of such data scientists. Data science is a skilled art with a steep learning curve. To shorten that learning curve, this workshop will collect best practices in the form of data analysis patterns, that is, analyses of data that lead to meaningful conclusions and can be reused for comparable data. In the workshop we compiled a catalog of such patterns that will help experienced data scientists to better communicate about data analysis. The workshop was targeted at experienced data scientists, researchers, and anyone interested in how to analyze data correctly and efficiently in a community-accepted way.
The proceedings contain 7 papers. The topics discussed include: how to teach a computer to learn about microbes: KG-COVID-19 and microbial graph learning; explaining multivariate time series forecasts: an application to predicting the Swedish GDP; towards participatory design spaces for explainable AI interfaces in expert domains; teaching AI to explain its decisions can affect class balance; foundations for solving classification problems with quantitative abstract argumentation; sequential exceptional pattern discovery using pattern-growth: an extensible framework for interpretable machine learning on sequential data; and a comparative study of explainer modules applied to automated skin lesion classification.
The proceedings contain 9 papers. The topics discussed include: why do sports officials dropout?; strategic patterns discovery in RTS-games for e-sport with sequential pattern mining; maps for reasoning in ultimate; predicting the NFL using Twitter; use of performance metrics to forecast success in the national hockey league; finding similar movements in positional data streams; comparison of machine learning methods for predicting the recovery time of professional football players after an undiagnosed injury; predicting NCAAB match outcomes using ML techniques – some results and lessons learned; and key point selection and clustering of swimmer coordination through sparse Fisher-EM.
The proceedings contain 12 papers. The topics discussed include: validation of mixed-structured data using pattern mining and information extraction; validation of knowledge-based systems through CommonKADS; composing tactical agents through contextual storyboards; an adaptable e-learning system for pupils with specific learning difficulties; decision-maker-aware design of descriptive data mining; validation of a data mining method for optimal university curricula; model-based software development - perspectives and challenges; a test case generation technique and process; from natural language requirements to a conceptual model; test case reduction methods by using CBR; and evolution support for model-based development and testing - summary.
ISBN (print): 9783540689706
In recent years, high-throughput genome sequencing and sequence analysis technologies have created the need for automated annotation and analysis of large sets of genes. The Gene Ontology (GO) provides a common controlled vocabulary for describing gene function; however, annotating proteins with GO terms is usually a tedious manual curation process carried out by trained professional annotators. With the wealth of genomic data now available, there is a need for accurate automated annotation methods. In this paper, we propose a method for automatically predicting GO terms for proteins by applying statistical pattern recognition techniques. We employ protein functional domains as features and learn an independent Support Vector Machine classifier for each GO term. This approach creates sparse data sets with highly imbalanced class distributions. We show that these problems can be overcome with standard feature and instance selection methods. We also present a meta-learning scheme that utilizes multiple SVMs trained for each GO term, resulting in better overall performance than any single SVM can achieve alone. The implementation of the tool is available at http://***/AAPFC.
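The "one independent classifier per GO term over domain features" setup can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' tool: the classifier is a margin perceptron (an unregularised hinge-loss learner used here as a simplified stand-in for an SVM), and the domain names, protein names, and term labels are all hypothetical placeholders. The feature and instance selection steps and the meta-learning scheme from the abstract are not shown.

```python
def train_margin_perceptron(X, y, lr=0.1, epochs=200):
    """Binary linear classifier: update whenever a sample's margin is below 1.
    Labels in y are +1 or -1. A simplified stand-in for a linear SVM."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1.0:
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict_terms(models, x):
    """Return every term whose own classifier scores the feature vector positively."""
    return {
        term
        for term, (w, b) in models.items()
        if sum(wj * xj for wj, xj in zip(w, x)) + b > 0
    }

# Hypothetical toy data: each protein is a presence/absence vector over three domains.
DOMAINS = ["domain_a", "domain_b", "domain_c"]
proteins = {
    "p1": [1, 0, 0],
    "p2": [1, 1, 0],
    "p3": [0, 1, 0],
    "p4": [0, 0, 1],
    "p5": [1, 0, 1],
    "p6": [0, 1, 1],
}
# Hypothetical annotations: a protein may carry several terms or none.
annotations = {
    "p1": {"term_A"},
    "p2": {"term_A", "term_B"},
    "p3": {"term_B"},
    "p4": set(),
    "p5": {"term_A"},
    "p6": {"term_B"},
}

names = list(proteins)
X = [proteins[name] for name in names]
terms = sorted({t for ts in annotations.values() for t in ts})

# One independent binary problem per term: positive if the protein carries the term.
models = {}
for term in terms:
    y = [1 if term in annotations[name] else -1 for name in names]
    models[term] = train_margin_perceptron(X, y)

for name in names:
    print(name, sorted(predict_terms(models, proteins[name])))
```

Because each term gets its own binary problem, most proteins are negatives for any given term, which is where the class imbalance mentioned in the abstract comes from.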
This article explores the usefulness of the depth images provided by the current Microsoft Kinect sensors in different face analysis tasks, including identity, gender and ethnicity. Four local feature extraction methods (LBP, LPQ, HOG and BSIF) are investigated for both face texture and shape description. Extensive experiments on three publicly available Kinect face databases are reported. The experimental analysis yields interesting findings. Furthermore, a comprehensive review of the literature on the use of Kinect depth data in face analysis is provided, along with a description of the available databases. (C) 2015 Elsevier B.V. All rights reserved.
ISBN (print): 9781450323697
The proceedings contain 8 papers. The topics discussed include: a hybrid grid-based method for mining arbitrary regions-of-interest from trajectories; ensemble feature ranking for shellfish farm closure cause identification; clustering household electricity use profiles; predicting petroleum reservoir properties from downhole sensor data using an ensemble model of neural networks; light-weight online predictive data aggregation for wireless sensor networks; and performance analysis of duty-cycling wireless sensor networks for train localization.
ISBN (print): 3540225552
Feature selection plays a central role in data analysis and is also a crucial step in machine learning, data mining and pattern recognition. Feature selection algorithms focus mainly on the design of a criterion function and the selection of a search strategy. In this paper, a novel feature selection approach (NFSA) based on a quantum genetic algorithm (QGA) and a good evaluation criterion is proposed to select the optimal feature subset from a large number of features extracted from radar emitter signals (RESs). The criterion function is given first. Then, the QGA is described in detail and its performance is analyzed. Finally, the best feature subset is selected from the original feature set (OFS) composed of 16 features of RESs. Experimental results show that the proposed approach greatly reduces the dimensionality of the OFS and raises the accurate recognition rate of RESs, which indicates that NFSA is feasible and effective.
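The split between a criterion function and a search strategy can be sketched with a small quantum-inspired genetic search. This is an illustrative sketch only, not the paper's NFSA: the 16 relevance scores are made-up toy numbers (not radar emitter data), the criterion is a hypothetical "relevance minus size penalty" score, and the rotation rule is a simplified version that nudges each qubit angle towards the best subset seen so far.

```python
import math
import random

# Toy per-feature relevance scores for a 16-feature set (invented for illustration).
RELEVANCE = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02, 0.6, 0.15,
             0.5, 0.2, 0.4, 0.1, 0.35, 0.05, 0.25, 0.01]
PENALTY = 0.3   # hypothetical cost charged for every selected feature
EPS = 0.05      # keeps angles away from 0 and pi/2 so exploration never stops

def criterion(bits):
    """Hypothetical criterion: total relevance of the subset minus a size penalty."""
    return sum(r * b for r, b in zip(RELEVANCE, bits)) - PENALTY * sum(bits)

def observe(thetas, rng):
    """Collapse each qubit: a feature is selected with probability sin(theta)^2."""
    return [1 if rng.random() < math.sin(t) ** 2 else 0 for t in thetas]

def rotate(thetas, bits, best_bits, delta):
    """Simplified rotation gate: move each angle towards the best-known bit."""
    rotated = []
    for t, b, best in zip(thetas, bits, best_bits):
        if b != best:
            t += delta if best == 1 else -delta
        rotated.append(min(max(t, EPS), math.pi / 2 - EPS))
    return rotated

def qga_select(fitness, n_features, pop_size=10, generations=60,
               delta=0.05 * math.pi, seed=0):
    """Search for a high-scoring feature subset; returns (bits, fitness value)."""
    rng = random.Random(seed)
    # Every qubit starts at pi/4, i.e. an equal chance of selecting each feature.
    population = [[math.pi / 4] * n_features for _ in range(pop_size)]
    best_bits, best_fit = None, float("-inf")
    for _ in range(generations):
        observed = [observe(thetas, rng) for thetas in population]
        for bits in observed:
            value = fitness(bits)
            if value > best_fit:
                best_bits, best_fit = bits, value
        population = [rotate(thetas, bits, best_bits, delta)
                      for thetas, bits in zip(population, observed)]
    return best_bits, best_fit

best_bits, best_fit = qga_select(criterion, len(RELEVANCE))
selected = [i for i, b in enumerate(best_bits) if b]
print("selected feature indices:", selected)
print("criterion value:", round(best_fit, 3))
```

The search is stochastic, so the subset it returns depends on the seed and is not guaranteed to be the global optimum of the criterion.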