The Machine Learning and Data Science Center (MLC) was established in April 2013 as a research and development hub for big data analysis technologies at NTT laboratories, with the aim of creating innovative services from a wide variety of big data. MLC uses machine learning and data mining technologies cultivated by NTT laboratories, together with a parallel-distributed processing platform (Jubatus) for high-efficiency, real-time data analysis, to develop diverse big data analysis technologies and support big data services. This article introduces these big data activities at MLC.
Video anomaly detection (VAD) is a demanding task because the very definition of anomalies in videos is inherently inconclusive and also due to the high manpower required to supervise lengthy videos. This research pap...
Urban land cover classification aims to derive crucial information from earth observation data and categorize it into specific land uses. To achieve accurate classification, sophisticated machine learning models train...
ISBN: 9781643685434 (print)
When a Knowledge Discovery from Data (KDD) process (Fayyad, Piatetsky-Shapiro, & Smyth, 1996) is applied to extract knowledge, several methods can be used (Gibert, et al., 2018). A simple and fast way to obtain preliminary insights from data before using KDD models is to generate a basic descriptive analysis. It is one of the most popular ways to describe experimental data and should be the starting point of every data project. Nevertheless, some of the main knowledge that a descriptive analysis could yield remains hidden because of underlying multivariate structures, which can be elicited through multivariate analysis techniques. Moreover, the domain expert is key to a proper interpretation of descriptive results. At the same time, there is a lack of automatic reporting techniques that can report and help interpret complex patterns and the results of advanced multivariate techniques. This paper presents a tool developed to generate automatic interpretations of Multiple Correspondence Analysis (MCA) and Principal Component Analysis (PCA) using RMarkdown. The tool generates a Word document containing the automatic interpretation of the results, built with regular expressions that elaborate on the R analytical outputs (either numerical or graphical). The proposal is being applied to real data, such as the INSESS database on social vulnerabilities of the Catalan population. In conclusion, the tool facilitates the interpretation of factorial method results, avoiding misinterpretation and the involuntary skipping of conclusions caused by the large amount of knowledge that can be extracted from a complete factorial analysis. The software also enables non-expert users to read multivariate analysis results in a friendly way. Moreover, the tool saves time in the interpretation step and gives the expert a basis for starting the report; the output of the software could even become the report or
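The idea of turning numerical PCA output into readable sentences can be illustrated with a minimal Python sketch. This is not the authors' RMarkdown tool: the function names (`interpret_pca`, `make_demo_data`), the 0.5 threshold on the axis coefficients, and the sentence template are all assumptions chosen for illustration, and the data are synthetic.

```python
import numpy as np


def interpret_pca(data, names, n_components=2, coef_threshold=0.5):
    """Return (explained_share, dominant_variables, sentence) per leading component."""
    X = np.asarray(data, dtype=float)
    # Standardize columns so the PCA reflects the correlation structure.
    # Assumes no column is constant (its standard deviation would be zero).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Rows of Vt are the principal axes; squared singular values are
    # proportional to the variance captured by each component.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)

    report = []
    for k in range(n_components):
        axis = Vt[k]
        # Rank variables by the absolute size of their axis coefficient.
        order = np.argsort(-np.abs(axis))
        dominant = [names[j] for j in order if abs(axis[j]) >= coef_threshold]
        drivers = ", ".join(dominant) if dominant else "no single variable"
        sentence = (
            f"Component {k + 1} explains {explained[k]:.1%} of the variance "
            f"and is driven mainly by: {drivers}."
        )
        report.append((float(explained[k]), dominant, sentence))
    return report


def make_demo_data(n=200, seed=0):
    """Synthetic data (hypothetical): x and y are nearly collinear, z is independent."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 2.0 * x + 0.05 * rng.normal(size=n)
    z = rng.normal(size=n)
    return np.column_stack([x, y, z])


if __name__ == "__main__":
    for _, _, sentence in interpret_pca(make_demo_data(), ["x", "y", "z"]):
        print(sentence)
```

A real reporting tool would cover many more outputs (plots, contributions, MCA categories); the sketch only shows the step from numbers to a templated sentence.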
The primary aim of identifying the binding motifs in gene regulation is to systematically understand the molecular mechanism of transcriptional regulation. In this study, the (ℓ, d) motif search problem was considered ...
Community detection is an essential tool for unsupervised data exploration and revealing the organisational structure of networked systems. With a long history in network science, community detection typically relies ...
This paper introduces SEAN, a novel anomaly detection algorithm designed for real-time applications in predictive maintenance. SEAN leverages an ensemble-based approach to deliver competitive performance while drastic...
Large Language Models (LLMs) show remarkable performance on a wide variety of tasks. Most LLMs split text into multi-character tokens and process them as atomic units without direct access to individual characters. Th...
In this paper, we propose 1-bit weighted ΣΔ quantization schemes of mixed order as a technique for digital halftoning. These schemes combine weighted ΣΔ schemes of different orders for two-dimensional signals so one c...
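For orientation, the simplest relative of such schemes, a first-order 1-bit sigma-delta (ΣΔ) quantizer for a one-dimensional signal, can be sketched in Python as follows. This is the standard textbook recursion, not the weighted mixed-order two-dimensional scheme proposed in the paper; the function name `sigma_delta_1bit` and the assumption that samples lie in [-1, 1] are illustrative.

```python
def sigma_delta_1bit(samples):
    """First-order 1-bit sigma-delta quantization of samples assumed to lie in [-1, 1]."""
    state = 0.0  # accumulated quantization error carried to the next sample
    bits = []
    for x in samples:
        # Quantize the input plus the carried error to +1 or -1.
        q = 1 if state + x >= 0 else -1
        # The new state is whatever the 1-bit output failed to represent.
        state = state + x - q
        bits.append(q)
    return bits


if __name__ == "__main__":
    gray = [0.3] * 100           # constant "gray level" input
    bits = sigma_delta_1bit(gray)
    print(sum(bits) / len(bits))  # running average of the bits tracks the input level
```

Because the carried error stays bounded by 1 in magnitude for inputs in [-1, 1], the average of the output bits over N samples differs from the average of the input by at most 1/N, which is the property halftoning exploits: a pattern of black and white dots whose local density matches the gray level.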
Hierarchical clustering has usually been addressed by discrete optimization using heuristics or continuous optimization of relaxed scores for hierarchies. In this work, we propose to optimize expected scores under a p...