ISBN: 9781665490627 (digital)
ISBN: 9781665490627 (print)
With the advances in machine learning, lie detection technology has gained significant attention. In recent years, several multi-modal techniques have reported accuracy as high as 99% on the Real-life Trial dataset, which has only 121 data points. This led to considerable media hype and research interest in lie detection with machine learning. In this paper, we analyze the effect of dataset bias in deception detection. More specifically, we train a classifier to predict the sex of the identity appearing in the video. On a test data point, we use the sex predictor to predict sex, which we then use as a proxy for predicting deception: lie for females and truth for males. This lie predictor simulates a classifier that uses nothing but dataset bias. Nevertheless, we find that the performance of this biased classifier is comparable to that of state-of-the-art papers. More specifically, when using IDT features, our biased classifier achieves 64.6% and 59.3% AUC while a classifier trained normally on truth/lie labels achieves 57.4% accuracy and 69.3% AUC. We perform similar experiments on the Bag-of-Lies dataset and show that it too is biased with respect to sex. In addition, we apply the state-of-the-art techniques to an unbiased dataset and show that their performance is no better than chance. Our experiments strongly suggest that the results of recent deception detection techniques can be explained by the bias inherent in the datasets.
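The proxy predictor described in this abstract can be illustrated with a minimal sketch. The data below is a hypothetical toy sample, not the Real-life Trial or Bag-of-Lies data, and the function names are illustrative assumptions rather than the authors' code. The sketch only shows how a predicted sex is mapped to a lie score and how accuracy and AUC are then computed against truth/lie labels.

```python
def proxy_lie_scores(predicted_sex):
    """Map predicted sex to a lie score: 1.0 (lie) for female, 0.0 (truth) for male."""
    return [1.0 if s == "F" else 0.0 for s in predicted_sex]


def accuracy(scores, labels, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(scores, labels):
    """Probability that a random lie outranks a random truth; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy sample: output of some sex classifier and the true lie labels.
predicted_sex = ["F", "F", "F", "M", "M", "M", "F", "M"]
is_lie = [1, 1, 0, 0, 0, 1, 1, 0]

scores = proxy_lie_scores(predicted_sex)
print(accuracy(scores, is_lie), auc(scores, is_lie))
```

Because the toy labels are deliberately correlated with sex, the proxy scores well above chance without ever looking at deception cues, which is the kind of effect the abstract attributes to dataset bias.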
In machine learning, scale adds complexity. The most obvious consequence of scale is that data takes longer to process. At certain points, however, scale makes trivial operations costly, thus forcing us to re-evaluate...
Detecting neologisms is essential in real-time natural language processing applications. Not only does it make it possible to follow the lexical evolution of languages, but it is also essential for updating linguistic resources ...
ISBN: 9798400707032 (print)
A method for detecting defects in forming mesh based on the YOLOv8n model is proposed. First, an illumination system is designed, and images containing defects are collected to construct an image dataset. The dataset is then annotated and augmented to build the corresponding image sample set. The YOLOv8n model is trained on this sample set to obtain a defect recognition and localization model. The original image to be recognized is input into this model, which outputs the corresponding defect recognition and localization results. Experimental results show that the method localizes defects accurately and runs fast.
ISBN: 9783319089799; 9783319089782 (print)
This paper presents a set of methods for analyzing user activity and preparing data for a music recommender, using the "Odnoklassniki" social network as an example. The history of user actions is analyzed along multiple dimensions in order to find collaborative and temporal correlations and to build overall rankings. The results of the analysis are exported in the form of a taste graph, which is then used to generate online music recommendations. The taste graph captures relations between different music-related entities (users, tracks, artists, etc.) and consists of the following main parts: user preferences, track similarities, artist similarities, artists' works, and demographic profiles.
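As a rough illustration of what such a taste graph could look like in code, the sketch below builds three of the listed parts (user preferences, track similarities, artists' works) from a hypothetical listening log. The record layout, the function name `build_taste_graph`, and the use of shared-listener counts as the similarity measure are all assumptions made for this example; the abstract does not specify how the similarities are computed, and artist similarities and demographic profiles are omitted here.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_taste_graph(listens):
    """Build a small taste graph from (user, track, artist) listening records."""
    user_preferences = defaultdict(Counter)  # user -> track -> play count
    artist_works = defaultdict(set)          # artist -> set of tracks
    for user, track, artist in listens:
        user_preferences[user][track] += 1
        artist_works[artist].add(track)

    # Assumed similarity: number of users who listened to both tracks.
    track_similarity = Counter()
    for tracks in user_preferences.values():
        for a, b in combinations(sorted(tracks), 2):
            track_similarity[(a, b)] += 1

    return {
        "user_preferences": user_preferences,
        "track_similarity": track_similarity,
        "artist_works": artist_works,
    }


# Hypothetical listening log, for illustration only.
listens = [
    ("u1", "t1", "a1"), ("u1", "t2", "a1"), ("u1", "t1", "a1"),
    ("u2", "t1", "a1"), ("u2", "t2", "a1"), ("u2", "t3", "a2"),
]
graph = build_taste_graph(listens)
print(graph["track_similarity"].most_common(1))
```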
ISBN: 9781450302265 (print)
The proceedings contain 8 papers. The topics discussed include: fully decentralized computation of aggregates over data streams; detecting outliers on arbitrary data streams using anytime approaches; CALDS: context-aware learning from data streams; evolutionary clustering using frequent itemsets; towards subspace clustering on dynamic data: an incremental version of PreDeCon; visual analysis of news streams with article threads; conformal prediction for distribution-independent anomaly detection in streaming vessel data; and research issues in mining multiple data.
ISBN: 9798400707032 (print)
At present, the world is experiencing the second information wave, with data as its core and the Internet as its means, and society is shifting from the IT (information) era to the DT (data) era. Substation location planning is mainly carried out on a software simulation platform using existing satellite vector data, while the choice of the optimal scheme and path often relies on the experience of designers during actual design and erection. However, the continuous expansion of project scale and increasing complexity pose great challenges to the design work. Therefore, substation location planning needs to be combined with current data fusion and artificial intelligence technology to effectively raise its level of intelligence.
ISBN: 9781467362719 (print)
Software quality attributes can be identified based on software features such as security, reliability and user-friendliness. This process can be done either manually or automatically. Sentiment analysis refers to the task of extracting sentiment from resources such as natural language texts. We study the application of sentiment analysis to extracting the quality attributes of a software product from the opinions that end-users state in microblogs such as Twitter. Our findings identify useful techniques, such as the document frequency of words across a large number of tweets. The extracted results can help software developers understand the advantages and disadvantages of their products.
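Document frequency, mentioned above, counts the number of documents (here, tweets) in which a word appears at least once. The following is a minimal sketch under simple assumptions: the tweets are hypothetical, tokenization is a plain lowercase regular expression, and the function name is illustrative rather than taken from the paper.

```python
import re
from collections import Counter


def document_frequency(tweets):
    """Count, for each word, the number of tweets that contain it at least once."""
    df = Counter()
    for tweet in tweets:
        # A set ensures repeated words within one tweet are counted once.
        words = set(re.findall(r"[a-z']+", tweet.lower()))
        df.update(words)
    return df


# Hypothetical tweets about a software product, for illustration only.
tweets = [
    "The app is fast and secure",
    "Login is slow, slow, slow",
    "Secure but slow to start",
]
df = document_frequency(tweets)
print(df["slow"], df["secure"], df["fast"])
```

Words tied to a quality attribute (for example "secure" or "slow") that occur in many tweets are then candidates for further sentiment analysis.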
ISBN: 9781450392365 (print)
The field of graph data mining, one of the most important AI research areas, has been revolutionized by graph neural networks (GNNs), which benefit from training on real-world graph data with millions to billions of nodes and links. Unfortunately, handling the training data and running the training process for GNNs on graphs beyond millions of nodes is extremely costly on a centralized server, if not impossible. Moreover, due to increasing concerns about data privacy, emerging data from realistic applications are naturally fragmented, forming distributed private graphs across multiple "data silos", among which direct transfer of data is forbidden. The nascent field of federated learning (FL), which aims to enable individual clients to jointly train their models while keeping their local data decentralized and completely private, is a promising paradigm for large-scale distributed and private training of GNNs. FedGraph2022 aims to bring together researchers from different backgrounds with a common interest in how to extend current FL algorithms to operate with graph data models such as GNNs. FL is an extremely hot topic of large commercial interest and has been intensively explored for machine learning with visual and textual data. Graph mining researchers and industrial practitioners have only recently begun to catch up. There are many unexplored challenges and opportunities, which calls for the establishment of an organized and open community to collaboratively advance the science behind it. The prospective participants of this workshop include researchers and practitioners from both the graph mining and federated learning communities, whose interests include, but are not limited to: graph analysis and mining, heterogeneous network modeling, complex data mining, large-scale machine learning, distributed systems, optimization, meta-learning, reinforcement learning, privacy, robustness, explainability, fairness, ethics, and trustworthiness.
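The idea of clients training jointly without sharing raw data can be sketched with the server-side aggregation step of federated averaging (FedAvg), a widely used FL baseline. The abstract does not name a specific algorithm, so FedAvg is chosen here only as an illustration; the function name, the flat weight vectors, and the client sizes are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client model weights, weighted by each client's number of samples.

    Only model weights reach the server; the clients' local data never leaves them.
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Hypothetical example: two clients holding flat weight vectors of a tiny model.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]  # number of local training samples per client
global_weights = federated_average(client_weights, client_sizes)
print(global_weights)
```

In a GNN setting each client would hold its own private subgraph and locally trained GNN parameters; handling links that cross clients is one of the open challenges this kind of simple averaging does not address.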
Cheiloscopy is a forensic investigation technique that deals with the identification of humans based on lip traces. Lip traces hold multifarious features and can be analyzed in different ways to identify the links with...