Search Results - Inner Mongolia University Library

7th ACM International Conference on Web Search and Data Mining (WSDM)

Authors: Moens, Marie-Francine; Vulic, Ivan (Katholieke Univ Leuven, Dept Comp Sci, Leuven, Belgium)

Multilingual topic models are a fairly novel group of unsupervised, language-independent and generative machine learning models. This tutorial covers all key aspects of their probabilistic framework and demonstrates h...

ISBN: (print) 9781450323512

Keywords: multilingual data mining; probabilistic topic modeling; cross-lingual text processing; comparable data; multilingual topic models

Dynamic Miss-Counting algorithms: Finding implication and similarity rules with confidence pruning

Proceedings - International Conference on Data Engineering, 2000, pp. 501-511

Authors: Fujiwara, Shinji; Ullman, Jeffrey D.; Motwani, Rajeev (Stanford Univ, United States)

Dynamic Miss-Counting (DMC) algorithms are proposed, which find all implication and similarity rules with confidence pruning but without support pruning. To handle data sets with a large number of columns, we propose dynamic pruning techniques that can be applied during data scanning. DMC counts the number of rows in which each pair of columns disagree instead of counting the number of hits. DMC deletes a candidate as soon as the number of misses exceeds the maximum number of misses allowed for that pair. We also propose several optimization techniques that reduce the required memory size significantly. We evaluated our algorithms using four data sets, i.e., web access logs, a web page-link graph, news documents, and a dictionary. These data sets have between 74,000 and 700,000 items. Experiments show that DMC can find high-confidence rules for such large data sets efficiently.
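
The miss-counting idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `dmc_implication_rules`, the set-per-row input format, and the two-pass layout are assumptions made for illustration, it covers implication rules only, and it keeps every candidate pair in memory, which the paper's memory optimizations are meant to avoid.

```python
from collections import defaultdict
from itertools import permutations


def dmc_implication_rules(rows, min_conf):
    """Return {(a, b): confidence} for rules a -> b with confidence >= min_conf.

    rows is a list of sets; each set holds the columns that are 1 in that row.
    (Hypothetical interface, chosen for this sketch.)
    """
    # Pass 1: count the rows in which each column is 1.
    ones = defaultdict(int)
    for row in rows:
        for col in row:
            ones[col] += 1

    # confidence(a -> b) = 1 - misses / ones[a], so a rule can survive only
    # while misses <= (1 - min_conf) * ones[a]. The small epsilon guards
    # against floating-point results that land just below an integer.
    max_misses = {a: int((1 - min_conf) * n + 1e-9) for a, n in ones.items()}

    # Naive candidate set: every ordered pair of distinct columns.
    misses = {pair: 0 for pair in permutations(ones, 2)}

    # Pass 2: a miss for (a, b) is a row where a is 1 and b is 0.
    # A candidate is deleted as soon as it exceeds its miss budget.
    columns = list(ones)
    for row in rows:
        for a in row:
            for b in columns:
                pair = (a, b)
                if b in row or pair not in misses:
                    continue
                misses[pair] += 1
                if misses[pair] > max_misses[a]:
                    del misses[pair]

    return {(a, b): 1 - m / ones[a] for (a, b), m in misses.items()}
```

Calling `dmc_implication_rules(rows, 0.75)` on a handful of rows returns only the ordered pairs whose miss count stayed within budget during the scan. No minimum support is applied, so a rule from a rare column can still be reported.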

Keywords: data mining

Proactive Conversational Agents

16th International Conference on Web Search and Data Mining

Authors: Liao, Lizi; Yang, Grace Hui; Shah, Chirag (Singapore Management Univ, Singapore, Singapore; Georgetown Univ, Washington, DC, USA; Univ Washington, Seattle, WA, USA)

ISBN: (print) 9781450394079

Conversational agents, commonly known as dialogue systems, have gained escalating popularity in recent years. Their widespread applications support conversational interactions with users and accomplish various tasks as personal assistants. However, one key weakness in existing conversational agents is that they only learn to passively answer user queries via training on pre-collected and manually-labeled data. Such passiveness makes the interaction modeling and system-building process relatively easier, but it largely hinders the possibility of being human-like, hence lowering the user engagement level. In this tutorial, we introduce and discuss methods to equip conversational agents with the ability to interact with end users in a more proactive way. This three-hour tutorial is divided into three parts and includes two interactive exercises. It reviews and presents recent advancements on the topic, focusing on automatically expanding ontology space, actively driving conversation by asking questions or strategically shifting topics, and retrospectively conducting response quality control.

Keywords: proactive conversational agents; task-oriented dialogue systems; conversational search; conversational AI

Workshop on Two-sided Marketplace Optimization: Search, Pricing, Matching & Growth

11th ACM International Conference on Web Search and Data Mining

Authors: Grbovic, Mihajlo; Noulas, Thanasis (Airbnb Inc, 888 Brannan St, San Francisco, CA 94103, USA)

ISBN: (print) 9781450355810

The 1st International Workshop on Two-sided Marketplace Optimization: Search, Pricing, Matching & Growth (TSMO) will be held in Los Angeles, California, USA on February 9th, 2018, co-located with the 11th ACM International Conference on Web Search and Data Mining (WSDM). The main objective of the workshop is to address the challenges of two-sided marketplace optimization in web-scale settings. The workshop brings together interdisciplinary researchers in information retrieval, recommender systems, personalization, and related areas, to share, exchange, learn, and develop preliminary results, new concepts, ideas, principles, and methodologies on applying data mining technologies to marketplace optimization. We have constructed an exciting program of papers and invited talks that will help us better understand the future of two-sided marketplaces.

Keywords: Search Ranking; Smart Pricing; User Modeling; Personalization

Considerations for Ethical Speech Recognition Datasets

16th International Conference on Web Search and Data Mining

Authors: Papakyriakopoulos, Orestis; Xiang, Alice (Sony AI, Schlieren, Switzerland; Sony AI, New York, NY, USA)

ISBN: (print) 9781450394079

Speech AI technologies are largely trained on publicly available datasets or by the massive web-crawling of speech. In both cases, data acquisition focuses on minimizing collection effort, without necessarily taking the data subjects' protection or user needs into consideration. This results in models that are not robust when used on users who deviate from the dominant demographics in the training set, discriminating against individuals having different dialects, accents, speaking styles, and disfluencies. In this talk, we use automatic speech recognition as a case study and examine the properties that ethical speech datasets should possess towards responsible AI applications. We showcase diversity issues, inclusion practices, and necessary considerations that can improve trained models, while facilitating model explainability and protecting users and data subjects. We argue for the legal & privacy protection of data subjects, targeted data sampling corresponding to user demographics & needs, appropriate metadata that ensure explainability & accountability in cases of model failure, and the sociotechnical & situated model design. We hope this talk can inspire researchers & practitioners to design and use more human-centric datasets in speech technologies and other domains, in ways that empower and respect users, while improving machine learning models' robustness and utility.

Keywords: automated speech recognition; human-centric AI; data collection; algorithmic fairness

Vote-and-Comment: Modeling the Coevolution of User Interactions in Social Voting Web Sites

16th IEEE International Conference on Data Mining (ICDM)

Authors: Costa, Alceu Ferraz; Machado Traina, Agma Juci; Traina, Caetano, Jr.; Faloutsos, Christos (Univ Sao Paulo, Sao Paulo, Brazil; Carnegie Mellon Univ, Pittsburgh, PA 15213, USA)

ISBN: (print) 9781509054732

In social voting web sites, how do the user actions - up-votes, down-votes and comments - evolve over time? Are there relationships between votes and comments? What is normal and what is suspicious? These are the questions we focus on. We analyzed over 20,000 submissions corresponding to more than 100 million user interactions from three social voting web sites: Reddit, Imgur and Digg. Our first contribution is two discoveries: (i) the number of comments grows as a power-law on the number of votes and (ii) the time between a submission creation and a user's reaction obeys a log-logistic distribution. Based on these patterns, we propose VNC (VOTE-AND-COMMENT), a parsimonious but accurate and scalable model that models the coevolution of user activities. In our experiments on real data, VNC outperformed state-of-the-art baselines on accuracy. Additionally, we illustrate VNC's usefulness for forecasting and outlier detection.

Keywords: websites

Report on the 10th International Workshop on Web Information and Data Management (WIDM)

SIGMOD Record, 2009, vol. 38, no. 2, pp. 50-52

Authors: Chan, Chee-Yong; Polyzotis, Neoklis (National University of Singapore, Singapore, Singapore; University of California-Santa Cruz, United States)

The 10th ACM International Workshop on Web Information and Data Management (WIDM 2008), which was held in Napa Valley, California, USA, in conjunction with the 17th International Conference on Information and Knowledge Management (CIKM), on October 30, 2008, focused on how web information can be extracted, stored, analyzed, and processed to provide useful knowledge to the end users for various advanced database applications. The papers presented at the workshop were grouped in the following subject areas: data mining and clustering, systems issues, Web 2.0 and social networks, and ranking and similarity search. One paper, entitled "Event Detection with Common User Interests", focused on the problem of identifying events that can be detected through the publication of online documents and the search queries posed over said documents. "Nereau: Query Expansion Using Social Bookmark" presented a new approach to enhance query expansion with personalization by exploiting tag information from social bookmarking services.

Keywords: data mining

What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank

16th IEEE International Conference on Data Mining (ICDM)

Authors: Neumann, Stefan; Gemulla, Rainer; Miettinen, Pauli (Univ Vienna, Vienna, Austria; Univ Mannheim, Data & Web Sci Grp, Mannheim, Germany; Max Planck Inst Informat, Saarland Informat Campus, Saarbrucken, Germany)

ISBN: (print) 9781509054732

When factorizing binary matrices, we often have to make a choice between using expensive combinatorial methods that retain the discrete nature of the data and using continuous methods that can be more efficient but destroy the discrete structure. Alternatively, we can first compute a continuous factorization and subsequently apply a rounding procedure to obtain a discrete representation. But what will we gain by rounding? Will this yield lower reconstruction errors? Is it easy to find a low-rank matrix that rounds to a given binary matrix? Does it matter which threshold we use for rounding? Does it matter if we allow for only non-negative factorizations? In this paper, we approach these and further questions by presenting and studying the concept of rounding rank. We show that rounding rank is related to linear classification, dimensionality reduction, and nested matrices. We also report on an extensive experimental study that compares different algorithms for finding good factorizations under the rounding rank model.
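
As a small illustration of the rounding idea described in this abstract, the Python sketch below builds a rank-1 real matrix that rounds to an upper-triangular (nested) binary matrix of full rank. The threshold of 0.5, the `>=` rounding convention, and the helper names `outer`, `round_matrix`, and `nested_example` are assumptions made for this example and are not taken from the paper.

```python
def outer(u, v):
    """Outer product u v^T as a list of rows; the result has rank at most 1."""
    return [[ui * vj for vj in v] for ui in u]


def round_matrix(a, threshold=0.5):
    """Round a real matrix to binary: 1 where the entry is >= threshold, else 0."""
    return [[1 if x >= threshold else 0 for x in row] for row in a]


def nested_example(n):
    """Rank-1 matrix whose rounding is the n x n upper-triangular 0/1 matrix.

    Entry (i, j) equals 0.5 * 2**(j - i), which is >= 0.5 exactly when j >= i.
    Powers of two are exact in floating point, so the comparison is reliable.
    """
    u = [2.0 ** -i for i in range(n)]
    v = [0.5 * 2.0 ** j for j in range(n)]
    return round_matrix(outer(u, v))
```

For example, `nested_example(4)` yields the 4 x 4 matrix with ones on and above the diagonal. That binary matrix has ordinary rank 4, while the real matrix it was rounded from has rank 1.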

Keywords: Matrix algebra

Self-organizing map based web pages clustering using web logs

16th International Conference on Software Engineering and Data Engineering, SEDE 2007

Authors: Qi, Dehu; Li, Chung-Chih (Computer Science Department, Lamar University, PO Box 10056, Beaumont, TX 77710, United States; School of Information Technology, Illinois State University, Campus Box 5150, Normal, IL 61790, United States)

ISBN: (print) 9781604231847

A web-based business always wants to have the ability to track users' browsing behavior history. This ability can be achieved by using web log mining technologies. In this paper, we introduce a Self-Organizing Map (SOM) based approach to mining web log data. The SOM network maps the web pages into a two-dimensional map based on the users' browsing history. Web pages with similar browsing patterns are clustered together. Together with association rules, the clusters generated by the SOM network have significant meaning for web browsing behavior. The experimental results demonstrate the feasibility and the effectiveness of this approach.

Keywords: Self organizing maps

WebApriori: A Web Application for Association Rules Mining

16th International Conference on Intelligent Tutoring Systems (ITS)

Authors: Malliaridis, Konstantinos; Ougiaroglou, Stefanos; Dervos, Dimitris A. (Int Hellen Univ, Dept Informat & Elect Engn, Thessaloniki 57400, Greece)

ISBN: (print) 9783030496630; 9783030496623

This paper presents a web application for Association Rules Mining (ARM). It utilizes Apriori, which is the most widely used algorithm for this type of data mining task. The web application is called WebApriori and offers a modern responsive web interface and a web service to scientific communities working in the field of ARM. It is also appropriate for educational purposes. WebApriori implements an Apriori engine that can efficiently discover the hidden associations in data and is capable of processing different types of datasets. Part of the process involves the removal of redundant association rules. The asynchronous communication between the front-end, back-end, web service and Apriori engine layers efficiently handles multiple concurrent user requests.

Keywords: Association rules; Apriori; web application; web service
