A data mining component is included in Microsoft SQL Server 2000 and SQL Server 2005, one of the most popular DBMSs. This gives data mining technologies a push from a niche towards the mainstream. Apart from a few algorithms, the main contribution of SQL Server Data Mining is the implementation of OLE DB for Data Mining. OLE DB for Data Mining is an industry standard led by Microsoft and supported by a number of ISVs. It leverages two existing relational technologies: SQL and OLE DB. It defines a SQL language for data mining based on relational concepts. More recently, Microsoft, Hyperion, SAS and a few other BI vendors formed the XML for Analysis Council. XML for Analysis covers both OLAP and data mining. The goal is to allow consumer applications to query various BI packages from different platforms. This paper gives an overview of OLE DB for Data Mining and XML for Analysis. It also shows how to build data mining applications using these APIs.
We propose a model for errors in sung queries, a variant of the hidden Markov model (HMM). This is a solution to the problem of identifying the degree of similarity between a (typically error-laden) sung query and a potential target in a database of musical works, an important problem in the field of music information retrieval. Similarity metrics are a critical component of "query-by-humming" (QBH) applications which search audio and multimedia databases for strong matches to oral queries. Our model comprehensively expresses the types of error or variation between target and query: cumulative and noncumulative local errors, transposition, tempo and tempo changes, insertions, deletions and modulation. The model is not only expressive, but automatically trainable, or able to learn and generalize from query examples. We present results of simulations, designed to assess the discriminatory potential of the model, and tests with real sung queries, to demonstrate relevance to real-world applications.
Microsoft's SQL Server Web Services Toolkit (WSTK), which is used to build web services for relational databases, is discussed. The toolkit allows developers to construct XML views of relational data stored in SQL Server and to query and update the relational data through these views. Users can then request this data as XML, and WSTK retrieves relational rowsets from the database, converting them to XML hierarchies on the fly, transparently to users. The SQL Server Web Services Toolkit lets developers access databases using programming models that are natural to client-side programming languages, allowing databases to be easily converted into web services.
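The rowset-to-hierarchy conversion described above can be illustrated with a minimal sketch. This is not the WSTK API: it assumes Python's built-in `sqlite3` as a stand-in for SQL Server, and the `customer` and `orders` tables, the element names, and both function names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def demo_connection():
    """Create a small in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY,
                             customer_id INTEGER, total REAL);
        INSERT INTO customer VALUES (1, 'Ada'), (2, 'Bo');
        INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 5.5), (12, 2, 8.0);
    """)
    return conn


def customers_as_xml(conn):
    """Turn flat rowsets into a nested Customers/Customer/Order document."""
    root = ET.Element("Customers")
    customers = conn.execute(
        "SELECT id, name FROM customer ORDER BY id").fetchall()
    for cust_id, name in customers:
        cust = ET.SubElement(root, "Customer", id=str(cust_id), name=name)
        # Each child rowset becomes a list of nested elements.
        orders = conn.execute(
            "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id",
            (cust_id,))
        for order_id, total in orders:
            ET.SubElement(cust, "Order", id=str(order_id), total=str(total))
    return ET.tostring(root, encoding="unicode")
```

Calling `customers_as_xml(demo_connection())` returns one XML string in which each customer element contains its orders, which is the kind of hierarchy a client would request instead of joining rowsets itself.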
Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been widely studied in data mining research. Most of the previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is still costly, especially when there are many patterns and/or long patterns. In this study, we propose a novel frequent-pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develop an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth. Efficiency of mining is achieved with three techniques: (1) a large database is compressed into a condensed, smaller data structure, the FP-tree, which avoids costly, repeated database scans, (2) our FP-tree-based mining adopts a pattern-fragment growth method to avoid the costly generation of a large number of candidate sets, and (3) a partitioning-based, divide-and-conquer method is used to decompose the mining task into a set of smaller tasks for mining confined patterns in conditional databases, which dramatically reduces the search space. Our performance study shows that the FP-growth method is efficient and scalable for mining both long and short frequent patterns, and is about an order of magnitude faster than the Apriori algorithm and also faster than some recently reported frequent-pattern mining methods.
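The three techniques listed in the abstract can be sketched in a few dozen lines. This is a simplified illustration, not the authors' implementation: it rebuilds a small tree for every conditional database, keeps each item's nodes in a plain list, and omits optimizations such as the single-path shortcut. All names (`Node`, `build_tree`, `fp_growth`, `frequent_itemsets`) are hypothetical.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, a count, and a link to its parent."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Compress weighted transactions [(items, count), ...] into an FP-tree."""
    support = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            support[item] += count
    frequent = {i: s for i, s in support.items() if s >= min_support}
    root = Node(None, None)
    header = defaultdict(list)  # item -> all tree nodes carrying that item
    for items, count in transactions:
        # Keep frequent items only, most frequent first, so that
        # transactions share prefixes (ties broken by item name).
        ordered = sorted((i for i in set(items) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, frequent


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) pairs by pattern-fragment growth."""
    header, frequent = build_tree(transactions, min_support)
    for item, item_support in frequent.items():
        pattern = suffix | {item}
        yield pattern, item_support
        # Conditional database: the prefix path above every node of `item`.
        conditional = []
        for node in header[item]:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        # Divide and conquer: mine the smaller conditional database.
        yield from fp_growth(conditional, min_support, pattern)


def frequent_itemsets(database, min_support):
    """Convenience wrapper: database is a list of item lists."""
    weighted = [(items, 1) for items in database]
    return dict(fp_growth(weighted, min_support))
```

For example, `frequent_itemsets([["a", "b"], ["b", "c"], ["a", "b", "c"]], 2)` returns a dictionary mapping each frequent itemset (as a `frozenset`) to its support. No candidate sets are generated or tested against the database; each recursive call only sees the prefix paths that already contain the current suffix.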
Music Information Retrieval has become an active area of research motivated by the increasing importance of Internet-based music distribution. In December 2003, Apple Computer announced it was selling almost 1.5 million music downloads per week (***/pr/library/2003/dec/***), and some analysts predict that downloads will account for 33 percent of the music industry's sales by 2008 (Zeidler 2003). Online catalogs are already approaching one million songs, so it is important to study new techniques for searching these vast stores of *** approach to finding music that has received much attention is Query-by-Humming (QBH). This approach enables users to retrieve songs and information about them by singing, humming, or whistling a melodic fragment. In QBH systems, the query is a digital audio recording of the user, and the ultimate target is a complete digital audio recording. The audio waveforms of the query will have little or no direct similarity to those of the target audio recording, so QBH systems always search using some other representation. Most commonly, this representation is a sequence of notes described by pitch and duration. It is possible to transcribe monophonic queries into note sequences (although accurate transcription of the monophonic voice is still an active research area). Polyphonic target music, however, cannot be automatically transcribed into melodies. Therefore, most QBH systems assume that a MIDI or symbolic representation is available from which a note sequence can be *** system uses a database consisting of standard MIDI files, so one can state the QBH problem as follows: "Given a user's audio query, find matching melodies in a database of standard MIDI files." Once the melody is identified, the QBH system might offer links to audio files, ring tones, album titles, card catalog information, sheet music, or other useful *** many researchers have investigated related problems and have built prototype sy
ISBN: (print) 1581130848
SQL Server 7.0 offers three different styles of replication that we call Transactional Replication, Snapshot Replication, and Merge Replication. Merge Replication means that data changes can be performed at any replica, and that the changes performed at multiple replicas are later merged together. Because Merge Replication allows updates to disconnected replicas, it is particularly well suited to applications that require a lot of autonomy. A special process called the Merge Agent propagates changes between replicas, filters data as appropriate, and detects and handles conflicts according to user-specified rules.
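The idea of merging changes from disconnected replicas and resolving conflicts by a user-specified rule can be illustrated with a toy three-way merge. This is only a sketch under strong assumptions, not how the Merge Agent works internally: replicas are modeled as plain dictionaries, `base` is assumed to be the state both replicas shared at the last synchronization, and `merge_replicas` and the `resolve` callback are hypothetical names.

```python
_MISSING = object()  # marks a row that is absent (never created or deleted)


def merge_replicas(base, replica_a, replica_b, resolve):
    """Merge two replicas that diverged from a common base.

    resolve(key, value_a, value_b) is the user-specified conflict rule;
    either value may be _MISSING when one side deleted the row.
    Returns (merged_rows, conflicting_keys).
    """
    merged = {}
    conflicts = []
    for key in sorted(base.keys() | replica_a.keys() | replica_b.keys()):
        old = base.get(key, _MISSING)
        a = replica_a.get(key, _MISSING)
        b = replica_b.get(key, _MISSING)
        if a == b:        # both sides agree (including "both deleted")
            value = a
        elif a == old:    # only replica B changed the row
            value = b
        elif b == old:    # only replica A changed the row
            value = a
        else:             # both changed it differently: a conflict
            conflicts.append(key)
            value = resolve(key, a, b)
        if value is not _MISSING:
            merged[key] = value
    return merged, conflicts
```

A rule such as `lambda key, a, b: a` makes replica A always win. Changes that touch different rows, or that only one replica made, merge without invoking the rule at all.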