ISBN: 9780769527017 (print)
A document is often full of class-independent "general" words and short of class-specific "core" words, which makes document clustering difficult. We argue that both problems are relieved by suitable smoothing of document models in agglomerative approaches and of cluster models in partitional approaches, which in turn improves clustering quality. To the best of our knowledge, most model-based clustering approaches use Laplacian smoothing to prevent zero probabilities, while most similarity-based approaches employ the heuristic TF*IDF scheme to discount the effect of "general" words. Inspired by a series of statistical translation language models for text retrieval, we propose in this paper a novel smoothing method, referred to as context-sensitive semantic smoothing, for document clustering. A comparative experiment on three datasets shows that model-based clustering approaches with semantic smoothing are effective in improving cluster quality.
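As a point of reference for the baseline mentioned in this abstract, the following is a minimal sketch of Laplacian (add-alpha) smoothing for a unigram document model. It is not the paper's context-sensitive semantic smoothing; the function name `laplace_unigram`, the `alpha` parameter, and the toy vocabulary are assumptions made for illustration only.

```python
from collections import Counter


def laplace_unigram(tokens, vocabulary, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary.

    Hypothetical helper: every vocabulary word receives a pseudo-count of
    `alpha`, so no word ends up with zero probability.
    """
    # Count only tokens that belong to the vocabulary.
    counts = Counter(t for t in tokens if t in vocabulary)
    total = sum(counts.values()) + alpha * len(vocabulary)
    return {w: (counts[w] + alpha) / total for w in vocabulary}


# Toy data (assumed, not taken from the paper).
vocab = {"cluster", "model", "the", "smoothing"}
doc = ["the", "model", "the", "cluster"]
probs = laplace_unigram(doc, vocab)
```

In this toy case the unseen word "smoothing" still gets a nonzero probability, but the "general" word "the" remains the most probable one, which is the weakness the abstract attributes to such baselines.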
ISBN: 0780371984 (print)
This paper presents a visual approach to the representation and validation of multimedia document structures specified in XML and to the transformation of one structure into another. The underlying theory of our approach is a context-sensitive graph grammar formalism. The paper demonstrates the conciseness and expressiveness of the graph grammar formalism. An example XML structure is provided, and its graph grammar representation, validation, and transformation to a multimedia representation are presented.
ISBN: 0914548956 (print)
As documentation is increasingly built directly into the interface, and as technical communicators move into areas of interface design and usability, it is important to have a theoretical framework within which to make decisions about what kind of information should be conveyed at any moment.
ISBN: 9780769528601 (print)
An approach for enforcing constraints between program entities and their documentary comments is presented. The approach uses srcML to represent Java source code and introduces an XML format, namely srcDoc, for marking up Javadoc-style comments. The enforced constraints are specified with a combination of XML and XQuery. An Eclipse plugin is described that demonstrates the use of XML and related technologies to express and enforce constraints on documentary comments. Examples of constraints enforcing design rationale for methods in an API are shown.
A real‐time BASIC executive has been developed for process measurement and control. This paper describes two features of the executive which demonstrate that some of BASIC's constraints are unnecessarily restrict...
For businesses, localization is essential to increasing overseas revenue. Going beyond translation and providing an effective localization for each target country is the key to successful global marketing. Project man...
ISBN: 1581136056 (print)
This paper describes Haddock, a tool for automatically generating documentation from Haskell source code. Haddock's unique approach to source code annotations provides a useful separation between the implementation of a library and the interface (and hence also the documentation) of that library, so that as far as possible the documentation annotations in the source code do not affect the programmer's freedom over the structure of the implementation. The internal structure and implementation of Haddock are also discussed.
ISBN: 1595931759 (print)
Collaborative document processing has been addressed by many approaches so far, most of which focus on document versioning and collaborative editing. We address this issue from a different angle and describe the concept and architecture of a pervasive document editing and managing system. It exploits database techniques and real-time updating for sophisticated collaboration scenarios on multiple devices. Each user is always served with up-to-date documents and can organize his work based on document metadata. For this, we present our conceptual architecture for such a system and discuss it with an example. Copyright 2005 ACM.
ISBN: 1595935150 (print)
Adaptive documents undergo many transformations during their generation, including insertion and deletion of content. One major problem in this scenario is the preservation of the aesthetic qualities of the document during those transformations. As adaptive documents are instances of a template, the aesthetic quality of an instance with respect to the template could be evaluated by aesthetic measures providing scores to any desired quality parameters. These parameters measure the deviation of the instance from the desired template. This evaluation could assure the quality of instances during their generation and final output. This paper introduces the use of document templates to support aesthetic measures of document instances. A score is assigned to a document instance according to the differences detected from the original template. Considering the original template as an ideal result, the quality of a document instance will decrease according to the number and severity of the changes applied to produce it. So, documents that are below a given threshold can be sent for further (possibly human) review, and any others are accepted. The amount of change with respect to the template will reflect the document quality, and in such a model the quality of instances can be considered as a distance from that original. Copyright 2006 ACM.
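The scoring idea in this abstract can be illustrated with a minimal sketch, assuming a simple additive penalty model. The `Change` record, the severity weights, the `quality_score` and `needs_review` helpers, and the threshold value are all hypothetical; the paper's actual aesthetic measures are not specified in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Change:
    kind: str        # hypothetical label, e.g. "insert" or "delete"
    severity: float  # hypothetical weight: larger means a worse deviation


def quality_score(changes, ideal=1.0):
    """Score an instance by subtracting the summed severity of its changes
    from the ideal template score, clamped at zero (assumed model)."""
    penalty = sum(c.severity for c in changes)
    return max(0.0, ideal - penalty)


def needs_review(changes, threshold=0.7):
    """Flag an instance for (possibly human) review if it scores below the threshold."""
    return quality_score(changes) < threshold


# Toy instance with two deviations from its template (assumed values).
changes = [Change("insert", 0.25), Change("delete", 0.125)]
score = quality_score(changes)
flagged = needs_review(changes)
```

An unchanged instance keeps the ideal score, and each additional or more severe change moves the instance further from the template, which matches the distance interpretation described above.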
A series of computer programs has been developed for use in a field data analysis system. This paper describes the planning, development, and implementation of procedures needed by designers. Some of the important features discussed are the determination of the designers' needs, the proper design of the program output, programming in modular form, and user-oriented documentation of the programs. Specific program output examples are also shown to demonstrate the concepts of the development philosophy.