Concept index (CI) is a very fast and efficient feature extraction (FE) algorithm for text classification. The key approach in the CI scheme is to express each document as a function of the various concepts (centroids) present in the collection. However, the representative ability of centroids for categorizing a corpus is often influenced by so-called model misfit, caused by a number of factors in the FE process ranging from feature selection to similarity measure. To address this issue, this work employs the "DragPushing" strategy to refine the centroids that are used for concept index. We present an extensive experimental evaluation of refined concept index (RCI) on two English collections and one Chinese corpus using a state-of-the-art Support Vector Machine (SVM) classifier. The results indicate that in each case, RCI-based SVM yields much better performance than the normal CI-based SVM, at a lower computation cost during the training and classification phases.
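The idea of expressing a document as a function of class centroids can be illustrated with a minimal sketch. It assumes sparse term-weight dictionaries, unit-normalized class centroids, and cosine similarity as the projection; the function names and toy data are hypothetical and not taken from the paper. It shows only the plain CI projection, not the DragPushing refinement.

```python
import math
from collections import defaultdict


def normalize(vec):
    """Scale a sparse term-weight vector to unit Euclidean length."""
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else dict(vec)


def centroids(docs, labels):
    """Build one unit-length centroid per class from normalized documents."""
    sums = defaultdict(lambda: defaultdict(float))
    for doc, label in zip(docs, labels):
        for term, weight in normalize(doc).items():
            sums[label][term] += weight
    return {label: normalize(vec) for label, vec in sums.items()}


def concept_index(doc, cents):
    """Map a document to one feature per class: its cosine similarity
    to that class centroid (classes taken in sorted order)."""
    d = normalize(doc)
    return [
        sum(w * cents[label].get(term, 0.0) for term, w in d.items())
        for label in sorted(cents)
    ]


# Toy corpus (hypothetical): two classes, four documents.
docs = [
    {"ball": 2, "goal": 1},
    {"ball": 1, "team": 1},
    {"stock": 2, "bank": 1},
    {"bank": 1, "rate": 1},
]
labels = ["sport", "sport", "finance", "finance"]
cents = centroids(docs, labels)
features = concept_index({"ball": 1, "goal": 1}, cents)  # [finance, sport]
```

The resulting low-dimensional vector (one entry per class) is what a downstream classifier such as an SVM would consume in place of the full term vector.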
In this article, we propose a new multiscale sliding window model which differentiates data items in different time periods of the data stream, based on a reasonable monotonicity of resolution assumption. Our model, a...
The Level 1 Muon Trigger subsystem for BTeV will be implemented using the same architectural building blocks as the BTeV Level 1 Pixel Trigger: pipelined field programmable gate arrays feeding a farm of dedicated processing elements. The muon trigger algorithm identifies candidate tracks and is sensitive to the muon charge (sign); candidate dimuon events are identified by complementary-charge track pairs. To ensure that the trigger is operating effectively, the trigger development team is actively collaborating in an independent multi-university research program for reliable, self-aware, fault-adaptive behavior in real-time embedded systems (RTES). Key elements of the architecture, algorithm, performance, and engineered reliability are presented.
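The pairing rule for dimuon candidates can be sketched in software as follows. This is only an illustration of selecting complementary-charge track pairs; the `Track` type, its fields, and the function names are hypothetical, and the actual trigger runs on FPGAs and dedicated processing elements rather than in Python.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class Track(NamedTuple):
    """Hypothetical candidate muon track with a reconstructed charge sign."""
    track_id: int
    charge: int  # +1 or -1


def dimuon_candidates(tracks: List[Track]) -> List[Tuple[int, int]]:
    """Return id pairs of candidate tracks whose charges are opposite."""
    return [
        (a.track_id, b.track_id)
        for a, b in combinations(tracks, 2)
        if a.charge * b.charge < 0
    ]


def is_dimuon_event(tracks: List[Track]) -> bool:
    """An event is a dimuon candidate if at least one opposite-sign pair exists."""
    return bool(dimuon_candidates(tracks))


# Example event (made-up data): two positive tracks and one negative track.
event = [Track(1, +1), Track(2, -1), Track(3, +1)]
pairs = dimuon_candidates(event)
```

Here `pairs` contains the two opposite-sign combinations, while the same-sign pair of tracks 1 and 3 is rejected.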
A distributed computing system consists of heterogeneous computing devices, communication networks, operating system services, and applications. As organisations move toward distributed computing environments, there will be a corresponding growth in distributed applications central to the enterprise. The design, development, and management of distributed applications present many difficult challenges. As these systems grow to hundreds or even thousands of devices and a similar or greater number of software components, it will become increasingly difficult to manage them without appropriate support tools and frameworks. Further, the design and deployment of additional applications and services will be, at best, ad hoc without modelling tools and timely data on which to base design and configuration decisions. This paper presents a framework for the management of distributed applications and systems. The framework is based on a set of common management services that support management activities. The services include monitoring, control, configuration, and data repository services. A prototype system built on the framework is described; it implements and integrates management applications providing visualisation, fault location, performance monitoring and modelling, and configuration management. The prototype also demonstrates how various management services can be implemented.
This paper describes the implementation of transmission-line matrix (TLM) method algorithms on a massively parallel computer (DECmpp 12000), the technique of distributed computing in the UNIX environment, and the combination of TLM analysis with Prony's method as well as with autoregressive moving average (ARMA) digital signal processing for electromagnetic field modelling. By combining these advanced computation techniques, typical electromagnetic field modelling of microwave structures by TLM analysis can be accelerated by a few orders of magnitude.
Agent-based modeling (ABM) is growing in popularity and is applied to practical simulations where millions of agents need to interact with each other over a large-scale logical space. Cluster computing is an approach to accommodating ABM’s needs for both CPU and spatial scalability. This research compares three parallel ABM libraries: FLAME, Repast HPC, and our MASS C++ library, all of which model simulation programs in C/C++ and run them in parallel over a cluster system. Our comparative work selects seven benchmark programs from social, behavioral, and economic sciences, biology, and urban planning; parallelizes them with each of these three libraries; analyzes their programmability through the parallelization; and measures their parallel performance. Our results yield two findings. First, the programmability of each ABM library has different pros and cons in metrics such as total lines of code (LoC), boilerplate percentages, agent/space modeling and management LoC, lack of cohesion of methods (LCOM), ease of agent synchronization, and semantically smooth coding. Therefore, there is no all-in-one ABM library that is best for programming every application domain. Second, ABM parallel execution is heavily affected by each library’s design principles. In particular, FLAME’s frequent file accesses and message broadcasts as well as Repast HPC’s central agent management incur system overheads or bottlenecks. These performance drawbacks give MASS C++ an advantage: it performs fastest and scales up simulations most successfully.