In today's era of big data, the web contains a huge amount of important information. Extracting domain-specific terms from the web is an indispensable part of natural language processing for the web, and it also plays an important role in domain ontology research. Chinese text has no explicit delimiters between words, so term extraction from Chinese web text is currently difficult. This article proposes a method for extracting terms from Chinese text more accurately. First, stop-word removal, Chinese word segmentation, and lexical analysis are applied to extract nouns and noun phrases as candidate domain terms. The candidates are then evaluated according to their distribution in the subject domain, their distribution across the individual pages of the subject domain, and their distribution in other background domains. Combining the subject and background domains, both the TF-IDF and the DR + DC algorithms are used to extract terms in the subject domain. The extraction is implemented with two platform tools, the Chinese word segmentation system of the Chinese Academy of Sciences (ICTCLAS) and the Language Technology Platform Cloud of Harbin Institute of Technology (LTP) [15], so that more accurate domain terminology is extracted.
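The scoring step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give its formulas, so the code assumes one common formulation of TF-IDF, domain relevance (DR, the term's relative frequency in the subject domain divided by its highest relative frequency in any domain) and domain consensus (DC, the entropy of the term's distribution over the subject-domain pages). The function names and the toy token lists are hypothetical, and the input is assumed to be already segmented and filtered (for example by ICTCLAS or LTP).

```python
import math

def tf_idf(term, doc, corpus):
    """TF-IDF of `term` in one tokenized document of `corpus` (assumed formulation)."""
    tf = doc.count(term) / len(doc)
    df = sum(1 for d in corpus if term in d)
    return tf * math.log(len(corpus) / df) if df else 0.0

def _relative_frequency(term, docs):
    """Share of all tokens in `docs` that are `term`."""
    total = sum(len(d) for d in docs)
    return sum(d.count(term) for d in docs) / total if total else 0.0

def domain_relevance(term, subject_docs, background_domains):
    """DR: subject-domain frequency relative to the highest frequency in any domain."""
    p_subject = _relative_frequency(term, subject_docs)
    p_max = max([p_subject] + [_relative_frequency(term, b) for b in background_domains])
    return p_subject / p_max if p_max else 0.0

def domain_consensus(term, subject_docs):
    """DC: entropy of the term's distribution over subject-domain pages."""
    counts = [d.count(term) for d in subject_docs]
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c)

# Hypothetical pre-segmented pages: one subject domain, one background domain.
subject_docs = [["network", "training", "network"], ["network", "layer"]]
background_domains = [[["market", "price", "network"], ["price", "trade", "bank"]]]

for candidate in ["network", "layer", "price"]:
    print(candidate,
          domain_relevance(candidate, subject_docs, background_domains),
          domain_consensus(candidate, subject_docs))
```

A candidate that is frequent in the subject domain, rare in the background domains, and spread evenly over the subject pages scores high on both DR and DC. How the two scores are combined with TF-IDF is left open here because the abstract does not say.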
The tragic events of September 11th have had drastic effects on many aspects of society. Academics in the fields of computational and information science have been called upon to help enhance the government's ability to fight terrorism and other crimes. Keeping in mind the special characteristics of crimes and security-related data, data mining techniques can contribute in six areas of research: information sharing and collaboration, security association mining, classification and clustering, intelligence text mining, spatial and temporal crime pattern mining, and criminal/terrorist network analysis. Grounded in social network analysis (SNA) research, criminal network analysis and terrorist network analysis have been shown to be the most promising for public safety and homeland security. Based on the University of Arizona's highly successful COPLINK and Dark Web projects, we will discuss relevant SNA for "dark networks" (criminal and terrorist networks). Selected techniques, examples, and case studies will be presented based on gang/narcotic networks, US extremist networks, Al Qaeda member networks, and international Jihadist web site and forum networks. Unique homeland security challenges and future directions will also be presented.
Today, a new age of engagement and collaboration has emerged with the proliferation of user-generated content in social networks and in Web 2.0 generally, making it particularly difficult for enterprises to monitor and act upon all content using conventional data mining methodologies. In this paper, we present our approach for a Future Internet enabler (FITMAN Anlzer) that provides automated social data analytics. It aims to help enterprises become more attuned to their customers' needs and gain insights into current and future trends so that these can be embedded early into product design. The FITMAN Anlzer implementation is domain-independent and allows any manufacturer to train it effectively according to their needs and to create personalized reports that capture the right information in a timely manner. Our methodology includes trend analytics, polarity detection through machine learning, data querying through flexible reports, and informative charts that visualize the results in order to support companies in their decision-making procedures.
This demo illustrates a distributed system that provides online search, analysis, and ordering capabilities for distributed Earth science data. The system is under development by a GMU-led consortium in a project called Seasonal to Interannual Earth Science Information Partner (SIESIP), as part of a federation of information partners funded by NASA. The integrated system is composed of data, a database management system (DBMS), communication protocols, data analysis tools, and a user interface. Through a web-based Java GUI, users can search the DBMS for metadata, conduct content-based searches, perform some initial analysis, and issue an order for the selected data.
ISBN: (print) 9781509057825
Orientation plays an increasingly important role in determining the future of students, which creates a need to make it automatic and accessible, both in terms of immediate availability throughout the world and in offering several languages that correspond to those used by the majority of students or other persons in need of guidance. These problems can be addressed by setting up an e-orientation platform accessible from the web. This solves the problem of accessibility, while the use of targeted approaches and ontologies can handle the multilingual aspects.
Through in-depth research into existing e-commerce systems and data mining technologies, this paper presents a design for an e-commerce recommendation system based on web *** design can achieve the separation of offline and online modules to meet massive data requirements, which greatly improves the personalized recommendation service capabilities and *** the same time, it also integrates web data mining technology to provide users with high-quality personalized recommendation services when little data is available or when web site contents change relatively frequently.
ISBN: (print) 9783319220536; 9783319220529
The paper analyses the usefulness of Pillar 3 financial and risk information disclosures to users of commercial banks. Pillar 3 comprises specific regulatory disclosure requirements set out in the Basel 2 framework and incorporated into EU law and subsequently into the laws of the member states. The intention of Pillar 3 is that market participants should be able to understand and then judge the relevance of a bank's risk position and risk management, and try to discipline "risky" banks. For this reason, the European authorities focus on the control and improvement of banks' disclosures. However, less has been done regarding the usefulness of Pillar 3 risk information to users of commercial banks. The authors try to assess to what extent the information is useful for users of banks that operate in countries where the banking sector is dominated by foreign-owned entities and where depositors (sophisticated and non-sophisticated; insured and uninsured; primarily non-financial ones) are a key source of market discipline. The authors focus on modelling visitor behaviour on websites where financial and risk information is available in accordance with Pillar 3 requirements. The results show that there is generally little interest in Pillar 3 information, and that financial and risk-related information is among the categories attracting the least interest.
ISBN: (print) 9783642450051; 9783642450044
With the wide adoption of service and cloud computing, we now observe a rapidly increasing number of services and their compositions, resulting in a complex and evolving service ecosystem. Given a huge number of services with similar functionalities, identifying the core services in different domains and recommending trustworthy ones to developers is an important issue for the promotion of the service ecosystem. In this paper, we present a heterogeneous network model and then introduce a unified reputation propagation (URP) framework to calculate the global reputation of entities in the ecosystem. Furthermore, a topic model based on Latent Dirichlet Allocation (LDA) is used to cluster the services into specific domains. Combining URP with the topic model, we re-rank the services' reputations to distinguish the core services and so recommend trustworthy, domain-aware services. Experiments on ProgrammableWeb data show that, by fusing the heterogeneous network model and the topic model, we gain a 66.67% improvement in top-20 precision and a 20% to 30% improvement in long-tail (top-200 to top-500) precision. Furthermore, the reputation- and domain-aware recommendation method gains a 118.54% improvement in top-10 precision.
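The abstract does not spell out the URP update rule, so the following is only a rough sketch of what reputation propagation over a service network can look like. It substitutes a generic PageRank-style power iteration for the authors' framework and treats the heterogeneous network as a plain directed graph in which a mashup "endorses" each service it uses. The function name, the damping value, and the node names are all hypothetical.

```python
def propagate_reputation(edges, damping=0.85, iterations=50):
    """PageRank-style propagation (a stand-in, not the URP framework itself).

    `edges` is a list of (source, target) pairs; a source passes a share of
    its current reputation to every target it points to.
    """
    nodes = sorted({n for edge in edges for n in edge})
    outgoing = {n: [] for n in nodes}
    for src, dst in edges:
        outgoing[src].append(dst)

    rep = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src in nodes:
            # Nodes without outgoing edges spread their reputation evenly.
            targets = outgoing[src] or nodes
            share = damping * rep[src] / len(targets)
            for dst in targets:
                nxt[dst] += share
        rep = nxt
    return rep

# Hypothetical toy ecosystem: two mashups composing two services.
edges = [
    ("mashup_a", "service_x"),
    ("mashup_b", "service_x"),
    ("mashup_b", "service_y"),
]
reputation = propagate_reputation(edges)
ranking = sorted(reputation, key=reputation.get, reverse=True)
print(ranking)
```

In this toy graph `service_x` is used by both mashups and therefore ends up with a higher score than `service_y`. A domain-aware variant would run the ranking separately within each topic cluster, which is the role the abstract assigns to the LDA step.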
This demonstration presents Apollo, a new sensor information processing tool for uncovering likely facts in noisy participatory sensing data. Participatory sensing, where users proactively document and share their o...
The identification of regulatory modules is one of the most important tasks in discovering disease markers. This paper presents a methodology for inferring coexpression networks based on local patterns in a gene expression data matrix. The proposed algorithm has two clearly differentiated steps: first, a biclustering procedure that uses a scatter search scheme to find biclusters; second, a network extraction procedure based on linear correlations among the genes of the previously obtained biclusters. Experimental results on the yeast cell cycle are reported, where three different algorithms have been applied. A possible interpretation of one of the obtained networks is also presented from a biological point of view.
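The second step, network extraction from a bicluster, can be illustrated with a minimal sketch. It assumes that "linear correlation" means the Pearson coefficient and that an edge is kept when its absolute value reaches a cutoff. The threshold value, the function names, and the toy expression values are assumptions, and the scatter-search biclustering step is not shown.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / math.sqrt(var_x * var_y)

def coexpression_edges(bicluster, threshold=0.9):
    """Return (gene1, gene2, r) for gene pairs with |r| >= threshold.

    `bicluster` maps each gene to its expression values over the
    conditions selected by the bicluster.
    """
    edges = []
    for (g1, x), (g2, y) in combinations(bicluster.items(), 2):
        r = pearson(x, y)
        if abs(r) >= threshold:
            edges.append((g1, g2, r))
    return edges

# Hypothetical bicluster: three genes measured under four conditions.
bicluster = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 1.0, 3.0, 2.0],
}
network = coexpression_edges(bicluster)
print(network)
```

Only the `geneA`/`geneB` pair survives the cutoff in this toy example, since `geneC` is only weakly (and negatively) correlated with the other two profiles.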