The number of research publications per year has been growing exponentially. Finding high-quality research papers in the massive body of relevant literature is a challenging and time-consuming task. Recent approaches address citation recommendation by exploiting large amounts of bibliographic information with machine learning and deep learning methods. These techniques require a large amount of training data as well as machines with high processing power. To overcome these issues, we propose a novel method that adapts the popular hyperlink-induced topic search (HITS) web page ranking algorithm to citation recommendation. The resulting method, citation recommendation using hyperlink-induced topic search (CR-HITS), works on a directed and weighted heterogeneous bibliographic network containing diverse types of nodes and edges. We define effective scoring schemes for nodes and edges based on basic bibliographic information such as the citations of a paper and the number of publications of an author. Given a few seed papers, CR-HITS is run only on small neighborhoods of the seed papers, so the final recommendations are produced in very little time. To the best of our knowledge, this is the first time HITS has been used for the citation recommendation problem. We perform extensive experiments on the DBLP (version 11) and ACM (version 9) datasets and compare the results with many baseline methods in terms of MAP, MRR, and recall@N. The proposed algorithm is superior with respect to MAP and matches the second-best method on the other two metrics. Since the top two algorithms use deep learning and much larger bibliographic information, including paper abstracts, we claim that our approach uses very few resources yet yields recommendations that are very close to the top recommendations.
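The idea of running a weighted HITS only on a small neighborhood of the seed papers can be sketched as follows. This is a minimal illustration, not the CR-HITS method itself: the edge weights, the `hops` parameter, and the function names `neighborhood`, `weighted_hits`, and `recommend` are assumptions made for the example, and the abstract's actual node and edge scoring schemes are not reproduced.

```python
from collections import defaultdict
from math import sqrt


def neighborhood(edges, seeds, hops=1):
    """Nodes within `hops` steps of the seeds, ignoring edge direction."""
    adjacent = defaultdict(set)
    for src, dst in edges:
        adjacent[src].add(dst)
        adjacent[dst].add(src)
    seen = set(seeds)
    frontier = set(seeds)
    for _ in range(hops):
        frontier = {n for node in frontier for n in adjacent[node]} - seen
        seen |= frontier
    return seen


def _normalize(scores):
    """Scale scores to unit L2 norm (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def weighted_hits(edges, iterations=50):
    """edges maps (src, dst) to a weight; returns (hub, authority) dicts."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: weighted sum of the hub scores of incoming neighbors.
        auth = {n: 0.0 for n in nodes}
        for (src, dst), weight in edges.items():
            auth[dst] += weight * hub[src]
        auth = _normalize(auth)
        # Hub: weighted sum of the authority scores of outgoing neighbors.
        hub = {n: 0.0 for n in nodes}
        for (src, dst), weight in edges.items():
            hub[src] += weight * auth[dst]
        hub = _normalize(hub)
    return hub, auth


def recommend(edges, seeds, top_n=3, hops=1):
    """Rank non-seed nodes by authority within the seeds' neighborhood."""
    keep = neighborhood(edges, seeds, hops)
    local = {e: w for e, w in edges.items() if e[0] in keep and e[1] in keep}
    _, auth = weighted_hits(local)
    candidates = [n for n in auth if n not in seeds]
    return sorted(candidates, key=lambda n: (-auth[n], n))[:top_n]
```

Because the iteration only touches edges inside the neighborhood, the cost depends on the size of that subgraph rather than on the full bibliographic network, which is the resource argument the abstract makes.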
ISBN: 9781538619568 (print)
Many automatic image annotation methods are based on the learning-by-example paradigm. Image tagging through manual image inspection is the first step towards this end. However, manual image annotation, even just for creating training sets, is time-consuming, complicated, and subject to human error and subjectivity. Thus, alternative ways of automatically creating training examples, i.e., pairs of images and tags, are crucial. As we showed in one of our previous studies, the tags accompanying photos in social media, and especially Instagram hashtags, can be used for image annotation. However, it turned out that only 20% of Instagram hashtags are actually relevant to the content of the image they accompany. Identifying those hashtags through crowdsourcing is a plausible solution. In this work, we investigate the effectiveness of the HITS algorithm for identifying the right tags in a crowdsourced image tagging scenario. For this purpose, we create a bipartite graph in which the first type of node corresponds to the annotators and the second type to the tags they select, from among the hashtags, to annotate a particular Instagram image. From the results, we conclude that the authority value of the HITS algorithm provides an accurate estimate of how appropriate each Instagram hashtag is as a tag for the image it accompanies, while the hub value can be used to filter out dishonest annotators.
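The bipartite setup described above can be illustrated with a small sketch in which annotators act as hubs and the tags they select act as authorities. The input format (a dict from annotator to selected tags), the function name `bipartite_hits`, and the toy data are assumptions for illustration, not the study's data or code.

```python
from math import sqrt


def _normalize(scores):
    """Scale scores to unit L2 norm (left unchanged if all are zero)."""
    norm = sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def bipartite_hits(selections, iterations=50):
    """selections maps each annotator to the set of tags they chose.

    Returns (hub, authority): hub scores for annotators and authority
    scores for tags.
    """
    tags = {t for chosen in selections.values() for t in chosen}
    hub = {a: 1.0 for a in selections}
    authority = {t: 1.0 for t in tags}
    for _ in range(iterations):
        # A tag is authoritative if well-scored annotators selected it.
        authority = _normalize({
            t: sum(hub[a] for a, chosen in selections.items() if t in chosen)
            for t in tags
        })
        # An annotator is a good hub if they selected authoritative tags.
        hub = _normalize({
            a: sum(authority[t] for t in chosen)
            for a, chosen in selections.items()
        })
    return hub, authority


# Hypothetical example: three annotators agree on "dog", one picks an
# unrelated hashtag that nobody else selects.
selections = {
    "ann1": {"dog", "park"},
    "ann2": {"dog", "park"},
    "ann3": {"dog"},
    "spam": {"followme"},
}
hub, authority = bipartite_hits(selections)
```

In this toy graph the tag chosen only by the isolated annotator ends up with a negligible authority score, and that annotator ends up with a negligible hub score, which mirrors the filtering role the abstract assigns to the two values.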
ISBN: 9781509023035 (print)
HITS (Hyperlink-Induced Topic Search) is a classical link analysis algorithm for Web Structure Mining (WSM). The algorithm takes the structural information of links into account but ignores the correlation between pages and topics. In some cases this leads to "topic drift", a deviation between the search results and the query topic. To address this problem, the current paper presents an improved algorithm that takes into account both web content similarity and link analysis. Our experiment shows that the improved algorithm strengthens the correlation between the search results and the topic and limits the occurrence of topic drift to some degree.
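One common way to combine content similarity with link analysis is to weight each link by how similar the target page is to the query, so that off-topic pages contribute less to the hub and authority updates. The abstract does not give the paper's formula, so the sketch below is an assumption: it uses bag-of-words cosine similarity, and the names `cosine` and `similarity_weighted_links` as well as the sample pages are hypothetical.

```python
from collections import Counter
from math import sqrt


def cosine(text_a, text_b):
    """Cosine similarity between two texts using raw term counts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def similarity_weighted_links(links, page_text, query):
    """Give each (src, dst) link a weight equal to the similarity between
    the query and the text of the destination page."""
    return {(src, dst): cosine(query, page_text[dst]) for src, dst in links}


# Hypothetical pages: both are linked from the same hub page, but only one
# matches the intended sense of the query.
page_text = {
    "hub": "links about jaguar",
    "p1": "jaguar car engine review",
    "p2": "jaguar big cat habitat",
}
links = [("hub", "p1"), ("hub", "p2")]
weights = similarity_weighted_links(links, page_text, "jaguar car")
```

These weights could replace the unit edge weights in the standard HITS updates, so that a densely linked but off-topic cluster gains less authority than it would from link structure alone.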
We present a nationwide network analysis of non-fatal opioid-involved overdose journeys in the United States. Leveraging a unique proprietary dataset of Emergency Medical Services incidents, we construct a journey-to-overdose geospatial network capturing nearly half a million opioid-involved overdose events spanning 2018-2023. We analyze the structure and sociological profiles of the nodes, which are counties or their equivalents, characterize the distribution of overdose journey lengths, and investigate changes in the journey network between 2018 and 2023. Among other findings, the authority and hub nodes identified by the HITS algorithm tend to be located in urban areas and to be involved in overdose journeys covering particularly long geographical distances.
ISBN: 9798350329537; 9798350329520 (print)
Key classes are considered the most important classes of a software system. They represent the starting point for reengineering or documentation processes. The detection of key classes is considered crucial in the state of the art, and many studies focus on automatic detection based on a class graph representation of the system. Studies show that class attributes computed with algorithms such as Hyperlink-Induced Topic Search (HITS) and PageRank (PR) give the best precision and recall in detecting key classes. The runtime of the two algorithms becomes critical when they run on graphs with different class relationship weights. To reduce the execution time, we experiment with three Java-based implementations of the two algorithms: i) a single thread, ii) platform (operating system) threads, and iii) virtual threads. The experiments are carried out on a set of 14 Java projects. The results show that, for projects with a relatively small number of classes, namely under 1,200, single-thread implementations perform better than platform-thread implementations. Conversely, virtual threads perform better than any single-thread implementation. We conclude that the virtual thread model speeds up the computation of attributes, with a runtime decrease of 58.41% against the single-thread model.
ISBN: 9798350364309; 9798350364293 (print)
Key classes are deemed pivotal elements within a software system, serving as focal points for reengineering or documentation endeavors. The identification of these key classes holds significant importance in contemporary practice, with numerous studies dedicated to automating their detection based on class graph representations. Research indicates that algorithms such as Hyperlink-Induced Topic Search (HITS) and PageRank (PR) yield the best precision and recall in identifying key classes. However, the runtime of these algorithms becomes critical, particularly when operating on graphs with varied weights attributed to class relationships. To address this challenge, we explore parallel implementations of these algorithms using CUDA, invoked from a Java application through JCuda. Specifically, we investigate two approaches: i) Java virtual threads and ii) CUDA threads within the context of the JCuda library. CUDA has fundamentally transformed how GPU acceleration is harnessed across diverse computational tasks, spanning parallel processing, deep learning, and high-performance computing. Our experiments are conducted on a data set comprising 14 Java projects. Our findings reveal that the hardware-parallel CUDA threading model significantly accelerates attribute computation, achieving a runtime reduction of 95% to 97% compared to the virtual threading model.
When users connect to the Internet to fulfill their needs, they often encounter a huge amount of related information. Recommender systems are techniques for filtering information at scale and offering the items that users find satisfying and interesting. Advances in machine learning methods, especially deep learning, have led to great achievements in recommender systems, although these systems still suffer from challenges such as the cold-start and sparsity problems. To solve these problems, context information such as the user communication network is usually used. In this article, we propose a novel recommendation method based on matrix factorization and graph analysis methods, namely Louvain for community detection and HITS for finding the most important node within the trust network. In addition, we leverage deep autoencoders to initialize the user and item latent factors, and the Node2vec deep embedding method gathers users' latent factors from the user trust graph. The proposed method is evaluated on the standard Ciao and Epinions datasets. The experimental results and comparisons demonstrate that the proposed approach is superior to existing state-of-the-art recommendation methods, achieving a 15.56% RMSE improvement on Epinions and an 18.41% RMSE improvement on Ciao.
The Android operating system holds a large share of the mobile terminal market, which promotes the rapid development of Android applications (apps). However, the emergence of Android malware greatly endangers the security of Android smartphone users. Existing research has proposed many methods for Android malware detection, but these methods do not make use of apps' functional category information, so the strong similarity between benign apps in the same functional category is ignored. In this paper, we propose an Android malware detection scheme based on functional classification. Benign apps in the same functional category are more similar to each other, so within a category we can use fewer features to detect malware and improve the detection accuracy. The first aim of our scheme is to provide an automatic application functional classification method with high accuracy. We design an Android application functional classification method inspired by the hyperlink-induced topic search (HITS) algorithm. Using the results of the automatic classification, we further design a malware detection method based on app similarity within the same functional category. We use benign apps from the Google Play Store and malware apps from the Drebin malware set to evaluate our scheme. The experimental results show that our method can effectively improve the accuracy of malware detection.
The fast expansion of the World Wide Web has made it increasingly hard to navigate its tremendous content, necessitating efficient and effective web page ranking algorithms for search engines like Google. Traditi...
Rapid advancement in information technology promotes the growth of new online learning communities in e-learning environments, which leads to an overload of shared information and data. When a new learner asks a question, how the system recommends an answer is the learner cold-start problem. In this article, our contributions are: (i) We propose a Trust-aware Deep Neural Recommendation (TDNR) framework that addresses learner cold-start issues in informal e-learning by modeling complex nonlinear relationships. (ii) We use latent Dirichlet allocation for tag modeling, assigning tag categories to newly posted questions and ranking experts related to specific tags for active questioners based on hub and authority scores. (iii) We enhance recommendation accuracy in the TDNR model by introducing a degree of trust between questioners and responders. (iv) We incorporate the questioner-responder relational graph, derived from structural preference information, into our proposed model. We evaluate the proposed model on the Stack Overflow dataset using mean absolute precision (MAP), root mean squared error (RMSE), and F-measure metrics. Our main findings are that TDNR is a hybrid approach that provides more accurate recommendations than rating-based and social-trust-based approaches, that the proposed model can facilitate the formation of informal e-learning communities, and that TDNR outperforms the competing methods in our experiments. The model's robustness, demonstrated by superior MAE, RMSE, and F-measure metrics, makes it a reliable solution for addressing information overload and user sparsity in Stack Overflow. By accurately modeling complex relationships and incorporating trust degrees, TDNR provides more relevant and personalized recommendations, even in cold-start scenarios. This enhances the user experience by facilitating the formation of supportive learning communities and ensuring that new learners receive accurate recommendations.