Rendering large-scale 3-D scenes on a thin client is attracting increasing attention with the development of the mobile Internet. Efficient scene prefetching to provide timely data with a limited cache is one of the most critical issues for remote 3-D data scheduling in networked virtual environment applications. Existing prefetching schemes predict the future positions of each individual user based on user traces. In this paper, we investigate scene content sequences accessed by various users instead of user viewpoint traces and propose a user access pattern-based 3-D scene prefetching scheme. We use relationship graph-based clustering to partition historical user access sequences into several clusters and choose representative sequences from these clusters as user access patterns. These user access patterns are then prioritized by their popularity and users' personal preferences. Based on these access patterns, the proposed prefetching scheme predicts the scene contents that will most likely be visited in the future and delivers them to the client in advance. The experimental results demonstrate that our user access pattern-based prefetching approach achieves a high hit ratio and outperforms the prevailing prefetching schemes in terms of access latency and cache capacity.
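The prediction step described in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `predict_prefetch`, the numeric `priority` weight (standing in for popularity combined with personal preference), and the longest-suffix matching rule are all assumptions made for illustration.

```python
from collections import defaultdict


def predict_prefetch(recent, patterns, top_k=2):
    """Rank scene-content IDs to prefetch (hypothetical helper).

    recent   -- list of content IDs the user accessed most recently
    patterns -- list of (sequence, priority) pairs; the priority weight
                is an assumed stand-in for popularity and preference
    """
    scores = defaultdict(float)
    for sequence, priority in patterns:
        # Try the longest suffix of the recent accesses first.
        max_len = min(len(recent), len(sequence) - 1)
        for length in range(max_len, 0, -1):
            suffix = recent[-length:]
            starts = [s for s in range(len(sequence) - length)
                      if sequence[s:s + length] == suffix]
            if starts:
                # Vote for the item that follows each match; longer
                # matches and higher-priority patterns count more.
                for s in starts:
                    scores[sequence[s + length]] += priority * length
                break
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:top_k]


patterns = [
    (["A", "B", "C", "D"], 0.6),
    (["B", "C", "E"], 0.3),
    (["X", "C", "F"], 0.1),
]
print(predict_prefetch(["A", "B", "C"], patterns))
```

Weighting each vote by match length is one simple way to favor patterns that agree with more of the user's recent history; other weightings are equally plausible.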
Music recommendation is receiving increasing attention as the music industry develops venues to deliver music over the Internet. The goal of music recommendation is to present users with lists of songs that they are likely to enjoy. Collaborative filtering and content-based recommendation are two widely used approaches that have been proposed for music recommendation. However, both approaches have their own disadvantages: collaborative-filtering methods need a large collection of user history data, and content-based methods lack the ability to understand the interests and preferences of users. To overcome these limitations, this paper presents a novel dynamic music similarity measurement strategy that utilizes both content features and user access patterns. Their seamless integration significantly improves the accuracy and performance of music similarity measurement. Based on this strategy, recommended songs are obtained by means of label propagation over a graph representing music similarity. Experimental results on a real data set collected from http://*** demonstrate the effectiveness of the proposed approach.
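Label propagation over a similarity graph can be sketched as follows. The abstract does not give the update rule, so this uses a common generic form (row-normalized similarities with a damping factor `alpha`); the function names, the default parameters, and the toy similarity matrix are assumptions, not the paper's method.

```python
def propagate_labels(similarity, seeds, alpha=0.8, iterations=50):
    """Spread "liked" labels from seed songs over a similarity graph."""
    n = len(similarity)
    # Row-normalize so each song distributes its score over its neighbors.
    transition = []
    for row in similarity:
        total = sum(row)
        transition.append([w / total if total else 0.0 for w in row])
    initial = [1.0 if i in seeds else 0.0 for i in range(n)]
    scores = initial[:]
    for _ in range(iterations):
        # Mix the neighbors' scores with the original seed labels.
        scores = [
            alpha * sum(transition[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * initial[i]
            for i in range(n)
        ]
    return scores


def recommend(similarity, seeds, top_k=3):
    """Return non-seed song indices ordered by propagated score."""
    scores = propagate_labels(similarity, seeds)
    candidates = [i for i in range(len(scores)) if i not in seeds]
    return sorted(candidates, key=lambda i: -scores[i])[:top_k]


# Toy symmetric similarity matrix (assumed values): song 0 is close to
# song 1, and song 2 is close to song 3.
W = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.1, 0.0],
    [0.1, 0.1, 0.0, 0.9],
    [0.0, 0.0, 0.9, 0.0],
]
print(recommend(W, seeds={0}))
```

In the setting the abstract describes, the entries of `W` would come from the combined content-feature and access-pattern similarity rather than from hand-picked values.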
This paper presents new algorithms - fuzzy c-medoids (FCMdd) and robust fuzzy c-medoids (RFCMdd) - for fuzzy clustering of relational data. The objective functions are based on selecting c representative objects (medoids) from the data set in such a way that the total fuzzy dissimilarity within each cluster is minimized. A comparison of FCMdd with the well-known relational fuzzy c-means algorithm (RFCM) shows that FCMdd is more efficient. We present several applications of these algorithms to Web mining, including Web document clustering, snippet clustering, and Web access log analysis.
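A minimal sketch of the fuzzy c-medoids idea, alternating between fuzzy memberships and medoid selection over a dissimilarity matrix, is given below. The inverse-dissimilarity membership formula, the fuzzifier `m`, and the farthest-first initialization are assumptions based on the usual form of such algorithms; the abstract itself does not specify them, and the robust variant (RFCMdd) is not covered.

```python
def memberships(dissim, medoids, m=2.0):
    """Fuzzy membership of every object in every medoid's cluster."""
    eps = 1e-12  # avoids division by zero for an object that is a medoid
    power = 1.0 / (m - 1.0)
    table = []
    for row in dissim:
        weights = [(1.0 / (row[v] + eps)) ** power for v in medoids]
        total = sum(weights)
        table.append([w / total for w in weights])
    return table


def initial_medoids(dissim, c):
    """Assumed initialization: most central object, then farthest-first."""
    n = len(dissim)
    chosen = [min(range(n), key=lambda k: sum(dissim[k]))]
    while len(chosen) < c:
        chosen.append(
            max(range(n), key=lambda k: min(dissim[k][v] for v in chosen))
        )
    return chosen


def fuzzy_c_medoids(dissim, c, m=2.0, max_iter=100):
    """Alternate membership and medoid updates until medoids stop changing."""
    n = len(dissim)
    medoids = initial_medoids(dissim, c)
    for _ in range(max_iter):
        u = memberships(dissim, medoids, m)
        # Each new medoid minimizes the fuzzy-weighted total dissimilarity.
        updated = [
            min(range(n),
                key=lambda k, i=i: sum((u[j][i] ** m) * dissim[k][j]
                                       for j in range(n)))
            for i in range(c)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids, memberships(dissim, medoids, m)


points = [0, 1, 2, 10, 11, 12]
dissim = [[abs(a - b) for b in points] for a in points]
medoids, u = fuzzy_c_medoids(dissim, c=2)
print(medoids)
```

Because only pairwise dissimilarities are used, the same code applies to relational data such as Web sessions or documents, provided a dissimilarity matrix is available. The sketch does not guard against two clusters selecting the same medoid.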
ISBN: (print) 9783031168024; 9783031168017
To identify robots and humans and analyze their respective access patterns, we used the Internet Archive's (IA) Wayback Machine access logs from 2012 and 2019, as well as ***'s (Portuguese Web Archive) access logs from 2019. We identified user sessions in the access logs and classified those sessions as human or robot based on their browsing behavior. To better understand how users navigate through the web archives, we evaluated these sessions to discover user access patterns. Based on the two archives and between the two years of IA access logs (2012 vs. 2019), we present a comparison of detected robots vs. humans and their user access patterns and temporal preferences. The total number of robots detected in IA 2012 is greater than in IA 2019 (21% more in requests and 18% more in sessions). Robots account for 98% of requests (97% of sessions) in *** (2019). We found that the robots are almost entirely limited to "Dip" and "Skim" access patterns in IA 2012, but exhibit all the patterns and their combinations in IA 2019. Both humans and robots show a preference for web pages archived in the near past.
ISBN: (print) 9781467366151
Websites are the primary medium through which an organization communicates with its customers. Navigational usability and accessibility of the website are crucial to gaining competitive advantage. Understanding how customers use the website can provide insight into their behavior. Web server logs contain latent information about the usage behavior of customers. User sessions are sequences of pages accessed by users during a specific period. The sessions are reconstructed from the web server logs. A simulated annealing technique is used to enhance the process of identifying sessions. Considering the non-deterministic browsing behavior, soft clustering methods are used to assign each session a membership value for each cluster. A modified form of Fuzzy C-Means is used for clustering. The framework involves access log preprocessing, user identification, session identification, and Mountain density function (MDF)-based fuzzy clustering. The obtained clusters represent common navigational behavior among the users.
ISBN: (print) 9781450320764
Although user access patterns on the live web are well understood, there has been no corresponding study of how users, both humans and robots, access web archives. Based on samples from the Internet Archive's public Wayback Machine, we propose a set of basic usage patterns: Dip (a single access), Slide (the same page at different archive times), Dive (different pages at approximately the same archive time), and Skim (lists of what pages are archived, i.e., Time-Maps). Robots are limited almost exclusively to Dips and Skims, but human accesses are more varied across all four types. Robots outnumber humans 10:1 in terms of sessions, 5:4 in terms of raw HTTP accesses, and 4:1 in terms of megabytes transferred. Robots almost always access Time-Maps (95% of accesses), but humans predominantly access the archived web pages themselves (82% of accesses). In terms of unique archived web pages, there is no overall preference for a particular time, but the recent past (within the last year) shows significant repeat accesses.
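The four usage patterns defined in this abstract can be made concrete with a small sketch. The `Request` record, the one-day window used for "approximately the same archive time", and the rule that a lone Time-Map request counts as a Skim rather than a Dip are all assumptions for illustration; the study's actual detection procedure is not described here.

```python
from datetime import datetime, timedelta
from typing import NamedTuple, Optional


class Request(NamedTuple):
    """Hypothetical view of one archive access within a session."""
    url: str                          # original URL of the archived page
    archive_time: Optional[datetime]  # None marks a Time-Map (listing) request


def classify_session(requests, window=timedelta(days=1)):
    """Return the set of usage patterns observed in one session."""
    patterns = set()
    if any(r.archive_time is None for r in requests):
        patterns.add("Skim")
    pages = [r for r in requests if r.archive_time is not None]
    if len(requests) == 1 and pages:
        patterns.add("Dip")
    # Compare consecutive archived-page requests.
    for prev, curr in zip(pages, pages[1:]):
        same_time = abs(curr.archive_time - prev.archive_time) <= window
        if prev.url == curr.url and not same_time:
            patterns.add("Slide")
        elif prev.url != curr.url and same_time:
            patterns.add("Dive")
    return patterns


session = [
    Request("example.com/", datetime(2010, 5, 1)),
    Request("example.com/", datetime(2012, 5, 1)),
    Request("example.com/about", datetime(2012, 5, 1)),
]
print(classify_session(session))
```

Returning a set rather than a single label lets a session carry a combination of patterns, as in the example session, which contains both a Slide and a Dive.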
ISBN: (print) 9780769536729
Time duration and presence of a web page are two factors that reveal web users' interest. The time duration on a web page is characterized as a fuzzy linguistic variable because it is easily understandable for people and the subtle difference between two durations is disregarded. Thus a web access pattern is transformed into a fuzzy web access pattern, which is a fuzzy vector composed of n elements, each a fuzzy linguistic variable or 0. Furthermore, the clusters in web access patterns do not necessarily have crisp boundaries. This paper proposes a modified k-means clustering algorithm based on properties of rough sets to group the resulting fuzzy web access patterns. Finally, an example is provided for clustering the given web access patterns. The results show that the approach is effective.
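The transformation from a web access pattern to a fuzzy web access pattern can be sketched as below. The three linguistic labels, their breakpoints in seconds, and the rule of keeping only the label with the highest membership are assumptions chosen for illustration; the paper's actual linguistic terms and membership functions are not given in the abstract.

```python
def duration_memberships(seconds):
    """Membership of a page-view duration in three assumed fuzzy sets."""
    # "short": full membership up to 10 s, fading out by 60 s.
    if seconds <= 10:
        short = 1.0
    elif seconds < 60:
        short = (60 - seconds) / 50
    else:
        short = 0.0
    # "medium": triangular, peaking at 60 s, zero outside (10 s, 180 s).
    if seconds <= 10 or seconds >= 180:
        medium = 0.0
    elif seconds <= 60:
        medium = (seconds - 10) / 50
    else:
        medium = (180 - seconds) / 120
    # "long": starts at 60 s, full membership from 180 s onward.
    if seconds <= 60:
        long_ = 0.0
    elif seconds < 180:
        long_ = (seconds - 60) / 120
    else:
        long_ = 1.0
    return {"short": short, "medium": medium, "long": long_}


def fuzzify(seconds):
    """Pick the linguistic label with the highest membership."""
    degrees = duration_memberships(seconds)
    return max(degrees, key=degrees.get)


def fuzzy_access_pattern(durations, pages):
    """Build the fuzzy vector: a label per visited page, 0 if not visited."""
    return [fuzzify(durations[p]) if p in durations else 0 for p in pages]


pages = ["home", "catalog", "checkout"]
print(fuzzy_access_pattern({"home": 5, "checkout": 300}, pages))
```

Patterns in this form discard small differences between durations, which is the property the abstract highlights before clustering.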
The significance of the web and the crucial role of web archives in its preservation highlight the necessity of understanding how users, both human and robot, access web archive content, and how best to satisfy the disparate needs of both types of users. To identify robots and humans in web archives and analyze their respective access patterns, we used the Internet Archive's (IA) Wayback Machine access logs from 2012, 2015, and 2019, as well as ***'s (Portuguese Web Archive) access logs from 2019. We identified user sessions in the access logs and classified those sessions as human or robot based on their browsing behavior. To better understand how users navigate through the web archives, we evaluated these sessions to discover user access patterns. Based on the two archives and between the three years of IA access logs (2012 vs. 2015 vs. 2019), we present a comparison of detected robots vs. humans and their user access patterns and temporal preferences. The total number of robots detected in IA 2012 (91% of requests) and IA 2015 (88% of requests) is greater than in IA 2019 (70% of requests). Robots account for 98% of requests in *** (2019). We found that the robots are almost entirely limited to "Dip" and "Skim" access patterns in IA 2012 and 2015, but exhibit all the patterns and their combinations in IA 2019. Both humans and robots show a preference for web pages archived in the near past.