SBLWPR – SIMILARITY BASED LINK WEIGHT FOR PAGERANK CALCULATION
DOI:
https://doi.org/10.47839/ijc.10.3.754
Keywords:
Authority Score, Hub Score, Link structure, PageRank, Similarity, Stemming.
Abstract
A search engine retrieves from its index a list of web pages relevant to a given query and sorts the list by page importance score. The literature offers several ranking algorithms for calculating this score, all of which are based on the link structure of the web. Existing ranking algorithms do not weight links according to the similarity between the linked documents. Because links from similar documents are more important than links from dissimilar ones, a new method is introduced that assigns a weight to each link based on the similarity between the documents it connects. The calculated link weight is added to the existing PageRank value to obtain the final PageRank. The proposed technique is compared with existing ranking algorithms using precision, recall and F-measure.
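The idea summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the similarity measure, the stemming step, or the exact way the link weight is combined with PageRank. The sketch assumes cosine similarity over raw word counts (no stemming), a standard damping factor of 0.85, and a final score equal to plain PageRank plus the mean similarity of a page's incoming links. All function names are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity of bag-of-words counts (assumed measure, no stemming)."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pagerank(links, damping=0.85, iterations=50):
    """Plain PageRank by power iteration.

    links maps each page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links[p] or pages  # dangling page: spread rank evenly
            share = damping * rank[p] / len(targets)
            for q in targets:
                new_rank[q] += share
        rank = new_rank
    return rank


def similarity_weighted_rank(links, texts):
    """ASSUMED combination: PageRank plus mean similarity of incoming links."""
    base = pagerank(links)
    inbound = {p: [] for p in links}
    for src, targets in links.items():
        for dst in targets:
            inbound[dst].append(cosine_similarity(texts[src], texts[dst]))
    return {p: base[p] + (sum(w) / len(w) if w else 0.0)
            for p, w in inbound.items()}


def precision_recall_f(retrieved, relevant):
    """Precision, recall and balanced F-measure for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Under these assumptions, a page linked from a textually similar page scores higher than plain PageRank alone would give it, while links from unrelated pages add nothing. The relative scale of the two terms is a free choice here, since the abstract does not specify one.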
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.