lucene-dev mailing list archives

From David Spencer <dave-lucene-...@tropo.com>
Subject PageRank and Lucene javadoc
Date Tue, 21 Dec 2004 05:32:20 GMT

I've always wondered if it would be useful to try to fit the PageRank 
(heuristic?) into Lucene.

As an experiment I ran PageRank on 2 trees of Javadoc (the Lucene 
javadoc and the JDK 1.4 javadoc) and produced a report that shows the 
PageRank value for every page.
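
For anyone who wants to repeat the experiment, here is a minimal sketch of the standard power-iteration form of PageRank. It is not the code behind the reports; the class name, the adjacency-list input, the 0.85 damping factor and the fixed iteration count are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: textbook PageRank by power iteration.
class PageRankSketch {

    /**
     * links.get(i) lists the indices of the pages that page i links to.
     * Returns one rank per page; the ranks sum to 1.
     */
    static double[] pageRank(List<int[]> links, double damping, int iterations) {
        int n = links.size();
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n); // start from a uniform distribution

        for (int it = 0; it < iterations; it++) {
            double[] next = new double[n];
            double dangling = 0.0; // rank held by pages with no outgoing links

            for (int i = 0; i < n; i++) {
                int[] out = links.get(i);
                if (out.length == 0) {
                    dangling += rank[i];
                } else {
                    // each page splits its rank evenly among its targets
                    double share = rank[i] / out.length;
                    for (int target : out) {
                        next[target] += share;
                    }
                }
            }

            for (int i = 0; i < n; i++) {
                // random jump plus damped link contributions;
                // dangling rank is spread evenly over all pages
                next[i] = (1.0 - damping) / n + damping * (next[i] + dangling / n);
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Tiny made-up graph: 0 -> 1,2   1 -> 2   2 -> 0   3 -> 2
        List<int[]> links = Arrays.asList(
                new int[] {1, 2}, new int[] {2}, new int[] {0}, new int[] {2});
        double[] rank = pageRank(links, 0.85, 50);
        for (int i = 0; i < rank.length; i++) {
            System.out.println("page " + i + ": " + rank[i]);
        }
    }
}
```

For a Javadoc tree, each HTML file would be one node and each href to another file in the tree one edge. Note that the whole graph has to be in hand before the first iteration can run.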

The Lucene javadoc report is here:

http://www.searchmorph.com/static/lucene-report.html

The weblog entry has a bit more detail and links to the much larger 
JDK 1.4 report:

http://searchmorph.com/weblog/index.php?id=29

My feeling is that for machine-generated pages, PageRank doesn't 
help that much.

Also, it's not clear how to use it: should it become the Document 
boost, or go into a separate field for use by a custom scoring 
function? I think the Google scoring function is a secret.
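
To make the boost option concrete, here is a rough sketch of one way a PageRank value could be turned into a boost factor. The log scaling and the names are assumptions for illustration only; nothing here is an established formula, and certainly not Google's. The result would be passed to Document.setBoost(float) at indexing time.

```java
// Hypothetical sketch: map a PageRank value to a document boost.
class PageRankBoostSketch {

    /**
     * rank is the page's PageRank (all ranks summing to 1) and pageCount
     * the number of pages in the graph. A page with exactly average rank
     * gets a boost of 1.0; the log keeps a few hub pages from swamping
     * the text relevance score.
     */
    static float boostFor(double rank, int pageCount) {
        double relative = rank * pageCount;          // 1.0 means "average page"
        double log2 = Math.log1p(relative) / Math.log(2.0);
        return (float) log2;                         // log2(1 + relative)
    }

    public static void main(String[] args) {
        int pageCount = 4;
        double[] ranks = {0.40, 0.25, 0.10};
        for (double rank : ranks) {
            // With Lucene on the classpath this value would go to
            // doc.setBoost(boost) before the document is added to the index.
            System.out.println(rank + " -> boost " + boostFor(rank, pageCount));
        }
    }
}
```

The separate-field alternative would store the raw value instead and leave the combination with the text score to custom scoring code at query time, which avoids reindexing a document just because its rank changed.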

And I'm pretty sure it can't easily be used with incremental index 
additions, as it needs the entire link graph: adding one page can 
change the rank of pages that are already indexed.

Hope this isn't too far off topic, sorry if so, but thought it was 
relevant enough to mention...

- Dave

