lucene-dev mailing list archives

From Doug Cutting
Subject Re: PageRank and Lucene javadoc
Date Tue, 21 Dec 2004 19:27:39 GMT
David Spencer wrote:
> And my feeling is that in the context of machine-generated pages, Page 
> Rank doesn't help that much.

It's better than random.  It correctly identified overview-summary as 
the best "home page" for the collection in both cases.  It also 
identified some core classes (IndexReader in Lucene, Object & String in 

> Also, it's not clear how to use it e.g. make it the Document boost or 
> put it into a separate field for use by a custom scoring function?

I think using the Document boost makes good sense.
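As a rough illustration of the boost idea, here is a minimal sketch that turns a precomputed PageRank score into a boost factor. The class name, the log damping, and the clamp range are all assumptions made for this example, not something taken from Lucene or from this thread. In Lucene releases of this era the result would presumably be handed to Document.setBoost(float) before the document is added to the index; that call is left out so the sketch compiles without Lucene on the classpath.

```java
/** Hypothetical helper: converts a raw PageRank score into a document boost. */
final class PageRankBoost {
    // Clamp range is an assumption; it keeps link popularity from
    // completely overriding (or erasing) the text-match score.
    static final double MIN_BOOST = 0.25;
    static final double MAX_BOOST = 4.0;

    private PageRankBoost() {}

    /**
     * Maps a page's rank to a multiplicative boost.
     * A page whose rank equals the collection mean gets a neutral 1.0.
     * The log2 damping (an assumption of this sketch) flattens the heavy
     * tail so a handful of hub pages do not dominate every query.
     */
    static float toBoost(double rank, double meanRank) {
        if (rank < 0 || meanRank <= 0) {
            throw new IllegalArgumentException("rank must be >= 0 and meanRank > 0");
        }
        double ratio = rank / meanRank;
        double raw = Math.log(1.0 + ratio) / Math.log(2.0); // log2(1 + ratio)
        return (float) Math.max(MIN_BOOST, Math.min(MAX_BOOST, raw));
    }

    public static void main(String[] args) {
        double meanRank = 0.001; // made-up collection mean
        double[] ranks = {0.0001, 0.001, 0.01};
        for (double r : ranks) {
            // With Lucene available, this value could be passed to
            // Document.setBoost(float) before indexing the document.
            System.out.println(r + " -> boost " + toBoost(r, meanRank));
        }
    }
}
```

Using the boost rather than a separate field means the standard scoring picks the value up with no custom code, at the cost of having to reindex a document when its rank changes.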

> And...I'm pretty sure it can't easily be used w/ incremental index 
> additions as it wants an entire link graph.

A standard way to deal with this is to guess a score for each new page.  A 
new page should probably score somewhat lower than the page that linked 
to it, and also a bit lower than other, previously known pages at the 
same "site".
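The guessing heuristic could be sketched as below. This is only one reading of it: the class name, the two discount factors, and the choice to compare against the site's mean score (rather than, say, its minimum) are assumptions made for the example. Taking the smaller of the two discounted values keeps the guess below both the linking page and the typical known page on the same site.

```java
/** Hypothetical helper: guesses a link score for a page not yet in the link graph. */
final class NewPageScoreGuess {
    // Both discounts are assumed values, chosen only for illustration.
    static final double LINKER_DISCOUNT = 0.5; // "somewhat less" than the linking page
    static final double SITE_DISCOUNT = 0.9;   // "a bit less" than known same-site pages

    private NewPageScoreGuess() {}

    /**
     * @param linkerScore     score of the already-known page that links to the new page
     * @param knownSiteScores scores of previously known pages on the same site (may be empty)
     * @return a provisional score to use until the full link graph is recomputed
     */
    static double guess(double linkerScore, double... knownSiteScores) {
        double fromLinker = LINKER_DISCOUNT * linkerScore;
        if (knownSiteScores.length == 0) {
            return fromLinker; // nothing known about the site yet
        }
        double sum = 0.0;
        for (double s : knownSiteScores) {
            sum += s;
        }
        double fromSite = SITE_DISCOUNT * (sum / knownSiteScores.length);
        // Respect both constraints by taking the lower estimate.
        return Math.min(fromLinker, fromSite);
    }
}
```

A provisional value like this lets incremental additions be indexed right away; the guesses can then be replaced the next time scores are recomputed over the whole graph.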

