lucene-java-user mailing list archives

From Mark Miller
Subject Re: Lucene Internals question
Date Mon, 22 Jan 2007 20:00:29 GMT
Well, first Lucene checks all of the other documents in the world for any
that refer to the document that you're adding to Lucene...and
then...oh wait...

> Hmm... doesn't Lucene scoring determine how relevant a document is to your
> query? That is what PageRank and HITS do as well, I believe. A page and a
> document are the same thing: if you want to index a page, you'll obviously
> try to convert it into a document. PageRank does link analysis to determine
> how relevant that page is to the query you entered. Does Lucene have
> something similar? How does Lucene decide which of two documents should
> score higher if they both contain a certain term? Google uses PageRank to
> make that determination; how does Lucene do it?
> On 1/22/07, Nicolas Lalevée wrote:
>> On Monday 22 January 2007 19:33, EDMOND KEMOKAI wrote:
>> > Hi All
>> > This is a question for those familiar with Lucene document scoring. How
>> > does it compare with Google's PageRank or HITS, or are they very
>> > different? I have been looking at the PageRank algorithm, but I'll need
>> > to brush up on my math skills before delving into it :)
>> In fact, Lucene is just a search engine. You can then use that search
>> engine to search web pages, as Nutch does with Lucene. Google is more like
>> Nutch: a web crawler plus a web search engine. So when you are talking
>> about page ranking, it has nothing to do with Lucene scoring. Lucene
>> scoring is about how well a result entry matches your query. Page ranking
>> is more about how relevant the web page itself is. So for a given
>> document, the Lucene score depends on the query, while the page rank is
>> quite absolute: it does not depend on any query.
>> Nicolas
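
To make the distinction above concrete, here is a minimal sketch in Python.
Everything in it is an assumption made up for illustration: the toy corpus,
the link graph, and the function names tfidf_score and pagerank. The
TF-IDF-style formula is a simplification and is not Lucene's actual scoring
code, and the PageRank loop is a plain power iteration, not Google's
implementation.

```python
import math

# Hypothetical toy corpus: document id -> text.
DOCS = {
    "a": "lucene scoring depends on the query",
    "b": "pagerank ranks pages by their incoming links",
    "c": "lucene is a search library and nutch uses lucene",
}

# Hypothetical link graph: page -> pages it links to.
# Every page has at least one outgoing link, so this sketch does not
# need any special handling for dangling pages.
LINKS = {"a": ["c"], "b": ["c"], "c": ["a"]}


def tfidf_score(query, doc_id, docs=DOCS):
    """Query-dependent score: sum of sqrt(tf) * idf over the query terms."""
    tokens = docs[doc_id].split()
    score = 0.0
    for term in query.split():
        tf = tokens.count(term)  # occurrences of the term in this document
        df = sum(1 for text in docs.values() if term in text.split())
        if tf and df:
            score += math.sqrt(tf) * (1.0 + math.log(len(docs) / df))
    return score


def pagerank(links=LINKS, damping=0.85, iterations=50):
    """Query-independent score computed from the link graph alone."""
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page gets a small base share, then receives an equal
        # portion of the rank of each page that links to it.
        new_rank = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outgoing in links.items():
            for target in outgoing:
                new_rank[target] += damping * rank[page] / len(outgoing)
        rank = new_rank
    return rank
```

With this sketch, tfidf_score("lucene", "c") and tfidf_score("pagerank", "c")
give different values for the same document because the query changed, while
pagerank()["c"] is the same no matter what anyone searches for.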