lucene-java-user mailing list archives

From Mark Miller <markrmil...@gmail.com>
Subject Re: Lucene Internals question
Date Mon, 22 Jan 2007 20:00:29 GMT
Well, first Lucene checks all of the other documents in the world for any
that refer to the document that you're adding to Lucene...and
then...oh wait...

http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html
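
For a rough feel of what that page describes, here is a minimal sketch of a
TF-IDF style score for a single-term query. It is modelled on the formulas
documented for the default Similarity as I understand them (tf = sqrt(freq),
idf = 1 + ln(numDocs / (docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)). The
class TfIdfSketch and its method signatures are made up for illustration and
are not Lucene's API; boosts, coord and queryNorm are left out on purpose.

```java
// Hypothetical illustration only: NOT Lucene's actual code or API.
final class TfIdfSketch {

    // Term frequency factor: more occurrences help, but sub-linearly.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: terms found in few documents weigh more.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Length normalization: a match in a short field counts for more.
    static double lengthNorm(int numTermsInField) {
        return 1.0 / Math.sqrt(numTermsInField);
    }

    // Single-term score; boosts, coord and queryNorm are ignored (assumption).
    static double score(int freq, int docFreq, int numDocs, int numTermsInField) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf * lengthNorm(numTermsInField);
    }

    public static void main(String[] args) {
        int numDocs = 1000; // documents in the index
        int docFreq = 9;    // documents containing the term
        // Both documents contain the term; A mentions it 4 times, B once.
        double a = score(4, docFreq, numDocs, 100);
        double b = score(1, docFreq, numDocs, 100);
        System.out.println("doc A scores higher than doc B: " + (a > b));
    }
}
```

So between two documents that both contain the term, the one where the term
occurs more often, or occurs in a shorter field, tends to come out ahead. No
link analysis is involved anywhere: everything is computed from the query
terms and the index statistics.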

EDMOND KEMOKAI wrote:
> Hmm..doesn't Lucene scoring determine how relevant a document is to your
> query? That is what PageRank and HITS do as well, I believe. Page and
> document are the same: if you want to index a page you'll obviously try to
> convert it into a document. PageRank does link analysis to determine how
> relevant that page is as it relates to the query you entered; does Lucene
> have something similar? How does Lucene decide between two documents
> which one should score higher if they both contain a certain term? Google
> uses PageRank to make that determination; how does Lucene do it?
>
> On 1/22/07, Nicolas Lalevée <nicolas.lalevee@anyware-tech.com> wrote:
>>
>> On Monday 22 January 2007 19:33, EDMOND KEMOKAI wrote:
>> > Hi All
>> > This is a question for those familiar with Lucene document scoring. How
>> > does it compare with Google's PageRank or HITS, or are they very different?
>> > I have been looking at the PageRank algorithm but I'll need to brush up
>> > on my math skills before delving into it:)
>>
>> In fact Lucene is just a search engine. You can then use the search engine
>> to search web pages, as Nutch does with Lucene. Google is more like
>> Nutch: a web crawler plus a web search engine. So when you are talking
>> about page ranking, it has nothing to do with Lucene scoring. Lucene
>> scoring is about how well a result entry matches your query. Page ranking
>> is more about how relevant the web page itself is. So for a document, the
>> Lucene score depends on the query, while the page rank is more or less
>> absolute.
>>
>> Nicolas
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>
>
