lucene-java-user mailing list archives

From Anshum <ansh...@gmail.com>
Subject Re: Lucene Indexing and Search Policy
Date Wed, 21 Jan 2009 11:45:53 GMT
It's about building a custom Similarity class that scores using your own
normalization factors and so on.
This might help in that case:
http://www.gossamer-threads.com/lists/lucene/java-user/69553
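To make the idea concrete, here is a rough, self-contained sketch of how a score could combine term frequency with a per-document boost such as a reference (inbound link) count. It is not Lucene code: the factor formulas are only loosely modeled on the defaults documented for Lucene's DefaultSimilarity, and the class name BoostedScoreSketch, the linkBoost() helper, and all of the numbers are assumptions made up for illustration. In Lucene itself, a factor like this would more likely enter through an index-time document boost or a custom Similarity.

```java
/**
 * Illustrative sketch only, not Lucene code.
 * Shows how a per-document boost can outweigh raw term frequency.
 */
public class BoostedScoreSketch {

    // Term-frequency factor: the square root dampens repeated occurrences.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: terms found in fewer documents weigh more.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Length normalization: a match in a longer field counts for less.
    static double lengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Hypothetical boost derived from an inbound-link count (an assumption,
    // chosen so that zero links gives a neutral boost of 1.0).
    static double linkBoost(int inboundLinks) {
        return 1.0 + Math.log1p(inboundLinks);
    }

    // Single-term score: tf * idf^2 * boost * lengthNorm.
    static double score(int freq, int docFreq, int numDocs, int numTerms, double boost) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf * boost * lengthNorm(numTerms);
    }

    public static void main(String[] args) {
        int numDocs = 10000; // made-up collection size
        int docFreq = 100;   // made-up number of documents containing the term
        int numTerms = 500;  // both pages assumed to have the same length

        // A keyword-stuffed page (50 hits, 2 links) versus an official page
        // (10 hits, 100000 links); all figures are invented.
        double stuffedPlain = score(50, docFreq, numDocs, numTerms, 1.0);
        double officialPlain = score(10, docFreq, numDocs, numTerms, 1.0);
        double stuffedBoosted = score(50, docFreq, numDocs, numTerms, linkBoost(2));
        double officialBoosted = score(10, docFreq, numDocs, numTerms, linkBoost(100000));

        System.out.println("frequency only: stuffed=" + stuffedPlain + " official=" + officialPlain);
        System.out.println("with boost:     stuffed=" + stuffedBoosted + " official=" + officialBoosted);
    }
}
```

With frequency alone the keyword-stuffed page scores higher. Once the link-derived boost is multiplied in, the official page moves ahead, which is the behaviour Ram is asking for.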
--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Wed, Jan 21, 2009 at 5:11 PM, M Seetha Ramaiah <msram@iitk.ac.in> wrote:

> Hi Anshum,
>
> Even that document says that a higher frequency implies a higher score. My
> doubt is: if the score is based only on frequency, won't it be
> inappropriate for Internet-based search? For example, if Google did the same
> thing, then when I search for "Microsoft", Google might rank some unrelated
> page first if the word "Microsoft" occurs more often on that page than on
> www.microsoft.com.
>
> That being the case, how can we extend it to be appropriate for such
> searches (whose result score should not depend merely on the frequency)?
>
> - Ram
>
>
> Anshum wrote:
>
>> Hi msr,
>>
>> Perhaps this could be useful for you. In short, Lucene implements a
>> modified vector space model.
>>
>> http://jayant7k.blogspot.com/2006/07/document-scoringcalculating-relevance_08.html
>>
>> On Wed, Jan 21, 2009 at 4:45 PM, MSR <msram@iitk.ac.in> wrote:
>>
>>
>>
>>> Hi,
>>>
>>> Does Lucene take into consideration anything other than the frequency of
>>> the query words in a document? If it does, what are the other
>>> considerations?
>>> If it is purely based on word frequency, is it appropriate for
>>> Internet-based search (where we also need to consider reference counts)?
>>>
>>> Thank you,
>>> Ram
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
