lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From yahootintin.11533...@bloglines.com
Subject Re: Strategy for making short documents not bubble to the top?
Date Wed, 29 Jun 2005 22:39:06 GMT
Hi Jian,

Thanks for the reply.  The problem with that is it completely
ignores document length.  A book that mentions "frog" 5 times in its 2,000
pages should be less relevant than a book that mentions "frog" 4 times in
its 4 pages.

I really want to lower the document length weight instead
of removing it completely.  Any ideas how to do that?

Thanks.

--- java-user@lucene.apache.org
wrote:
Hi,
> 
> I would use pure span or cover density based ranking algorithm
which
> do not take document length into consideration. (tweaking whatever

> currently in the standard Lucene distribution?)
> 
> For example, searching
for the keywords "beautiful house", span/cover
> ranking will treat a long
document and a short document the same
> ranking as long as they have the
same number of spans/covers (for
> example, "beautiful xxxxxx house" is one
cover), and with each
> span/cover, the editing distance between the keywords
is the same.
> 
> Just my 2 cents, 
> 
> Cheers,
> 
> Jian
> 
> On
29 Jun 2005 20:30:49 -0000, yahootintin.11533894@bloglines.com
> <yahootintin.11533894@bloglines.com>
wrote:
> > Hi,
> > 
> > Short documents bubble to the top of the results
because the field
> > length is short.  Does anyone have a good strategy
for working around this?
> >  Will doing something like log(document length)
flatten out my results while
> > still making them meaningful?  I'm going
to try some different approaches
> > but any advice is appreciated.
> >

> > Thanks.
> > 
> > ---------------------------------------------------------------------

> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >
For additional commands, e-mail: java-user-help@lucene.apache.org
> > 
>
>
> 
> ---------------------------------------------------------------------

> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For
additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message