lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From John Moylan <jo...@rte.ie>
Subject Re: Prevent Lucene from returning short length text...
Date Mon, 04 Oct 2004 12:50:37 GMT
...or just set a lower boost on fileds with less than $x amount of 
characters while indexing.

John

Otis Gospodnetic wrote:
> Kevin,
> 
> You could try setting index-time field length-dependent boosts.
> 
> Another possibility may be your own sorting, that takes field length in
> consideration, but I'm not sure how well that would work.
> 
> Finally, you could use your own Similarity and implement your own
> http://jakarta.apache.org/lucene/docs/api/org/apache/lucene/search/Similarity.html#lengthNorm(java.lang.String,%20int)
> 
> Otis
> 
> 
> --- "Kevin A. Burton" <burton@newsmonster.org> wrote:
> 
> 
>>I've noticed that Lucene does a very bad job at doing search ranking 
>>when text has just a few words in the body.
>>
>>For example if you searched for the word "World" in the following two
>>
>>paragraphs:
>>
>>"Hello World"
>>
>>and
>>
>>"The World is often a dangerous place"
>>
>>The first paragraph wuold probably match.
>>
>>Is there a way I can tweak lucene to return richer content?
>>
>>Kevin
>>
>>-- 
>>
>>Use Rojo (RSS/Atom aggregator).  Visit http://rojo.com. Ask me for an
>>
>>invite!  Also see irc.freenode.net #rojo if you want to chat.
>>
>>Rojo is Hiring! - http://www.rojonetworks.com/JobsAtRojo.html
>>
>>If you're interested in RSS, Weblogs, Social Networking, etc... then
>>you 
>>should work for Rojo!  If you recommend someone and we hire them
>>you'll 
>>get a free iPod!
>>    
>>Kevin A. Burton, Location - San Francisco, CA
>>       AIM/YIM - sfburtonator,  Web - http://peerfear.org/
>>GPG fingerprint: 5FB2 F3E2 760E 70A8 6174 D393 E84D 8D04 99F1 4412
>>
>>
>>---------------------------------------------------------------------
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 

******************************************************************************
The information in this e-mail is confidential and may be legally privileged.
It is intended solely for the addressee. Access to this e-mail by anyone else
is unauthorised. If you are not the intended recipient, any disclosure,
copying, distribution, or any action taken or omitted to be taken in reliance
on it, is prohibited and may be unlawful.
Please note that emails to, from and within RTÉ may be subject to the Freedom
of Information Act 1997 and may be liable to disclosure.
******************************************************************************

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Mime
View raw message