lucene-java-user mailing list archives

From bhecht <bhe...@ams-sys.com>
Subject Re: stop words, synonyms... what's in it for me?
Date Mon, 21 May 2007 20:53:35 GMT

Thanks Daniel,

But when searching, I will run my "standardization" tools on the query again
before sending it to Lucene, so what you mentioned will not be a problem.
If someone searches for mainstrasse, my tools will split it into main
and strasse again, and Lucene will then be able to find it.


Daniel Naber-5 wrote:
> 
> On Monday 21 May 2007 22:05, bhecht wrote:
> 
>> Is there any point for me to start creating custom analyzers with filters
>> for stop words and synonyms, and implementing my own "sub string" filter
>> for separating tokens into "sub words" (like "mainstrasse" => "main",
>> "strasse")?
> 
> Yes: I assume your document should be found both with "strasse" and with
> "mainstrasse". You will then need to put main, strasse, and mainstrasse at
> the same position (setPositionIncrement(0)). If you don't do that, phrase
> queries will no longer work as expected. Thus you need an analyzer;
> modifying the strings before they are put into Lucene is not enough.
> 
> Regards
>  Daniel
> 
> -- 
> http://www.danielnaber.de
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
> 
> 
> 
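
A minimal sketch of the same-position idea Daniel describes above, written in
plain Java rather than against the Lucene API. The Tok class, the PARTS
dictionary and both method names are made up for illustration; in a real
analyzer the equivalent would presumably be a token filter that calls
setPositionIncrement(0) on the extra tokens.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative model only: none of these types are Lucene classes.
final class CompoundSketch {

    // A term together with its position increment relative to the previous token.
    static final class Tok {
        final String term;
        final int posInc;

        Tok(String term, int posInc) {
            this.term = term;
            this.posInc = posInc;
        }
    }

    // Hypothetical decomposition dictionary (an assumption): compound -> sub-words.
    static final Map<String, List<String>> PARTS = new HashMap<>();
    static {
        PARTS.put("mainstrasse", Arrays.asList("main", "strasse"));
    }

    // Emit each word with increment 1, then its sub-words with increment 0,
    // so the sub-words share the compound's position.
    static List<Tok> analyze(String text) {
        List<Tok> out = new ArrayList<>();
        for (String word : text.toLowerCase(Locale.ROOT).trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            out.add(new Tok(word, 1));
            List<String> parts = PARTS.get(word);
            if (parts != null) {
                for (String part : parts) {
                    out.add(new Tok(part, 0));
                }
            }
        }
        return out;
    }

    // Absolute position of each term: running sum of increments, starting at -1
    // so that the first token lands on position 0. For repeated terms the last
    // occurrence wins, which is enough for this small illustration.
    static Map<String, Integer> positions(List<Tok> tokens) {
        Map<String, Integer> result = new LinkedHashMap<>();
        int pos = -1;
        for (Tok t : tokens) {
            pos += t.posInc;
            result.put(t.term, pos);
        }
        return result;
    }
}
```

For the input "Mainstrasse 5", positions(analyze(...)) puts mainstrasse, main
and strasse at position 0 and 5 at position 1, so a phrase such as
"strasse 5" still sees two adjacent positions.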

-- 
View this message in context: http://www.nabble.com/stop-words%2C-synonyms...-what%27s-in-it-for-me--tf3792510.html#a10726812
Sent from the Lucene - Java Users mailing list archive at Nabble.com.