lucene-java-user mailing list archives

From Tatu Saloranta <t...@hypermall.net>
Subject Re: QueryParser and compound words
Date Wed, 12 Mar 2003 06:10:28 GMT
On Tuesday 11 March 2003 03:05, Magnus Johansson wrote:
> Hello
>
> I have written an Analyzer for Swedish. Compound words are common in
> Swedish, therefore my Analyzer tries to split compound words
> into their parts. For example the Swedish word fotbollsmatch (football
> game) is split into fotboll and match.

(same applies to many other languages so this is a common problem I think).

However... I'm not sure why you consider this a problem? The quotes are
added because a single token (as parsed by QueryParser) expands to
multiple terms, so it becomes a PhraseQuery. The same happens (or should
happen) during indexing, so the end result should match the word both in
the "normal" case (the word is correctly spelled as a compound) and when
the word is (incorrectly) spelled with spaces.
As for the quotes: they are only shown when the query is converted to a
String; internally there are no quotes to be matched.
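
To make that concrete, here is a rough, self-contained sketch in plain Java.
It does not use Lucene at all; the class CompoundPhraseSketch, its one-entry
dictionary and the methods analyze, describeQuery and matches are all
hypothetical stand-ins for the Analyzer, QueryParser and PhraseQuery
behaviour described above, assuming the same analysis runs at index time
and at query time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

public class CompoundPhraseSketch {
    // Hypothetical decompounding dictionary; a real analyzer would use a
    // proper lexicon or splitting rules.
    static final Map<String, List<String>> COMPOUNDS =
        Map.of("fotbollsmatch", List.of("fotboll", "match"));

    // Stand-in for the analyzer: lowercase, split on whitespace, then
    // replace each known compound by its parts (in order).
    static List<String> analyze(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.toLowerCase().trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            terms.addAll(COMPOUNDS.getOrDefault(token, List.of(token)));
        }
        return terms;
    }

    // Stand-in for how one query token is rendered as a string: a single
    // term is shown bare, several terms are shown as a quoted phrase.
    // The quotes exist only in this string form.
    static String describeQuery(String queryToken) {
        List<String> terms = analyze(queryToken);
        return terms.size() == 1
            ? terms.get(0)
            : "\"" + String.join(" ", terms) + "\"";
    }

    // Stand-in for phrase matching: the analyzed query terms must occur
    // consecutively, in order, among the analyzed document terms.
    static boolean matches(String queryToken, String documentText) {
        return Collections.indexOfSubList(
            analyze(documentText), analyze(queryToken)) >= 0;
    }

    public static void main(String[] args) {
        // prints "fotboll match" (with the quotes)
        System.out.println(describeQuery("fotbollsmatch"));
        // prints true: document uses the compound spelling
        System.out.println(matches("fotbollsmatch", "en fotbollsmatch i dag"));
        // prints true: document spells the word with a space
        System.out.println(matches("fotbollsmatch", "en fotboll match i dag"));
        // prints false: parts present but not adjacent and in order
        System.out.println(matches("fotbollsmatch", "match och fotboll"));
    }
}
```

The comparison in matches works on term lists only, so no quote character
ever takes part in matching; the quotes appear solely in describeQuery.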

-+ Tatu +-

