lucene-java-user mailing list archives

From Iain Young <Iain.Yo...@microfocus.com>
Subject RE: Disabling modifiers?
Date Tue, 16 Dec 2003 11:46:43 GMT
I think it is a problem with the indexing. I've found another example...

WS-CA-PP00-PROCESS-YYMM

I've looked at the index, and it has been tokenized into 3 words...

WS
CA-PP00-PROCESS
YYMM

It looks as though I might have to use a custom tokenizer as well as a
custom analyzer. Does anyone have any idea why the standard tokenizer would
split the variable up like this, i.e. why it split a word off either end
but left the middle part in one piece? The only thing I can think of is that
several other variables in the source begin with WS- or end with -YYMM, so
could the tokenizer have seen this and be doing something clever with them?
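
To make the custom tokenizer idea concrete, here is a minimal sketch in plain
Java of the rule I have in mind: a token is any run of letters, digits, and
hyphens, so a hyphenated variable name stays in one piece. It does not use
Lucene at all. The class name HyphenAwareTokenizer, the method names, and the
sample statement (including the variable WS-DATE) are made up for
illustration. In Lucene, the same character test could presumably go into a
subclass of CharTokenizer by overriding isTokenChar, but that wiring is an
assumption and is left out here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: splits text into tokens made of
// letters, digits, and hyphens, so hyphenated names are kept whole.
class HyphenAwareTokenizer {

    // A character belongs to a token if it is a letter, a digit, or a hyphen.
    static boolean isTokenChar(char c) {
        return Character.isLetterOrDigit(c) || c == '-';
    }

    // Collect maximal runs of token characters; everything else is a separator.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isTokenChar(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        // Flush the last token if the text does not end with a separator.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Made-up sample statement; only the first variable is from my source.
        System.out.println(tokenize("MOVE WS-CA-PP00-PROCESS-YYMM TO WS-DATE."));
    }
}
```

One caveat with this sketch: a hyphen standing on its own (for example a
minus sign with spaces around it) would also come out as a token, so a real
tokenizer would probably need to filter those.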

Thanks,
Iain



