lucene-dev mailing list archives

From Lukas Zapletal <l...@root.cz>
Subject Re: Escaping bug \( and ? or *
Date Sat, 08 Feb 2003 09:14:07 GMT
> Tatu Saloranta wrote:
>
>> I think the problem is that the analyzer you used for indexing strips
>> out parentheses. So the text actually indexed would look something like
>> "test 1 test 2" (assuming 'and' is a stop word and is removed). Thus there's
>> no token matching the term "(1)" or "(2)".
>> The same goes for most other punctuation characters: they are routinely
>> stripped by the analyzer, as they are usually not very useful for searching.
>>
>> To make it work the way you want, you need to modify the analyzer to
>> include parentheses, perhaps so that they are kept only if
>> they contain just a single alphanumeric token (otherwise
>> "(1 and 2)" would be tokenized to "(1" and "2)", which is probably
>> not what you want).
>
Well, this doesn't work. Check Bugzilla for the example: ESCAPING BUG
\(abc\) and \(a*c\) in v1.2

Can anyone help me with it?
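
For reference, the index-time behaviour described in the quoted reply can be
sketched in plain Java. This is only an illustration, not Lucene's analyzer
code: the class name AnalyzerSketch, the tokenize method and the stop-word
list are all hypothetical, and the sketch simply assumes an analyzer that
lowercases the text, splits on anything that is not a letter or digit, and
drops stop words.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class AnalyzerSketch {
    // Hypothetical stop-word list; a real analyzer's list will differ.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("and", "or", "the"));

    // Assumed behaviour: lowercase, split on runs of non-alphanumeric
    // characters (so parentheses vanish), then drop stop words.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.toLowerCase().split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty() || STOP_WORDS.contains(raw)) {
                continue; // skip empty fragments and stop words
            }
            tokens.add(raw);
        }
        return tokens;
    }

    public static void main(String[] args) {
        List<String> indexed = tokenize("test (1) and test (2)");
        System.out.println(indexed);                 // [test, 1, test, 2]
        System.out.println(indexed.contains("(1)")); // false
    }
}
```

Under these assumptions no indexed token keeps its parentheses, so a term
that still contains them has nothing to match against in the index.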

-- 
Lukas Zapletal      [lzap@root.cz]
http://www.tanecni-olomouc.cz/lzap




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org


Mime
View raw message