lucene-dev mailing list archives

From Robert Muir <rcm...@gmail.com>
Subject Re: inconsistency/performance trap of empty terms
Date Sat, 30 Oct 2010 17:16:04 GMT
On Sat, Oct 30, 2010 at 12:51 PM, DM Smith <dmsmith555@gmail.com> wrote:
>
> On Oct 30, 2010, at 12:00 PM, Robert Muir wrote:
>
>> On Sat, Oct 30, 2010 at 11:54 AM, Yonik Seeley
>> <yonik@lucidimagination.com> wrote:
>>> If it's only for the QP, a simple method that one could override would suffice:
>>> QueryParser.getTokenStream(String field, String value)
>>>
>>> If it's not just for the QP, then we have Analyzer (as you've pointed out).
>>>
>>>
>>
>> right, but if we did this, it makes some things tricky (e.g. the user
>> has to manage reset(Reader)/reset() tokenStream reuse).
>> A tokenizer/tokenfilter they are using could be "heavy" in terms of
>> initialization cost.
>
> Maybe I'm missing something here. Can't there be an empty analyzer that takes a TokenStream
> as an argument to its constructor and wraps it with all the reuse goodness?

you also have to wrap the string in a reader, call
Tokenizer.reset(Reader), and call reset() on the entire chain [unless,
with a method like this, the QP itself would be responsible for that;
it's not clear]

so that means the client app has to separately track the Tokenizer
(which needs its reader reset) and the TokenStream chain (at least to
return it, maybe to reset it).

just saying, if you do this, you didn't get rid of Analyzer, you just
made everyone write their own in their QueryParser subclass.
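
to make the bookkeeping concrete, here is a rough sketch of what such a
subclass ends up carrying. The classes below (SketchTokenizer,
SketchLowerCaseFilter, SketchQueryParser) are hypothetical stand-ins written
for illustration, not Lucene classes; only the reset(Reader)/reset() split
mirrors the calls discussed above, and whether the QP or the subclass would
perform the resets is an assumption.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for a Tokenizer: splits its input on whitespace.
class SketchTokenizer {
    private Reader input;

    // Plays the role of Tokenizer.reset(Reader): reuse this instance on new input.
    void reset(Reader newInput) {
        this.input = newInput;
    }

    // Returns the next whitespace-delimited token, or null at end of input.
    String next() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (sb.length() > 0) {
                    return sb.toString();
                }
            } else {
                sb.append((char) c);
            }
        }
        return sb.length() > 0 ? sb.toString() : null;
    }
}

// Hypothetical stand-in for a TokenFilter with per-stream state.
class SketchLowerCaseFilter {
    private final SketchTokenizer source;
    private int emitted; // state that reset() has to clear between uses

    SketchLowerCaseFilter(SketchTokenizer source) {
        this.source = source;
    }

    // Plays the role of reset() on the chain.
    void reset() {
        emitted = 0;
    }

    String next() throws IOException {
        String token = source.next();
        if (token == null) {
            return null;
        }
        emitted++;
        return token.toLowerCase(Locale.ROOT);
    }

    int emitted() {
        return emitted;
    }
}

// Hypothetical QueryParser-like class: it must hold BOTH ends of the chain.
class SketchQueryParser {
    private final SketchTokenizer tokenizer = new SketchTokenizer();
    private final SketchLowerCaseFilter chain = new SketchLowerCaseFilter(tokenizer);

    // The overridable hook from the thread, here doing the reuse bookkeeping itself.
    SketchLowerCaseFilter getTokenStream(String field, String value) {
        tokenizer.reset(new StringReader(value)); // wrap the string, reset the tokenizer
        chain.reset();                            // reset the whole chain
        return chain;                             // same instance every call
    }

    // Drains the stream into a list; assumes one consumer at a time.
    List<String> analyze(String field, String value) throws IOException {
        SketchLowerCaseFilter ts = getTokenStream(field, value);
        List<String> out = new ArrayList<>();
        for (String t = ts.next(); t != null; t = ts.next()) {
            out.add(t);
        }
        return out;
    }
}
```

the two fields in SketchQueryParser are exactly the state an Analyzer would
otherwise hold on the client's behalf.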
