lucene-dev mailing list archives

From: Uwe Schindler
Subject: RE: inconsistency/performance trap of empty terms
Date: Fri, 29 Oct 2010 17:49:15 GMT
I am for the TokenFilter approach. maxFieldLength is still due to be
deprecated in favour of the TokenFilter.

TF is very easy, just loop over incrementToken() until it returns false or a
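
A minimal sketch of that loop, assuming the cut-off sentence above refers to
stopping once a maximum token count is reached. To keep it self-contained it
does not use Lucene at all: SketchTokenStream, ListTokenStream,
LimitTokenCountSketch and LimitDemo are hypothetical stand-ins that only mirror
the "call incrementToken() until it returns false" contract, not Lucene's
actual TokenStream/TokenFilter classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream. This is NOT Lucene's TokenStream;
// it only models the incrementToken() contract discussed in this thread.
interface SketchTokenStream {
    boolean incrementToken(); // advance to the next token; false when exhausted
    String term();            // text of the current token
}

// Feeds tokens from a fixed list (illustration only).
class ListTokenStream implements SketchTokenStream {
    private final Iterator<String> it;
    private String current;

    ListTokenStream(List<String> tokens) {
        this.it = tokens.iterator();
    }

    public boolean incrementToken() {
        if (!it.hasNext()) {
            return false;
        }
        current = it.next();
        return true;
    }

    public String term() {
        return current;
    }
}

// Wraps another stream and reports end-of-stream after maxTokens tokens,
// or earlier if the wrapped stream runs out first.
class LimitTokenCountSketch implements SketchTokenStream {
    private final SketchTokenStream input;
    private final int maxTokens;
    private int count = 0;

    LimitTokenCountSketch(SketchTokenStream input, int maxTokens) {
        this.input = input;
        this.maxTokens = maxTokens;
    }

    public boolean incrementToken() {
        if (count >= maxTokens) {
            return false; // limit reached: stop without consuming more input
        }
        if (!input.incrementToken()) {
            return false; // wrapped stream is exhausted
        }
        count++;
        return true;
    }

    public String term() {
        return input.term();
    }
}

class LimitDemo {
    // Drains a stream into a list by looping over incrementToken().
    static List<String> collect(SketchTokenStream ts) {
        List<String> out = new ArrayList<>();
        while (ts.incrementToken()) {
            out.add(ts.term());
        }
        return out;
    }

    public static void main(String[] args) {
        SketchTokenStream ts = new LimitTokenCountSketch(
                new ListTokenStream(Arrays.asList("a", "b", "c", "d")), 2);
        System.out.println(collect(ts)); // prints [a, b]
    }
}
```

Because the limit lives in a wrapper around the stream, it can be applied to
one field's analysis chain and left off another, which a single writer-wide
setting cannot do.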

Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen

> -----Original Message-----
> From: Chris Hostetter
> Sent: Friday, October 29, 2010 7:45 PM
> To: Lucene Dev
> Subject: Re: inconsistency/performance trap of empty terms
> : why not just discard them completely in say, indexer/queryparser ?
> In QueryParser: maybe, that's a high level API with assumptions about
> interaction and text.
> In the IndexWriter: it seems like a bad idea.
> Low level Lucene really shouldn't be making any assumptions about *how*
> client code is using the library -- you and i may not have any good reasons
> for wanting an empty term, but we shouldn't put that as a hardcoded assumption
> in the low level code.
> It's essentially the converse issue of IndexWriter.maxFieldLength -- which
> was deliberately changed to default to Integer.MAX_VALUE precisely because of
> this "don't assume we know how people are using the library"
> issue -- but we could certainly make it configurable in the same way.
> (I see now that IndexWriter.maxFieldLength got deprecated in favor of
> IndexWriterConfig.maxFieldLength ... i thought i remembered that had been
> deprecated in favor of a TokenFilter that did the limiting, hence my
> suggestion that we use the same pattern for "min term length" -- it could
> easily be an IndexWriterConfig option as well, but using the TokenFilter
> approach seems more useful since it can be per field)
> -Hoss
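
For the "min term length" idea in the quoted message, the same wrapper pattern
could be sketched as below. As before this is plain Java with hypothetical
names (TermStream, FixedTermStream, MinLengthSketch, MinLengthDemo), not
Lucene's API; it only shows how a filter could skip empty or too-short terms
instead of the indexer discarding them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; NOT a Lucene class.
interface TermStream {
    boolean incrementToken(); // advance to the next token; false when exhausted
    String term();            // text of the current token
}

// Feeds terms from a fixed list (illustration only).
class FixedTermStream implements TermStream {
    private final Iterator<String> it;
    private String current;

    FixedTermStream(List<String> terms) {
        this.it = terms.iterator();
    }

    public boolean incrementToken() {
        if (!it.hasNext()) {
            return false;
        }
        current = it.next();
        return true;
    }

    public String term() {
        return current;
    }
}

// Skips every term shorter than minLength; minLength = 1 drops empty terms.
class MinLengthSketch implements TermStream {
    private final TermStream input;
    private final int minLength;

    MinLengthSketch(TermStream input, int minLength) {
        this.input = input;
        this.minLength = minLength;
    }

    public boolean incrementToken() {
        // Keep pulling from the wrapped stream until a long-enough term appears.
        while (input.incrementToken()) {
            if (input.term().length() >= minLength) {
                return true;
            }
        }
        return false; // wrapped stream is exhausted
    }

    public String term() {
        return input.term();
    }
}

class MinLengthDemo {
    // Drains a stream into a list by looping over incrementToken().
    static List<String> collect(TermStream ts) {
        List<String> out = new ArrayList<>();
        while (ts.incrementToken()) {
            out.add(ts.term());
        }
        return out;
    }

    public static void main(String[] args) {
        TermStream ts = new MinLengthSketch(
                new FixedTermStream(Arrays.asList("", "foo", "", "x")), 1);
        System.out.println(collect(ts)); // prints [foo, x]
    }
}
```

Client code that really wants empty terms simply leaves the wrapper out, so
the low-level indexing code makes no assumption either way.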