lucene-dev mailing list archives

From Robert Muir <>
Subject Re: inconsistency/performance trap of empty terms
Date Sat, 30 Oct 2010 13:14:41 GMT
On Sat, Oct 30, 2010 at 9:00 AM, Earwin Burrfoot <> wrote:
> Speaking of consistency, I think NOT_ANALYZED is superfluous. Drop
> this mode, and it can be safely reproduced by a NotAnalyzingAnalyzer
> (insert better name here).

+1. This is confusing and comes up often on the user list.

The way I think it happens is like this:
Joe Schmoe, like a good user, just fires up StandardAnalyzer at both
index and query time.
Joe realizes he has a field that really shouldn't be tokenized, and
sets it to NOT_ANALYZED.
Joe is confused that queries don't work the way they should when he does
this, since the query text is still analyzed by the queryparser with
StandardAnalyzer.

It would be far better to force him to use PerFieldAnalyzerWrapper +
NotAnalyzingAnalyzer or whatever, since then it would work.
Besides, if he sets this to NOT_ANALYZED, it actually goes through 'analysis'
anyway: SingleTokenAttributeSource buried in the indexer.
And, in trunk, this means things like UTF-8 encoding are assumed, but
really this should be completely outside of the indexer.
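
To make the mismatch concrete, here is a minimal sketch in plain Java. It does
not use Lucene at all: tokenizing() and notAnalyzing() are hypothetical toy
stand-ins for a tokenizing analyzer and for the proposed NotAnalyzingAnalyzer,
and analyzerFor() only imitates the idea behind PerFieldAnalyzerWrapper. The
field name "sku" and the value "AB-1234" are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

// Toy model only; none of these methods are Lucene APIs.
final class AnalysisMismatchDemo {

    // Toy tokenizing analyzer: lowercase, then split on non-alphanumerics.
    static List<String> tokenizing(String text) {
        List<String> out = new ArrayList<>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                out.add(t);
            }
        }
        return out;
    }

    // Toy "not analyzing" analyzer: the whole value becomes a single term.
    static List<String> notAnalyzing(String text) {
        return Collections.singletonList(text);
    }

    // Toy per-field wrapper: use the field's analyzer if one is registered,
    // otherwise fall back to the tokenizing default.
    static Function<String, List<String>> analyzerFor(
            String field, Map<String, Function<String, List<String>>> perField) {
        return perField.getOrDefault(field, AnalysisMismatchDemo::tokenizing);
    }

    // A query matches here only if every query term exists in the index.
    static boolean matches(Set<String> indexedTerms, List<String> queryTerms) {
        return !queryTerms.isEmpty() && indexedTerms.containsAll(queryTerms);
    }

    public static void main(String[] args) {
        String value = "AB-1234";

        // Index time: the field is flagged as not analyzed, so one term is stored.
        Set<String> indexed = new HashSet<>(notAnalyzing(value));

        // Query time, Joe's setup: one tokenizing analyzer for every field.
        System.out.println(matches(indexed, tokenizing(value))); // prints false

        // Query time, per-field setup: the same analyzer choice on both sides.
        Map<String, Function<String, List<String>>> perField = new HashMap<>();
        perField.put("sku", AnalysisMismatchDemo::notAnalyzing);
        System.out.println(
                matches(indexed, analyzerFor("sku", perField).apply(value))); // prints true
    }
}
```

The point of the sketch is only that the analyzer choice lives in one place
and is applied identically at index and query time, instead of being a
per-field flag that the queryparser never sees.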
