lucene-java-user mailing list archives

From Chris Hostetter <>
Subject Re: Empty fields ...
Date Wed, 19 Jul 2006 19:21:29 GMT
: why invert the bitset?

I think the original request was to find all docs where the field did
*not* have any value ... or in your vernacular: where Zip IS NULL

: a token containing the empty string matches documents that
: > contain that token
: >
: Isn't this exactly what he wants? Or am I mis-reading this? I'm reading it
: as "any document that contains a ZIP will match a token containing the empty
: string"..... Or am I getting tokens and terms all mixed up?

Terminology is generally confusing, but in addition to the already
discussed "do you want docs with values, or docs without values" issue,
you also seem to be confused about one other thing: if a doc has a
Term for field "Zip" with some value, then it will *not* match a Term with
the value "" for that field -- the empty string is a regular token like
any other string, and a Term with the empty string as its value is just
like any other Term.

If you want all docs that have any value for a field, you can use a
TermEnum to iterate over all Terms for that field, and use a TermDocs to
iterate over all docs that have each Term -- recording every doc with any
value for that field in a BitSet.  If you want all docs that *don't* have
a value, you then invert that BitSet.
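
A rough sketch of that collect-then-invert idea, in plain Java. It does
not use the Lucene API: the Map of term text to doc ids is a hypothetical
stand-in for what TermEnum and TermDocs would hand you for one field, and
the class and method names are made up for illustration.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for an index: for ONE field, term text -> doc ids.
// In real code the outer loop would be a TermEnum and the inner a TermDocs.
class EmptyFieldSketch {

    /** Union of the postings of every term in the field: docs WITH a value. */
    static BitSet docsWithValue(Map<String, int[]> termsForField, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int[] postings : termsForField.values()) { // every term in the field
            for (int doc : postings) {                  // every doc with that term
                bits.set(doc);
            }
        }
        return bits;
    }

    /** Invert the set above over [0, maxDoc): docs WITHOUT any value. */
    static BitSet docsWithoutValue(Map<String, int[]> termsForField, int maxDoc) {
        BitSet bits = docsWithValue(termsForField, maxDoc);
        bits.flip(0, maxDoc);
        return bits;
    }

    public static void main(String[] args) {
        Map<String, int[]> zip = new TreeMap<String, int[]>();
        zip.put("", new int[] {4});        // the empty string is a term like any other
        zip.put("10001", new int[] {2, 3});
        zip.put("90210", new int[] {0, 2});

        System.out.println(docsWithValue(zip, 6));
        System.out.println(docsWithoutValue(zip, 6));
    }
}
```

Note that doc 4, whose only term is the empty string, lands in the "has a
value" set; only docs with no term at all for the field survive the flip.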

RangeFilter makes that first part of the problem easy by letting you
specify "null" for both the lower and upper bounds -- you can use it as
is, or look at it for inspiration for your own code.

