lucene-java-user mailing list archives

From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: query syntax question
Date Thu, 10 May 2007 21:04:51 GMT
I was going to suggest something about TermEnum/TermDocs, but
upon reflection that doesn't work so well because you have to
enumerate all the terms over all the docs for a field. Ouch.

But one could combine the two approaches. Don't index
any "special" values in your firstname or lastname fields.
I suspect this will hurt you down the road...

Instead, ONLY for those documents that DO have either a first
or last name, index an orthogonal field, HASNAMES, with
a single value of "yes" or something. Now you can construct
your filter efficiently, because you only have to enumerate the
docs for a single term (HASNAMES:yes).
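
A minimal sketch of the HASNAMES idea in plain Java, to show why
the filter becomes cheap to build. This is NOT the Lucene API: the
index is modelled as a bare postings array, and the class and method
names (HasNamesFilterSketch, buildFilter, applyFilter) are made up
for illustration. In real code the postings would presumably come
from enumerating the docs for the single term HASNAMES:yes.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Toy model only: a stand-in for an index, not Lucene classes.
class HasNamesFilterSketch {

    // Build one bit per document from the postings of a single term.
    // hasNamesPostings is the (hypothetical) list of doc ids that were
    // indexed with HASNAMES:yes.
    static BitSet buildFilter(int maxDoc, int[] hasNamesPostings) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : hasNamesPostings) {
            bits.set(docId);
        }
        return bits;
    }

    // Keep only the hits whose bit is set in the filter.
    static List<Integer> applyFilter(int[] hits, BitSet filter) {
        List<Integer> kept = new ArrayList<>();
        for (int docId : hits) {
            if (filter.get(docId)) {
                kept.add(docId);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Docs 0, 2, 3 and 5 have a first or last name.
        BitSet filter = buildFilter(6, new int[] {0, 2, 3, 5});
        // The main query matched docs 1, 2, 4 and 5.
        System.out.println(applyFilter(new int[] {1, 2, 4, 5}, filter)); // prints [2, 5]
    }
}
```

The cost of building the filter depends on the number of docs that
carry the marker term, not on the number of distinct names in the
index, which is the point of the extra field.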

Depending upon how big your index is, you might be able to
get away with the TermEnum/TermDocs approach by constructing
your filter at start-up time and caching it away somewhere...

You *might* also be able to do something at startup time like
for (each document in the index) {
   get the firstname and lastname. If both null, set your filter bit
}
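
The loop above could look roughly like this in plain Java. Again a
sketch under assumptions, not Lucene code: each document is modelled
as a Map of field name to value, the field names "firstname" and
"lastname" are taken from this thread, and treating whitespace-only
values as blank is my assumption. StartupScanSketch and its methods
are hypothetical names.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: documents are plain maps, not Lucene Documents.
class StartupScanSketch {

    // Assumption: a missing, empty or whitespace-only value counts as blank.
    static boolean isBlank(String value) {
        return value == null || value.trim().isEmpty();
    }

    // Set a bit for every document whose first AND last name are both blank.
    static BitSet namelessDocs(List<Map<String, String>> docs) {
        BitSet nameless = new BitSet(docs.size());
        for (int docId = 0; docId < docs.size(); docId++) {
            Map<String, String> doc = docs.get(docId);
            if (isBlank(doc.get("firstname")) && isBlank(doc.get("lastname"))) {
                nameless.set(docId);
            }
        }
        return nameless;
    }

    // Invert the bits to get the documents that should be allowed through.
    static BitSet docsWithNames(List<Map<String, String>> docs) {
        BitSet keep = namelessDocs(docs);
        keep.flip(0, docs.size());
        return keep;
    }

    // Helper for building sample documents; null means "field not present".
    static Map<String, String> doc(String first, String last) {
        Map<String, String> d = new HashMap<>();
        if (first != null) d.put("firstname", first);
        if (last != null) d.put("lastname", last);
        return d;
    }

    public static void main(String[] args) {
        List<Map<String, String>> docs = new ArrayList<>();
        docs.add(doc("Les", "Fletcher")); // doc 0: both names
        docs.add(doc(null, null));        // doc 1: no names
        docs.add(doc(null, "Erickson"));  // doc 2: last name only
        docs.add(doc("", " "));           // doc 3: blank values
        System.out.println(namelessDocs(docs));  // prints {1, 3}
        System.out.println(docsWithNames(docs)); // prints {0, 2}
    }
}
```

Whether you keep the "nameless" bits or the inverted set depends on
how your filter is consumed; the scan itself is the same either way.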

If you use lazy field loading, you may be able to do this without
loading every field of every document....

I'm curious to know what you settle on and how it works....

Does this make sense in your application?

Erick

On 5/10/07, Les Fletcher <les@affinitycircles.com> wrote:
>
> I like the idea of the filter since I am making heavy use of filters for
> this particular query, but how would one go about constructing it
> efficiently at query time?  All I can see is hacking around not being
> able to use the * as the first character.
>
> Les
>
> Erick Erickson wrote:
>
> > You could create a Lucene Filter that had a bit for each document that
> > had a first or last name and use that at query time to restrict your
> > results appropriately. You could create this at startup time or at
> > query time. See CachingWrapperFilter for a way to cache it.
> >
> >
> > Another approach would be to add a dummy field to each document,
> > something like HASFIRSTORLASTNAME. At index time, when
> > you index a document, if it has a first or last name, put "yes" in the
> > field. Otherwise, put "no".
> >
> > Then, at search time, add a +HASFIRSTORLASTNAME:yes clause to the
> > query.
> >
> > You could add as many states to this field as you want.
> >
> >
> > Erick
> >
> >
> > On 5/10/07, Les Fletcher <les@affinitycircles.com> wrote:
> >
> >>
> >> I have a question about empty fields.  I want to run a query that will
> >> search against a few particular fields for the query term but then
> >> also check to see if two other fields have any value at all.  i.e., I
> >> want to search for a set of records but don't want to return a record if
> >> that record has blank first and last name fields.  Any help would be
> >> greatly appreciated.
> >>
> >> Les
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
>
