lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Erick Erickson" <erickerick...@gmail.com>
Subject Re: query syntax question
Date Fri, 11 May 2007 00:02:17 GMT
I've thought about a flag field, and I see no reason why that wouldn't
work quite well, it all depends, I suppose, upon how ugly it would
eventually get....

But about caching, what does making a filter have to do with Lucene
caching<G>? Sure, there exist Lucene filter caching classes, but
there's no reason you could not implement, say, a singleton class
whose first invocation filled the underlying filter out and just use that.
in other parts of your program.

Anyway, it sounds like you're well on the way to a solution.

Good luck!
Erick

On 5/10/07, Les Fletcher <les@affinitycircles.com> wrote:
>
> Unfortuantely at the moment we don't make good use of lucene caching, so
> the setting up of the filter on startup doesn't really work for us at
> the moment.  Maybe just a general flag field instead of a hasname field
> would work better and be more general.  You could just fill this field
> with any flags that need to be set for the particular document.  Each
> flag has a different unique tag that gets concatinated into the field
> then you can make use of your filters with:
>
> flagfield:(+hasfirstname +haslastname)
>
> and the like.
>
> How does this sound?
>
> Les
>
> Erick Erickson wrote:
>
> > I was going to suggest something about TermEnum/TermDocs, but
> > upon reflection that doesn't work so well because you have to
> > enumerate all the terms over all the docs for a field. Ouch.
> >
> > But one could combine the two approaches. Don't index
> > any "special" values in your firstname or lastname fields.
> > I suspect this will hurt you down the road...
> >
> > Instead, ONLY for those documents that DO have either a first
> > or last name, index an orthogonal field, HASNAMES with
> > a single value of "yes" or something. Now you can construct
> > your filter efficiently by enumerating the HASNAMES terms/docs
> > by only enumerating a single term/value.
> >
> > Depending upon how big your index is, you might be able to
> > get away with the termenum/terndocs approach by constructing
> > your filter at start-up time and caching it away somewhere...
> >
> > You *might* also be able to do something at startup time like
> > for (each document in the index) {
> >   get the firstname and lastname. If both null, set your filter bit
> > }
> >
> > If you use the lazy loading, you may be able to do this without
> > loading all of every document....
> >
> > I'm curious to know what you settle on and how it works....
> >
> > Does this make sense in your application?
> >
> > Erick
> >
> > On 5/10/07, Les Fletcher <les@affinitycircles.com> wrote:
> >
> >>
> >> I like the idea of the filter since I am making heavy use of filters
> for
> >> this particular query, but how would one go about constructing it
> >> efficiently at query time?  All I can see is hacking around not being
> >> able to use the * as the first character.
> >>
> >> Les
> >>
> >> Erick Erickson wrote:
> >>
> >> > You could create a Lucene Filter that had a bit for each document
> that
> >> > had a first or last name and use that at query time to restrict your
> >> > results appropriately. You could create this at startup time or at
> >> > query time. See CachingWrapperFilter for a way to cache it.
> >> >
> >> >
> >> > Another approach would be to add a dummy field to each document,
> >> > something like HASFIRSTORLASTNAME. At index time, when
> >> > you index a document, if it has a first or last name, put "yes" in
> the
> >> > field. Otherwise, put "no".
> >> >
> >> > Then, at search time, add an +HASFIRSTORLASTNAME:yes to the
> >> > query......
> >> >
> >> > You could add as many states to this field as you want.
> >> >
> >> >
> >> > Erick
> >> >
> >> >
> >> > On 5/10/07, Les Fletcher <les@affinitycircles.com> wrote:
> >> >
> >> >>
> >> >> I have a question about empty fields.  I want to run a query that
> >> will
> >> >> search against a few particular fields for the query term but then
> >> also
> >> >> also check to see if a two other fields have any value at all.
> >> i.e., I
> >> >> want to search for a set records but don't want to return a record
> if
> >> >> that record has blank first and last name fields.  Any help would be
> >> >> greatly appreciated.
> >> >>
> >> >> Les
> >> >>
> >> >>
> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >> >>
> >> >>
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
> >>
> >>
> >
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message