lucene-java-user mailing list archives

From Shai Erera <ser...@gmail.com>
Subject Re: Filtering question
Date Wed, 11 Mar 2015 13:05:55 GMT
I don't see that you use acceptDocs in your MyNDVFilter. I think
acceptDocs.get() would return false for all UserB docs, but you should
confirm that.

Anyway, because you use an NDV field, unrelated documents are not skipped
for you automatically. You have to check acceptDocs yourself, so your code
would look something like:

for (int i = 0; i < reader.maxDoc(); i++) {
  // acceptDocs may be null, which means every document is accepted
  if (acceptDocs != null && !acceptDocs.get(i)) {
    continue;
  }
  // document is accepted, read values
  ...
}
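
As a minimal, self-contained sketch of the same pattern, something like the
following shows that rejected documents are never read. It is plain Java
only: java.util.BitSet stands in for Lucene's Bits and FixedBitSet, a String
array stands in for the per-document NDV lookup, and the class and method
names are hypothetical, not part of Lucene.

```java
import java.util.BitSet;

public class AcceptDocsSketch {

    /**
     * Returns the IDs of accepted documents whose value equals matchTag.
     * values[i] stands in for the doc-values lookup of document i.
     * acceptDocs may be null, which is treated as "accept every document".
     */
    public static BitSet matchAccepted(String[] values, BitSet acceptDocs, String matchTag) {
        int maxDoc = values.length;
        BitSet result = new BitSet(maxDoc);
        for (int i = 0; i < maxDoc; i++) {
            if (acceptDocs != null && !acceptDocs.get(i)) {
                continue; // rejected by the outer filter: value is never read
            }
            String value = values[i];
            if (value != null && value.equals(matchTag)) {
                result.set(i);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] values = {"red", "blue", "red", "red"};
        BitSet acceptDocs = new BitSet(values.length);
        acceptDocs.set(0);
        acceptDocs.set(1);
        acceptDocs.set(3); // document 2 is rejected
        System.out.println(matchAccepted(values, acceptDocs, "red")); // prints {0, 3}
    }
}
```

Document 2 holds a matching value but is left out because the outer filter
rejected it, which is the behaviour the TermFilter is expected to impose.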

Shai

On Wed, Mar 11, 2015 at 1:25 PM, Ian Lea <ian.lea@gmail.com> wrote:

> Can you use a BooleanFilter (or ChainedFilter in 4.x) alongside your
> BooleanQuery? That seems more logical, and I suspect it would solve the
> problem. Caching filters can be good too, depending on how often your data
> changes. See CachingWrapperFilter.
>
> --
> Ian.
>
>
> On Tue, Mar 10, 2015 at 12:45 PM, Chris Bamford <cbamford@mimecast.com>
> wrote:
>
> >
> >  Hi,
> >
> >  I have an index of 30 docs, 20 of which have an owner field of "UserA"
> > and 10 of "UserB".
> > I also have a query which consists of:
> >
> >  BooleanQuery:
> > -- Clause 1: TermQuery
> > -- Clause 2: FilteredQuery
> > ----- Branch 1: MatchAllDocsQuery()
> > ----- Branch 2: MyNDVFilter
> >
> >  I execute my search as follows:
> >
> >  searcher.search(booleanQuery,
> >                  new TermFilter(new Term("owner", "UserA")),
> >                  50);
> >
> >  The TermFilter's job is to reduce the number of searchable documents
> > from 30 to 20, which it does for all clauses of the BooleanQuery except
> > for MyNDVFilter, which iterates through the full 30 docs, 10 needlessly.
> > How can I restrict it so it behaves the same as the other query branches?
> >
> >  MyNDVFilter source code:
> >
> > public class MyNDVFilter extends Filter {
> >
> >     private String fieldName;
> >     private String matchTag;
> >
> >     public MyNDVFilter(String ndvFieldName, String matchTag) {
> >         this.fieldName = ndvFieldName;
> >         this.matchTag = matchTag;
> >     }
> >
> >     @Override
> >     public DocIdSet getDocIdSet(AtomicReaderContext context, Bits acceptDocs) throws IOException {
> >
> >         AtomicReader reader = context.reader();
> >         int maxDoc = reader.maxDoc();
> >         final FixedBitSet bitSet = new FixedBitSet(maxDoc);
> >         BinaryDocValues ndv = reader.getBinaryDocValues(fieldName);
> >
> >         if (ndv != null) {
> >             for (int i = 0; i < maxDoc; i++) {
> >                 BytesRef br = ndv.get(i);
> >                 if (br.length > 0) {
> >                     String strval = br.utf8ToString();
> >                     if (strval.equals(matchTag)) {
> >                         bitSet.set(i);
> >                         System.out.println("MyNDVFilter >> " + matchTag + " matched " + i + " [" + strval + "]");
> >                     }
> >                 }
> >             }
> >         }
> >
> >         return new DVDocSetId(bitSet);    // just wraps a FixedBitSet
> >     }
> > }
> >
