Mailing-List: contact java-user-help@lucene.apache.org; run by ezmlm
Precedence: bulk
Reply-To: java-user@lucene.apache.org
Received-SPF: pass (athena.apache.org: domain of uwe@thetaphi.de designates
 188.138.97.18 as permitted sender)
From: "Uwe Schindler" <uwe@thetaphi.de>
To: <java-user@lucene.apache.org>
References: <1316538543.28138.YahooMailNeo@web160705.mail.bf1.yahoo.com>
 <01ab01cc77bb$4e44e770$eaceb650$@thetaphi.de>
In-Reply-To: <01ab01cc77bb$4e44e770$eaceb650$@thetaphi.de>
Subject: RE: QueryWrapperFilter and DocIdSetIterator
Date: Tue, 20 Sep 2011 20:01:53 +0200
Message-ID: <01ac01cc77bf$65295c10$2f7c1430$@thetaphi.de>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
Thread-Index: AQORosgo+MAD8UlU7ycIbYHo2w06wQH2ShB7kbyEjkA=
Content-Language: de

I investigated your problem:

It's a 3.x bug in an optimization in TermQuery. All other queries work.

TermQuery assumes that the IndexReader passed into its scorer() method is
atomic (that is, a segment reader). This is not the case for your example
code. It uses a hash-based cache to cache document frequencies, but this
cache is only.
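
To make this failure mode concrete, here is a minimal sketch in plain Java
with no Lucene classes. Everything in it is hypothetical (ToyReader,
ToyWeight, docFreqBySegment); it only models the assumption described above,
a cache filled per segment reader that is then asked about a top-level
reader. It is not the actual TermQuery code, and the guarded variant is just
one possible way to avoid the wrong early exit, not necessarily the real fix.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

public class AtomicReaderAssumptionSketch {

    /** Hypothetical stand-in for a reader: atomic when it has no sub-readers. */
    static final class ToyReader {
        final List<ToyReader> subReaders = new ArrayList<>();
        final int docFreq; // docs containing the term (meaningful for segments only)

        ToyReader(int docFreq) { this.docFreq = docFreq; }

        static ToyReader composite(ToyReader... segments) {
            ToyReader top = new ToyReader(0);
            for (ToyReader s : segments) top.subReaders.add(s);
            return top;
        }

        boolean isAtomic() { return subReaders.isEmpty(); }
    }

    /** Hypothetical weight that caches document frequencies per segment. */
    static final class ToyWeight {
        private final Map<ToyReader, Integer> docFreqBySegment = new IdentityHashMap<>();

        ToyWeight(ToyReader top) {
            for (ToyReader segment : top.subReaders) {
                if (segment.docFreq > 0) docFreqBySegment.put(segment, segment.docFreq);
            }
        }

        /** Assumes an atomic reader: anything missing from the cache means "no matches". */
        String scorerAssumingAtomic(ToyReader reader) {
            return docFreqBySegment.containsKey(reader) ? "scorer" : null;
        }

        /** Only takes the early exit when the reader really is atomic. */
        String scorerGuarded(ToyReader reader) {
            if (reader.isAtomic() && !docFreqBySegment.containsKey(reader)) return null;
            return "scorer";
        }
    }

    public static void main(String[] args) {
        ToyReader segment = new ToyReader(1);
        ToyReader top = ToyReader.composite(segment);
        ToyWeight weight = new ToyWeight(top);

        System.out.println(weight.scorerAssumingAtomic(segment)); // prints "scorer"
        System.out.println(weight.scorerAssumingAtomic(top));     // prints "null"
        System.out.println(weight.scorerGuarded(top));            // prints "scorer"
    }
}
```

In this toy model the top-level reader is never a key in the cache, so the
unguarded lookup reports "no scorer" even though one segment matches.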

Searching at the top level has not been done in Lucene since 2.9, but the
3.x API still supports it (trunk, aka 4.0, no longer does).
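
As an illustration of what per-segment execution means for document IDs,
here is a small sketch in plain Java. It uses no Lucene classes; the method
name toTopLevelDocIds and the arrays are hypothetical. The assumption is that
each segment numbers its documents from 0, and that a segment's top-level
offset (often called the doc base) is the sum of the sizes of the segments
before it. In Lucene 3.x the segment readers themselves would typically come
from something like IndexReader.getSequentialSubReaders().

```java
import java.util.ArrayList;
import java.util.List;

public class PerSegmentDocIdSketch {

    /**
     * Converts segment-local doc IDs into top-level doc IDs.
     * segmentMaxDocs[i] is the number of docs in segment i;
     * localHits[i] holds the matching segment-local IDs of segment i.
     */
    static List<Integer> toTopLevelDocIds(int[] segmentMaxDocs, int[][] localHits) {
        List<Integer> topLevel = new ArrayList<>();
        int docBase = 0; // top-level ID of the current segment's first doc
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            for (int localId : localHits[i]) {
                topLevel.add(docBase + localId);
            }
            docBase += segmentMaxDocs[i]; // the next segment starts after this one
        }
        return topLevel;
    }

    public static void main(String[] args) {
        int[] segmentMaxDocs = {3, 2, 4};
        int[][] localHits = {{0, 2}, {}, {1}};
        System.out.println(toTopLevelDocIds(segmentMaxDocs, localHits)); // prints [0, 2, 6]
    }
}
```

A caller that runs a filter once per segment needs this kind of offset to
load the matching document through the top-level reader.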

Can you open an issue for the 3.4 version? I already have a fix for
TermQuery.java, which contains a wrong assumption.

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Tuesday, September 20, 2011 7:33 PM
> To: java-user@lucene.apache.org; aberdeen61@yahoo.com
> Subject: RE: QueryWrapperFilter and DocIdSetIterator
>
> Hi,
>
> I don't see a problem in your code:
> If you look at the source code of QueryWrapperFilter, it will never return
> null, so it always returns a DocIdSet that itself returns the Scorer of the
> query as its iterator.
>
>   @Override
>   public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {
>     final Weight weight =
>         new IndexSearcher(reader).createNormalizedWeight(query);
>     return new DocIdSet() {
>       @Override
>       public DocIdSetIterator iterator() throws IOException {
>         return weight.scorer(reader, true, false);
>       }
>       @Override
>       public boolean isCacheable() { return false; }
>     };
>   }
>
> The only case in which the DISI returned by iterator() is null is when
> the underlying query returns a null scorer (which can happen if no
> documents match the query).
>
> One thing is different in your type of execution: Since Lucene 2.9,
> IndexSearcher executes the query per-segment, but you are executing the
> filter on the top-level IndexReader (not separately for each segment).
> This should not be an issue in Lucene 3.x, but with Lucene trunk this will
> throw UnsupportedOperationException.
>
> I think your query really seems to return no documents; I have no idea
> why.
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: aberdeen61@yahoo.com [mailto:aberdeen61@yahoo.com]
> > Sent: Tuesday, September 20, 2011 7:09 PM
> > To: java-user@lucene.apache.org
> > Subject: QueryWrapperFilter and DocIdSetIterator
> >
> > I've been trying to use the QueryWrapperFilter as part of composing a set
> > of filters. Are there limitations on the types of queries it can wrap? When
> > I try to get the DocIdSetIterator for the filter it comes up null. This
> > happens even when the query is a simple TermQuery.
> >
> > The following code shows that the iterator for a QueryWrapperFilter
> > returns null rather than an iterator with the same document as a search
> > using the query.
> > This was run using lucene-core-3.4.0.jar on Java 1.6.0_27.
> > Am I using this incorrectly? Are there constraints or additional
> > information on how a reader is supposed to be passed to the method to get
> > a DocIdSet?
> >
> > On a related note, I examined the TestQueryWrapperFilter source code in
> > Lucene 3.4.0, which indicates that the QueryWrapperFilter can be used with
> > primitive, complex primitive and non-primitive Queries. I did note that
> > the test for the complex primitive query generates a BooleanQuery, but
> > doesn't use it in the test. However, even when I corrected that, it passed
> > the test, so I'm unclear on the difference between the usage in the
> > published test case and my example below.
> >
> > Thanks,
> > Dan
> >
> > ==============================
> > import java.io.IOException;
> > import org.apache.lucene.analysis.WhitespaceAnalyzer;
> > import org.apache.lucene.document.Document;
> > import org.apache.lucene.document.Field;
> > import org.apache.lucene.document.Field.Index;
> > import org.apache.lucene.document.Field.Store;
> > import org.apache.lucene.index.IndexReader;
> > import org.apache.lucene.index.IndexWriter;
> > import org.apache.lucene.index.IndexWriterConfig;
> > import org.apache.lucene.index.Term;
> > import org.apache.lucene.store.RAMDirectory;
> > import org.apache.lucene.util.Version;
> > import org.apache.lucene.search.DocIdSet;
> > import org.apache.lucene.search.DocIdSetIterator;
> > import org.apache.lucene.search.Filter;
> > import org.apache.lucene.search.IndexSearcher;
> > import org.apache.lucene.search.QueryWrapperFilter;
> > import org.apache.lucene.search.TermQuery;
> > import org.apache.lucene.search.TopDocs;
> >
> > public class TestQueryWrapperFilterIterator {
> >   public static void main(String[] args) {
> >     try {
> >       IndexWriterConfig iwconfig = new IndexWriterConfig(Version.LUCENE_34,
> >           new WhitespaceAnalyzer(Version.LUCENE_34));
> >       iwconfig.setOpenMode(IndexWriterConfig.OpenMode.CREATE);
> >       RAMDirectory dir = new RAMDirectory();
> >       IndexWriter writer = new IndexWriter(dir, iwconfig);
> >       Document d = new Document();
> >       d.add(new Field("id", "1001", Store.YES, Index.NOT_ANALYZED));
> >       d.add(new Field("text", "headline one group one", Store.YES,
> >           Index.ANALYZED));
> >       d.add(new Field("group", "grp1", Store.YES, Index.NOT_ANALYZED));
> >       writer.addDocument(d);
> >       writer.commit();
> >       writer.close();
> >       IndexReader rdr = IndexReader.open(dir);
> >       IndexSearcher searcher = new IndexSearcher(rdr);
> >       TermQuery tq = new TermQuery(new Term("text", "headline"));
> >       TopDocs results = searcher.search(tq, 5);
> >       System.out.println("Number of search results: " + results.totalHits);
> >       Filter f = new QueryWrapperFilter(tq);
> >       DocIdSet dis = f.getDocIdSet(rdr);
> >       DocIdSetIterator it = dis.iterator();
> >       if (it != null) {
> >         int docId = it.nextDoc();
> >         while (docId != DocIdSetIterator.NO_MORE_DOCS) {
> >           Document doc = rdr.document(docId);
> >           System.out.println("Iterator doc: " + doc.get("id"));
> >           docId = it.nextDoc();
> >         }
> >       } else {
> >         System.out.println("Iterator was null: ");
> >       }
> >       searcher.close();
> >       rdr.close();
> >     } catch (IOException ioe) {
> >       ioe.printStackTrace();
> >     }
> >   }
> > }
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org