Subject: Re: searchWithFilter bug?
From: Peter Keegan
To: java-user@lucene.apache.org, simon.willnauer@gmail.com
Date: Fri, 4 Dec 2009 11:27:26 -0500

The filter is just a java.util.BitSet. I use the top level reader to create
the filter, and call IndexSearcher.search(Query, Filter, HitCollector). So,
there is no 'docBase' at this level of the API.

Peter

On Fri, Dec 4, 2009 at 11:01 AM, Simon Willnauer <simon.willnauer@googlemail.com> wrote:

> Peter, which filter do you use, do you respect the IndexReader's
> maxDoc() and the docBase?
>
> simon
>
> On Fri, Dec 4, 2009 at 4:47 PM, Peter Keegan wrote:
> > I think the Filter's docIdSetIterator is using the top level reader for each
> > segment, because the cardinality of the DocIdSet from which it's created is
> > the same for all readers (and what I expect to see at the top level).
> >
> > Peter
> >
> > On Fri, Dec 4, 2009 at 10:38 AM, Michael McCandless <lucene@mikemccandless.com> wrote:
> >
> >> That doesn't sound good.
> >>
> >> Though, in searchWithFilter, we seem to ask for the Query's scorer,
> >> and the Filter's docIdSetIterator, using the same reader (which may be
> >> toplevel, for the legacy case, or per-segment, for the normal case).
> >> So I'm not [yet] seeing where the issue is...
> >>
> >> Can you boil it down to a smallish test case?
> >>
> >> Mike
> >>
> >> On Fri, Dec 4, 2009 at 10:32 AM, Peter Keegan wrote:
> >> > I'm having a problem with 'searchWithFilter' on Lucene 2.9.1. The Filter
> >> > wraps a simple BitSet. When doing a 'MatchAllDocs' query with this filter, I
> >> > get only a subset of the expected results, even accounting for deletes. The
> >> > index has 10 segments. In IndexSearcher->searchWithFilter, it looks like the
> >> > scorer is advancing to the filter's docId, which is the index-wide value,
> >> > but the scorer is using the segment-relative value. If I optimize the index,
> >> > I get the expected results.
> >> > Does this look like a bug?
> >> >
> >> > Peter
> >> >
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> >> For additional commands, e-mail: java-user-help@lucene.apache.org
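
The mismatch discussed in this thread can be illustrated without Lucene. The
following is a minimal sketch, not Lucene code: it assumes a filter built as
one index-wide java.util.BitSet and shows how the bits for a single segment
could be re-based to segment-relative numbering, given that segment's docBase
and maxDoc. The class name SegmentBitSets, the method slice, and the two
five-document segments are hypothetical and chosen only for illustration.

```java
import java.util.BitSet;

public class SegmentBitSets {

    /**
     * Copies the bits of an index-wide BitSet that fall inside one segment
     * into a new BitSet numbered from 0 (segment-relative).
     * docBase is the segment's first index-wide doc id; maxDoc is its size.
     */
    public static BitSet slice(BitSet topLevel, int docBase, int maxDoc) {
        // BitSet.get(from, to) returns bits [from, to) re-based at index 0.
        return topLevel.get(docBase, docBase + maxDoc);
    }

    public static void main(String[] args) {
        // Index-wide doc ids accepted by the (hypothetical) filter.
        BitSet topLevel = new BitSet();
        topLevel.set(2); // lies in the first segment
        topLevel.set(7); // lies in the second segment
        topLevel.set(9); // lies in the second segment

        int[] maxDocs = {5, 5}; // assumed segment sizes
        int docBase = 0;
        for (int maxDoc : maxDocs) {
            BitSet perSegment = slice(topLevel, docBase, maxDoc);
            System.out.println("docBase=" + docBase + " -> " + perSegment);
            docBase += maxDoc;
        }
    }
}
```

With these sample values, index-wide doc 7 becomes doc 2 of the second
segment. Handing the unsliced index-wide set to every segment would instead
test bit 7 against a segment that only holds docs 0 to 4, which is the kind
of mismatch the original report describes.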