Date: Thu, 17 Sep 2009 16:25:50 +0200
Message-ID: <53480e280909170725y2d302842u4e65b5d20ef92bef@mail.gmail.com>
Subject: Re: Counting search results
From: Mathias Bank
To: java-user@lucene.apache.org

Hello,

I have tried your method, but it doesn't work: set is null after calling

BitSet set = filter.bits(reader);

I haven't found the reason for this. Additionally, the bits method is
deprecated, and using "getDocIdSet" is recommended instead. But that set
only provides an iterator, so no membership checks by document id are
possible.

Are there any other possibilities to improve speed?

Mathias

On 15.09.2009 17:13, Simon Willnauer wrote:
> > Hmm, so if you wanna use the Filter to narrow down the search results
> > you could use it in the while loop like this:
> >
> > BitSet set = filter.bits(reader);
> > int numDocs = 0;
> > TermDocs termDocs = reader.termDocs(new Term("myField", "myTerm"));
> > while (termDocs.next()) {
> >   if (set.get(termDocs.doc()))
> >     numDocs++;
> > }
> >
> > would that help?
> >
> > simon
> >
> > On Tue, Sep 15, 2009 at 5:01 PM, Mathias Bank <mathias.bank@gmail.com> wrote:
> > > Hello,
> > >
> > > This seems to be similar to:
> > >
> > > Term t = new Term(fieldname, term);
> > > int count = searcher.docFreq(t);
> > >
> > > The problem is that in this situation it is not possible to apply a
> > > filter object.
> > > If I don't wanna use this filter object, I would have
> > > to use a complex search query, which is - again - very slow. So,
> > > unfortunately, your solution does not help.
> > >
> > > Mathias
> > >
> > > 2009/9/15 Simon Willnauer <simon.willnauer@googlemail.com>:
> > >> Did you try:
> > >>
> > >> int numDocs = 0;
> > >> TermDocs termDocs = reader.termDocs(new Term("myField", "myTerm"));
> > >> while (termDocs.next()) { numDocs++; }
> > >>
> > >> simon
> > >>
> > >> On Tue, Sep 15, 2009 at 2:19 PM, Mathias Bank <mathias.bank@gmail.com> wrote:
> > >>> Hello,
> > >>>
> > >>> I'm trying to find the number of documents for a specific term to
> > >>> create text statistics. I'm not interested in ordering the results or
> > >>> even receiving the first result. I just need the number of results.
> > >>>
> > >>> Currently, I'm trying to do this by using the lucene searcher class:
> > >>>
> > >>> IndexSearcher searcher = new IndexSearcher(reader);
> > >>> String queryString = fieldname + ":" + term;
> > >>> QueryParser parser = new QueryParser(fieldname, new GermanAnalyzer());
> > >>> TopDocs d = searcher.search(parser.parse(queryString), filter, 1);
> > >>> int count = d.totalHits;
> > >>>
> > >>> The problem is that there is a large index (optimized) with more than
> > >>> 8 million entries. One search could return a large number of search
> > >>> results (more than 1 million). Currently these search tasks take more
> > >>> than 15 seconds.
> > >>>
> > >>> The question is: is there any way to get the number of search results
> > >>> faster? I think it could be optimized by not using a Weight
> > >>> object (order is not interesting), but I haven't seen a way to do
> > >>> this.
> > >>>
> > >>> I hope someone has already solved this problem.
> > >>>
> > >>> Mathias
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > >>> For additional commands, e-mail: java-user-help@lucene.apache.org
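
On the point raised at the top of this message, that the set returned by "getDocIdSet" only provides an iterator: counting does not strictly need random access. The following is a minimal sketch in plain Java, not Lucene code. The int arrays stand in for the document ids that a TermDocs loop or a filter iterator would yield, and the sketch assumes both yield ids in ascending order. FilteredCount and its method names are hypothetical and exist only for this illustration. The first method mirrors the BitSet loop quoted above; the second walks two sorted id lists in lockstep and counts the ids they share.

```java
import java.util.BitSet;

/** Hypothetical helper, not part of Lucene: counts term matches that pass a filter. */
final class FilteredCount {

    private FilteredCount() {
    }

    /** Random-access variant: the filter is a bit set indexed by document id. */
    static int countWithBitSet(int[] termDocIds, BitSet filterBits) {
        int count = 0;
        for (int docId : termDocIds) {
            if (filterBits.get(docId)) {
                count++;
            }
        }
        return count;
    }

    /**
     * Iterator-only variant: both arrays must be sorted ascending (an assumption
     * of this sketch). Only forward steps are used, so no lookup by id is needed.
     */
    static int countWithSortedIds(int[] termDocIds, int[] filterDocIds) {
        int count = 0;
        int i = 0;
        int j = 0;
        while (i < termDocIds.length && j < filterDocIds.length) {
            if (termDocIds[i] == filterDocIds[j]) {
                // Same document on both sides: it matches the term and the filter.
                count++;
                i++;
                j++;
            } else if (termDocIds[i] < filterDocIds[j]) {
                i++; // term side is behind, advance it
            } else {
                j++; // filter side is behind, advance it
            }
        }
        return count;
    }
}
```

Whether this is faster than the search shown in the thread would have to be measured on the real index; the sketch only shows that a forward-only iterator is enough to produce the count.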