From: Yonik Seeley
Reply-To: yonik@lucidimagination.com
To: java-user@lucene.apache.org
Date: Thu, 27 May 2010 15:43:43 -0400
Subject: Re: How to get the number of unique terms in the inverted index

On Thu, May 27, 2010 at 2:32 PM, kannan chandrasekaran wrote:
> I was wondering if there is a way to retrieve the number of unique terms
> in the Lucene index (version 2.4.0) ... I am aware of the terms() and
> terms(Term) methods that return an enumeration (TermEnum), but that
> involves iterating through the terms and counting them. I am looking for
> something similar to numDocs() in the IndexReader class.

No, there is not.
In 4.0-dev, with the new "flex" APIs, you can retrieve the number of
unique terms in a single segment (Terms.getUniqueTermCount()), but not
for a whole index.

-Yonik
http://www.lucidimagination.com
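
A possible illustration of why a per-segment count does not extend to a whole index: the same term can appear in several segments, so adding up the per-segment counts only gives an upper bound, and an exact figure requires walking the terms, as the TermEnum approach in the question does. The sketch below uses plain Java collections as a stand-in for segment term dictionaries. It does not call Lucene, and the names UniqueTermCount, Cursor, sumOfSegmentCounts and countByMerging are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical model: each "segment" is just a sorted set of its unique terms.
public class UniqueTermCount {

    /** Pairs a segment's current term with the iterator over its remaining terms. */
    static final class Cursor implements Comparable<Cursor> {
        final String term;
        final Iterator<String> rest;

        Cursor(String term, Iterator<String> rest) {
            this.term = term;
            this.rest = rest;
        }

        @Override
        public int compareTo(Cursor other) {
            return term.compareTo(other.term);
        }
    }

    /** Adds up per-segment counts; a term shared by k segments is counted k times. */
    public static long sumOfSegmentCounts(List<SortedSet<String>> segments) {
        long total = 0;
        for (SortedSet<String> segment : segments) {
            total += segment.size();
        }
        return total;
    }

    /** Exact count: merge all segments in sorted order and count each distinct term once. */
    public static long countByMerging(List<SortedSet<String>> segments) {
        PriorityQueue<Cursor> queue = new PriorityQueue<>();
        for (SortedSet<String> segment : segments) {
            Iterator<String> it = segment.iterator();
            if (it.hasNext()) {
                queue.add(new Cursor(it.next(), it));
            }
        }
        long count = 0;
        String previous = null;
        while (!queue.isEmpty()) {
            Cursor cursor = queue.poll();
            // Equal terms come out of the queue consecutively, so one comparison suffices.
            if (!cursor.term.equals(previous)) {
                count++;
                previous = cursor.term;
            }
            if (cursor.rest.hasNext()) {
                queue.add(new Cursor(cursor.rest.next(), cursor.rest));
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<SortedSet<String>> segments = new ArrayList<>();
        segments.add(new TreeSet<>(Arrays.asList("apache", "index", "lucene")));
        segments.add(new TreeSet<>(Arrays.asList("index", "lucene", "term")));

        System.out.println("sum of per-segment counts: " + sumOfSegmentCounts(segments)); // 6
        System.out.println("exact unique terms:        " + countByMerging(segments));     // 4
    }
}
```

The two numbers agree only when no term is shared between segments, for example when the index consists of a single segment.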