lucene-java-user mailing list archives

From "Uwe Schindler" <...@thetaphi.de>
Subject RE: How to get the number of unique terms in the inverted index
Date Thu, 27 May 2010 21:31:02 GMT
It cannot be done efficiently, because the same terms can appear in more than one segment (as noted before), so the per-segment counts cannot simply be added up.
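
To make the overlap problem concrete, here is a rough sketch. It is not Lucene code: each segment is modelled as a plain sorted set of strings, and the class and method names (SegmentTermCount, sumOfSegmentCounts, uniqueAcrossSegments) are made up for illustration only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model (an assumption, not the Lucene API): a segment is just
// the sorted set of terms it contains.
public class SegmentTermCount {

    // Cheap: add up the per-segment unique term counts.
    // A term that occurs in several segments is counted once per segment.
    static long sumOfSegmentCounts(List<SortedSet<String>> segments) {
        long total = 0;
        for (SortedSet<String> segment : segments) {
            total += segment.size();
        }
        return total;
    }

    // Correct but expensive: every term of every segment has to be
    // visited so that duplicates across segments can be collapsed.
    static long uniqueAcrossSegments(List<SortedSet<String>> segments) {
        SortedSet<String> union = new TreeSet<>();
        for (SortedSet<String> segment : segments) {
            union.addAll(segment);
        }
        return union.size();
    }

    public static void main(String[] args) {
        List<SortedSet<String>> segments = Arrays.asList(
            new TreeSet<>(Arrays.asList("apache", "index", "lucene")),
            new TreeSet<>(Arrays.asList("index", "lucene", "segment")));

        System.out.println(sumOfSegmentCounts(segments));   // prints 6
        System.out.println(uniqueAcrossSegments(segments)); // prints 4
    }
}
```

The two segments share "index" and "lucene", so the sum of the per-segment counts (6) overstates the number of distinct terms (4). Getting the exact figure means walking all the terms, which is the same work as enumerating and counting them yourself.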

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: kannan chandrasekaran [mailto:ckannanck@yahoo.com]
> Sent: Thursday, May 27, 2010 11:01 PM
> To: java-user@lucene.apache.org
> Subject: Re: How to get the number of unique terms in the inverted index
> 
> Hi Yonik,
> 
> Thanks for the quick response. I am curious as to why this is not supported
> whereas numDocs() is supported. Even in the upcoming version it's only
> supported per segment and not across the index. Why? Is it difficult to
> implement efficiently?
> 
> Pardon my ignorance if I am missing something that's very obvious...
> 
> Thanks
> Kannan
> 
> On Thu, May 27, 2010 at 2:32 PM, kannan chandrasekaran
> <ckannanck@yahoo.com> wrote:
> > I was wondering if there is a way to retrieve the number of unique
> > terms in Lucene (version 2.4.0)... I am aware of the terms() and
> > terms(Term) methods that return an enumeration (TermEnum), but that
> > involves iterating through the terms and counting them.
> > I am looking for something similar to numDocs() in the IndexReader class.
> 
> No there is not.
> In 4.0-dev, with the new "flex" APIs, you can retrieve the number of unique
> terms in a single segment (Terms.getUniqueTermCount()), but not a whole
> index.
> 
> -Yonik
> http://www.lucidimagination.com

