lucene-java-user mailing list archives

From Grant Ingersoll <gsing...@apache.org>
Subject Re: How can I get the Document Frequency for a specific term??? And more questions...
Date Fri, 03 Aug 2007 23:41:47 GMT

On Aug 3, 2007, at 9:47 AM, tierecke wrote:

>
> Hi,
>
> Can I know in how many documents a term appears (DF - Document Frequency)?
> Does Lucene keep it? Can I retrieve it?
>

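See the TermEnum class (IndexReader.terms()); TermEnum.docFreq() gives the
DF of the current term. For a single known term, IndexReader.docFreq(Term)
returns it directly.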
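
To make the quantity concrete: DF counts documents, not occurrences. Below
is a minimal plain-Java sketch, not Lucene code. The class and method names
are made up for illustration, and each document is modeled as a set of
lowercased tokens.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DocFreqSketch {

    /** Hypothetical helper: split text on whitespace into a set of lowercased tokens. */
    public static Set<String> tokens(String text) {
        return new HashSet<>(Arrays.asList(text.toLowerCase().split("\\s+")));
    }

    /** Document frequency: the number of documents that contain the term at least once. */
    public static int docFreq(List<Set<String>> docs, String term) {
        int df = 0;
        for (Set<String> doc : docs) {
            if (doc.contains(term)) {
                df++; // counted once per document, however often the term repeats
            }
        }
        return df;
    }

    public static void main(String[] args) {
        List<Set<String>> docs = Arrays.asList(
                tokens("lucene is a search library"),
                tokens("lucene lucene lucene"),
                tokens("something else entirely"));
        System.out.println(docFreq(docs, "lucene")); // prints 2
    }
}
```

The second document mentions the term three times but still contributes 1,
which is the difference between document frequency and term frequency.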

> Now - an even more advanced question:
> Since I have a 77GB index, I cut it into 25 smaller indices of 3GB each and
> I query them using MultiSearcher. Is there a possibility to know the DF of a
> term throughout the whole collection or do I need to ask each index for the
> DF of a specific term (supposing that my first question is solvable)?
>

See MultiReader and MultiReader.terms(); a MultiReader opened over the
sub-indexes reports DF for the collection as a whole.

> And the last question: Is there a way to know the total number of documents
> in a Lucene Index? Is there a way to know the total number of documents in
> multiple indexes together?

IndexReader.numDocs()
MultiReader.numDocs()
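
If it helps to see what the combined numbers amount to, here is a rough
plain-Java sketch of the aggregation, assuming every document lives in
exactly one sub-index. It is not Lucene API: Shard, totalDocFreq and
totalNumDocs are hypothetical names, and the figures in main are invented.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ShardStatsSketch {

    /** Hypothetical stand-in for one sub-index: its document count and per-term DF. */
    public static final class Shard {
        private final int numDocs;
        private final Map<String, Integer> docFreqs;

        public Shard(int numDocs, Map<String, Integer> docFreqs) {
            this.numDocs = numDocs;
            this.docFreqs = docFreqs;
        }

        public int numDocs() {
            return numDocs;
        }

        public int docFreq(String term) {
            return docFreqs.getOrDefault(term, 0); // a term absent from this shard has DF 0
        }
    }

    /** Collection-wide DF: sum of per-shard DFs (valid because shards share no documents). */
    public static int totalDocFreq(List<Shard> shards, String term) {
        int total = 0;
        for (Shard s : shards) {
            total += s.docFreq(term);
        }
        return total;
    }

    /** Collection-wide document count: sum of per-shard counts. */
    public static int totalNumDocs(List<Shard> shards) {
        int total = 0;
        for (Shard s : shards) {
            total += s.numDocs();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> a = new HashMap<>();
        a.put("lucene", 120);
        Map<String, Integer> b = new HashMap<>();
        b.put("lucene", 30);
        b.put("solr", 5);
        List<Shard> shards = Arrays.asList(
                new Shard(1000, a),
                new Shard(800, b),
                new Shard(500, new HashMap<>()));
        System.out.println(totalDocFreq(shards, "lucene")); // prints 150
        System.out.println(totalNumDocs(shards));           // prints 2300
    }
}
```

Simple addition is enough here only because the sub-indexes do not overlap;
a document indexed in two of them would be counted twice.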

>
> I hope it's not too much. Suddenly I find myself dealing with stuff I never
> dealt with before.


Much better than doing the same stuff day after day for life, ain't it?  :-)




--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ


