lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Rafael Cunha de Almeida <almeida...@gmail.com>
Subject Re: Benchmarking my indexer
Date Sun, 02 Nov 2008 23:06:56 GMT
On Sun, 2 Nov 2008 07:11:20 -0500
Grant Ingersoll <gsingers@apache.org> wrote:

> 
> On Nov 1, 2008, at 1:39 AM, Rafael Cunha de Almeida wrote:
> 
> > Hello,
> >
> > I did an indexer that parses some files and indexes them using  
> > lucene. I
> > want to benchmark the whole thing, so I'd like to count the tokens
> > being indexed so I can calculate the average number of indexed tokens
> > per second. Is there a way to count the number of tokens on a  
> > document?
> 
> I think you would have to add a "CountingTokenFilter", that you write  
> and manage as you add documents.  Or, you could just take the total #  
> of tokens / by the number of docs and use the average.  That can be  
> obtained w/o writing a new TokenFilter.

How would I obtain the total number of tokens on an index? I couldn't
find that statistic anywhere. I looked for it on IndexWritter,
IndexReader and IndexSearcher classes. Is there maybe some tool I'd run
on a index or something like that?
> 
> >
> > While I'm at it, I will also need to calculate the amount of memory my
> > java program used (peak, avg, etc), what java tool would you suggest  
> > me
> > to figure that out?
> 
> 
> Would JConsole work: http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html

>    help?  I'm not sure what people use here
> 

Will look into it :-)

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message