lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Peter A. Friend" <>
Subject Re: Optimizing indexes with mulitiple processors?
Date Fri, 10 Jun 2005 17:09:16 GMT

On Jun 10, 2005, at 9:33 AM, Chris Collins wrote:

> How many documents did you try to index?

Only about 4000 at the moment.

>   I am using a relatively large
> minMergeDoc that causes me to run out of memory when I make such a  
> change. (I
> am using 1/2 gb of heap btw).

I was running out of memory as well until I gave Java a larger heap  
to work with. I am assuming that a dedicated indexing machine (as  
well as search) is going to need a mountain of memory. I figure I  
will be giving Java gigs to play with.

> I believe changing it in the outputstream object
> means that a lot of in memory only objects use that size too.

This I need to look into. At a guess, I would think that there would  
be an OutputStream object for each open segment, and each file in  
that segment. A consolidated index *might* use less but of course we  
are trying to improve performance here, and the consolidated index  
does incur a cost. Assuming 10 segments and 10 files within each  
segment, that's 100 OutputStream objects or 809,600 bytes. That'll  
grow quickly with merge tweaks. Those larger writes do save a bunch  
of system calls and make (maybe) better use of your filers block  
size. This grows quickly with maxMerge tweaks. Of course this could  
be utterly incorrect, I need to look into this a bit more carefully.

> I dont know I would of used truss in this regard, this only points  
> out what
> size hit the kernel not what went over the wire.  I would suggest  
> using
> ethereal to ensure thats how its ending up on the wire.

True, hadn't gotten that far yet. :-)


To unsubscribe, e-mail:
For additional commands, e-mail:

View raw message