lucene-java-user mailing list archives

From Chris Collins <>
Subject Re: Optimizing indexes with multiple processors?
Date Fri, 10 Jun 2005 20:30:35 GMT
Don't forget that when a document is indexed it starts life in its own segment.
If you have a min merge of 4k you could have an awful lot of 1-doc segments on
the segment stack; that's why I run out of memory. If each of those at some
point holds a buffer of 8k, or say 64k, you blow up pretty quickly.
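
To put rough numbers on that worry, here is a minimal sketch of the arithmetic. It is not Lucene code. It assumes "4k" means 4,000 pending one-document segments, that each segment holds exactly one buffer, and that "8k" and "64k" mean 8,192 and 65,536 bytes. The class and method names are hypothetical.

```java
public class SegmentBufferEstimate {

    /**
     * Worst-case bytes held if every pending segment owns
     * buffersPerSegment buffers of bufferBytes each.
     */
    public static long worstCaseBytes(long pendingSegments,
                                      long buffersPerSegment,
                                      long bufferBytes) {
        return pendingSegments * buffersPerSegment * bufferBytes;
    }

    public static void main(String[] args) {
        long segments = 4_000;                 // assumption: "min merge of 4k"
        long heap = 512L * 1024 * 1024;        // the 1/2 GB heap mentioned in the thread
        long[] bufferSizes = {8 * 1024, 64 * 1024};

        for (long buf : bufferSizes) {
            // assumption: one buffer per one-document segment
            long bytes = worstCaseBytes(segments, 1, buf);
            System.out.printf("%d-byte buffers: %d bytes (%.1f%% of heap)%n",
                    buf, bytes, 100.0 * bytes / heap);
        }
    }
}
```

Under those assumptions the buffers alone come to about 31 MiB at 8k and 250 MiB at 64k, and the figure multiplies again if each segment holds more than one buffer.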



--- "Peter A. Friend" <> wrote:

> On Jun 10, 2005, at 9:33 AM, Chris Collins wrote:
> > How many documents did you try to index?
> Only about 4000 at the moment.
> >   I am using a relatively large
> > minMergeDoc that causes me to run out of memory when I make such a  
> > change. (I
> > am using 1/2 gb of heap btw).
> I was running out of memory as well until I gave Java a larger heap  
> to work with. I am assuming that a dedicated indexing machine (as  
> well as search) is going to need a mountain of memory. I figure I  
> will be giving Java gigs to play with.
> > I believe changing it in the OutputStream object
> > means that a lot of in-memory-only objects use that size too.
> This I need to look into. At a guess, I would think that there would  
> be an OutputStream object for each open segment, and each file in  
> that segment. A consolidated index *might* use less but of course we  
> are trying to improve performance here, and the consolidated index  
> does incur a cost. Assuming 10 segments and 10 files within each  
> segment, that's 100 OutputStream objects or 809,600 bytes. That'll  
> grow quickly with merge tweaks such as maxMerge. Those larger writes  
> do save a bunch of system calls and make (maybe) better use of your  
> filer's block size. Of course this could be utterly incorrect; I need  
> to look into this a bit more carefully.
> > I don't know that I would have used truss for this; it only shows
> > what size hit the kernel, not what went over the wire. I would
> > suggest using ethereal to confirm that's how it ends up on the wire.
> True, hadn't gotten that far yet. :-)
> Peter
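
Peter's back-of-the-envelope figure can be reproduced with a short sketch. It is not Lucene code. It assumes one OutputStream buffer per file per open segment, as he guesses above. The per-stream size of 8,096 bytes is not stated in the message; it is only what 809,600 bytes divided by 100 streams implies. The class and method names are hypothetical.

```java
public class OutputStreamEstimate {

    /** Number of open streams, assuming one per file in each open segment. */
    public static long openStreams(int segments, int filesPerSegment) {
        return (long) segments * filesPerSegment;
    }

    /** Total buffer bytes across all open streams. */
    public static long bufferBytes(int segments, int filesPerSegment,
                                   int bytesPerStream) {
        return openStreams(segments, filesPerSegment) * bytesPerStream;
    }

    public static void main(String[] args) {
        int filesPerSegment = 10;   // figure used in the message
        int bytesPerStream = 8096;  // assumption: inferred from 809,600 / 100

        // Show how the total scales as more segments are open at once.
        for (int segments : new int[] {10, 100, 1000}) {
            System.out.printf("%d segments -> %d streams, %d bytes%n",
                    segments,
                    openStreams(segments, filesPerSegment),
                    bufferBytes(segments, filesPerSegment, bytesPerStream));
        }
    }
}
```

The total is linear in the number of open segments, so any merge setting that leaves ten times as many segments open costs ten times the buffer memory under this model.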
