lucene-java-user mailing list archives

From Istvan Soos <>
Subject best practice on too many files vs IO overhead
Date Fri, 27 Nov 2009 09:23:01 GMT

I have a requirement that involves frequent, batched updates of my Lucene
index. This is done via an in-memory queue and a process that periodically
wakes up and writes the queue's contents into the Lucene index.
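
The setup above can be sketched roughly as follows. This is a minimal illustration of the queue-drain pattern only, with the actual Lucene indexing left as a stub; the class and method names are my own, not from any real API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchIndexer {
    // Producers enqueue pending documents here between batch cycles.
    private final LinkedBlockingQueue<String> queue =
            new LinkedBlockingQueue<String>();

    public void enqueue(String doc) {
        queue.add(doc);
    }

    // Called by the periodic wake-up thread: drain everything that has
    // accumulated and hand it off as one batch (the real code would add
    // the batch to an IndexWriter here).
    public List<String> drainBatch() {
        List<String> batch = new ArrayList<String>();
        queue.drainTo(batch);
        return batch;
    }

    public static void main(String[] args) {
        BatchIndexer indexer = new BatchIndexer();
        indexer.enqueue("doc1");
        indexer.enqueue("doc2");
        // One wake-up cycle picks up both pending documents.
        System.out.println(indexer.drainBatch().size());
    }
}
```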

If I do not optimize my index, I eventually get a "too many open files"
exception (yes, I can raise the OS's file-descriptor limit a bit, but that
only postpones the exception).
If I do optimize my index, I incur a very large I/O overhead, since
optimizing re-reads and rewrites the whole index.
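
As a rough back-of-the-envelope on why the descriptor limit gets hit (the numbers are assumptions for illustration, not measurements): each un-merged, non-compound segment is several files on disk, so the open-file count grows with the segment count:

```java
public class FileCountEstimate {
    public static void main(String[] args) {
        // Assumed figures: a non-compound segment is on the order of
        // ten files, and frequent small batch flushes can leave many
        // small segments behind before merging catches up.
        int filesPerSegment = 10;  // assumption
        int segments = 100;        // e.g. many small flushed batches
        System.out.println(segments * filesPerSegment);
    }
}
```

With a typical per-process limit of 1024 descriptors, an index like this alone can exhaust it.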

Right now I optimize the index on each batch cycle, but as my index
quickly grows to around 1 GB, I see significant overhead in the I/O
operations. Updates must happen frequently (1-10 times per minute), so
I'm looking for advice on how to solve this. I could split the index,
but then I'd hit "too many open files" even sooner, and in the end the
I/O overhead would remain...

Any suggestions?
