lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Toke Eskildsen ...@statsbiblioteket.dk>
Subject Re: deleting 8,000,000 indexes takes forever!!!! any solution to this...
Date Wed, 06 Jul 2011 08:10:53 GMT
On Tue, 2011-07-05 at 17:50 +0200, Hiller, Dean x66079 wrote:
> We are using a sort of nosql environment and deleting 200 gig on one machine from the
database is fast, but then we go and delete 5 gigs of indexes that were created and it takes
forever!!!!

8 million indexes is at a minimum 16 (24?) million files. If you are
using a conventional harddisk for that, then yes, it takes forever.
SSD is the answer to that problem, but then again, SSD is the answer to
most IO-performance problems.

Just a quick sanity check: I hope you are not storing the individual
index folders under the same root folder? If you have
indexes/index0000001/
indexes/index0000002/
indexes/index0000003/
...
indexes/index8000000/
in the same folder, you are asking for trouble since most file systems
don't perform well with folders with millions of entries. If that is the
case, split them in sub folders for every X order of magnitude, such as
indexes/000/000/index0000001/
indexes/000/000/index0000002/
indexes/000/000/index0000003/
...
indexes/000/500/index0500001/
...
indexes/008/000/index8000000/


Having that many tiny indexes sets off an alarm bell for me. That's
quite a special use of Lucene you've got going.

> Is there any option in lucene to make it so it uses LARGER files and less count of files
so it is easier to maintain and wipe out an index much faster?

Use compound files, optimize to single segment. If I understand
correctly, your indexes are tiny, so this should not give any noticeable
performance hit.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message