lucene-general mailing list archives

From Otis Gospodnetic <>
Subject Re: how to control the disk size of the indices
Date Tue, 25 Mar 2008 01:34:48 GMT
Hi Yannis,

I don't think there is anything of that sort in Lucene, but this shouldn't be hard to do with
a process outside Lucene.  Of course, optimizing an index increases its size temporarily,
so your external process would have to take that into account and play it safe.  You could
also set mergeFactor to its minimum value of 2 (Lucene rejects anything lower), which should
keep your index in a near-optimized state, with few segments, whether or not you do deletions.
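
As a rough illustration of the external process described above, here is a minimal Java sketch.
It uses only the JDK, not Lucene itself. The names IndexSizeGuard, sizeOnDisk, needsPruning and
headroomFactor are hypothetical, and the headroom factor is an assumption you would tune to how
much temporary space your merges and optimize calls actually need.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, not part of Lucene: watches an index directory from outside.
class IndexSizeGuard {

    // Sums the sizes, in bytes, of all regular files under indexDir.
    static long sizeOnDisk(Path indexDir) throws IOException {
        try (Stream<Path> files = Files.walk(indexDir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(IndexSizeGuard::sizeOrZero)
                        .sum();
        }
    }

    // A merge may delete a segment file between listing and sizing it;
    // count such a file as 0 bytes instead of failing.
    private static long sizeOrZero(Path file) {
        try {
            return Files.size(file);
        } catch (NoSuchFileException gone) {
            return 0L;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True when the current size, scaled by headroomFactor to leave room for the
    // temporary growth during an optimize or merge, would exceed maxBytes.
    static boolean needsPruning(long currentBytes, long maxBytes, double headroomFactor) {
        return (double) currentBytes * headroomFactor > (double) maxBytes;
    }
}
```

A scheduled job could call sizeOnDisk on the index directory, pass the result to needsPruning
with maxBytes set to 100L * 1024 * 1024 * 1024, and delete the oldest documents whenever it
returns true.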

You should discuss this on the java-user list, though, so I'm CCing that list, where you can
continue the discussion.

Sematext -- Lucene - Solr - Nutch

----- Original Message ----
From: Yannis Pavlidis <>
Sent: Monday, March 24, 2008 7:33:26 PM
Subject: how to control the disk size of the indices

Hi all,

I wanted to ask the list whether there is an easy and efficient way to manage the size (in
bytes) of a Lucene index stored on disk.

Basically, I would like to limit Lucene to storing only 100 GB of information. When Lucene
reaches that limit, I would delete documents (using an LRU algorithm based on timestamps), but
in no case should the disk space occupied by Lucene exceed 100 GB.

I experimented with Lucene 2.3.1, and the only way I could accomplish that was by calling the
optimize method on the IndexWriter (after the index size exceeded the max size). I was looking
for a more performant way, perhaps by controlling when Lucene merges the segments, so as not to
exceed the pre-set limit.

Any ideas or suggestions would be highly appreciated.

Thanks in advance,

