lucene-java-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Michael McCandless <luc...@mikemccandless.com>
Subject Re: Lucene Merge failing on Open Files
Date Tue, 05 Apr 2011 09:46:13 GMT
Yeah, that mergeFactor is way too high and will cause
too-many-open-files (if the index has enough segments).

Also, you should setRamBufferSizeMB instead of maxBufferedDocs, for
faster index throughput.

Calling optimize from two threads doesn't help it run faster when
using ConcurrentMergeScheduler (the default).  Ie with CMS, optimize
simply waits for CMS to perform all the necessary merges.

Mike

http://blog.mikemccandless.com

On Mon, Apr 4, 2011 at 4:06 PM, Simon Willnauer
<simon.willnauer@googlemail.com> wrote:
> On Mon, Apr 4, 2011 at 9:59 PM, Paul Taylor <paul_t100@fastmail.fm> wrote:
>> On 04/04/2011 20:13, Michael McCandless wrote:
>>>
>>> How are you merging these indices?  (IW.addIndexes?).
>>>
>>> Are you changing any of IW's defaults, eg mergeFactor?
>>>
>>> Mike
>>
>> Hi Mike
>>
>> I have
>>
>> indexWriter.setMaxBufferedDocs(10000);
>> indexWriter.setMergeFactor(3000);
>>
>
> I didn't read though the entire email but MergeFactor 3000 doesn't
> look right at all. You should try something between 5 and 30. I
> haven't seen a mergeFactor > 50 doing any good in a common env. Why do
> you use such a large factor?
>
> simon
>> these are a hangover from earlier code, I tried changing them and it didnt
>> seem to make any difference, but do they look wrong ?
>>
>> The index that falls over during optmizationcreates about 10,000,000 records
>>
>> What I do is build 10 different indexes sequentially one after the after (so
>> only one is hitting the db at any one time) but once the index is built I
>> optimize the index in the background
>> by creating an instance of the following class and submitting it to  an
>> ExecutionService , configured to have at most 2 threads acting
>>
>> I built my own class rather than just using optimize() with the background
>> option because that wouldnt allow me to do the necessary
>> debugging/calculations
>>
>>
>> static class IndexWriterOptimizerAndClose implements Callable<Boolean>
>>    {
>>        private int             maxId;
>>        private IndexWriter     indexWriter;
>>        private DatabaseIndex   index;
>>        private IndexOptions    options;
>>
>>        /**
>>         *
>>         * @param maxId
>>         * @param indexWriter
>>         * @param index
>>         * @param options
>>         */
>>        public IndexWriterOptimizerAndClose(int maxId, IndexWriter
>> indexWriter, DatabaseIndex index, IndexOptions options)
>>        {
>>            this.maxId=maxId;
>>            this.indexWriter= indexWriter;
>>            this.index=index;
>>            this.options=options;
>>        }
>>
>>        public Boolean call() throws IOException, SQLException
>>        {
>>
>>            StopWatch clock = new StopWatch();
>>            clock.start();
>>            String path = options.getIndexesDir() + index.getFilename();
>>            System.out.println(index.getName()+":Started Optimization at
>> "+Utils.formatCurrentTimeForOutput());
>>
>>            indexWriter.optimize();
>>            indexWriter.close();
>>            clock.stop();
>>            // For debugging to check sql is not creating too few/many rows
>>            if(true) {
>>                int dbRows = index.getNoOfRows(maxId);
>>                IndexReader reader = IndexReader.open(FSDirectory.open(new
>> File(path)),true);
>>                System.out.println(index.getName()+":"+dbRows+" db
>> rows:"+(reader.maxDoc() - 1)+" lucene docs");
>>                reader.close();
>>            }
>>            System.out.println(index.getName()+":Finished Optimization:" +
>> Utils.formatClock(clock));
>>
>>            return true;
>>        }
>>    }
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: java-user-help@lucene.apache.org
>>
>>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Mime
View raw message