lucene-solr-user mailing list archives

From Charles Wardell <>
Subject Question on Batch process
Date Tue, 26 Apr 2011 18:32:29 GMT
I am sure this question has been asked a few times, but I can't seem to find the sweet spot
for indexing.

I have about 100,000 files, each containing 1,000 XML documents (roughly 100 million documents
in total), ready to be posted to Solr. I want this initial load to index as quickly as possible.
Once it has completed, the daily stream of ADDs will be small in comparison.
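
To make the bulk load concrete, here is a minimal sketch of one way the files could be posted, not a tuned or recommended setup. It assumes each file is already a complete Solr `<add>...</add>` XML message, that Solr's XML update handler is reachable at `http://localhost:8983/solr/update`, and that the files live under a hypothetical `batches/` directory. The worker count of 8 and the single commit at the end are assumptions as well.

```python
import glob
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed values: adjust the URL, directory and worker count to the real setup.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"
BATCH_GLOB = "batches/*.xml"   # hypothetical location of the 100,000 files
WORKERS = 8                    # assumption: one posting thread per logical CPU


def build_request(body, url=SOLR_UPDATE_URL):
    """Build an HTTP POST carrying an XML update message."""
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def post_file(path):
    """Post one file (assumed to hold a full <add> message); return the HTTP status."""
    with open(path, "rb") as fh:
        body = fh.read()
    with urllib.request.urlopen(build_request(body)) as resp:
        return resp.status


def index_all(paths, workers=WORKERS, post=post_file):
    """Post all files concurrently and return their statuses in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(post, paths))


def commit():
    """Send a single <commit/> after the whole load instead of one per file."""
    with urllib.request.urlopen(build_request(b"<commit/>")) as resp:
        return resp.status


if __name__ == "__main__":
    files = sorted(glob.glob(BATCH_GLOB))
    index_all(files)
    commit()
```

The `post` argument of `index_all` exists only so the posting step can be swapped out, for example for a dry run that does not touch a live server.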

The individual documents are small. Essentially web postings from the net. Title, postPostContent,

What would be the ideal configuration for ramBufferSizeMB, mergeFactor, maxBufferedDocs, etc.?

My machine is a quad-core with hyper-threading, so it shows up as 8 CPUs in top.
I have 16 GB of available RAM.

Thanks in advance.