lucene-solr-user mailing list archives

From Per Steffensen <st...@designware.dk>
Subject Re: Storing/indexing speed drops quickly
Date Thu, 12 Sep 2013 07:50:36 GMT
Maybe the fact that we never delete or update individual documents
can be exploited somehow. If we delete anything, we delete entire
collections.

Regards, Per Steffensen

On 9/12/13 8:25 AM, Per Steffensen wrote:
> Hi
>
> SolrCloud 4.0: 6 machines, quad-core, 8GB RAM, 1TB disk, one Solr node
> on each, one collection across the 6 nodes, 4 shards per node
> Storing/indexing from 100 threads on external machines, each thread
> sending one doc at a time, at full speed (they always have a new doc
> to store/index)
> See attached images
> * iowait.png: Measured I/O wait on the Solr machines
> * doccount.png: Measured number of docs in the Solr collection
>
> Starting from an empty collection. Things are fine wrt
> storing/indexing speed for the first two to three hours (100M docs per
> hour), then speed drops dramatically, to a level that is unacceptable
> for us (max 10M per hour). At the same time as speed drops, we see
> that I/O wait increases dramatically. I am not 100% sure, but a quick
> investigation suggests that this is due to almost constant merging.
>
> What to do about this problem?
> I know that you can play around with mergeFactor and commit rate, but
> earlier tests show that this does not really seem to do the job - it
> might postpone the time when the problem occurs, but basically it is
> just a matter of time before merging exhausts the system.
> Is there a way to avoid merging entirely and keep indexing speed at a
> high level, while still making sure that searches perform fairly
> well when the amount of data becomes big? (I guess that without merging
> you will end up with lots and lots of "small" files, and I guess this
> is not good for search response time)
>
> Regards, Per Steffensen
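
To make the merging trade-off in the quoted message concrete, here is a minimal sketch of a toy logarithmic merge model. It is not Lucene's or Solr's actual merge policy, and it says nothing about real I/O behaviour; the function name `simulate_log_merge` and its parameters are hypothetical, chosen only for illustration. The assumption is that every flush creates one small segment and that `merge_factor` segments of the same level are merged into one segment of the next level, rewriting all of their documents.

```python
def simulate_log_merge(num_flushes, merge_factor, flush_size_docs=1):
    """Toy model of logarithmic merging (an assumption, not Lucene's real policy).

    Returns (live_segments, merges, docs_rewritten) after num_flushes flushes.
    """
    # levels[i] = number of live segments at level i; a level-i segment
    # holds flush_size_docs * merge_factor**i documents.
    levels = [0]
    merges = 0
    docs_rewritten = 0

    for _ in range(num_flushes):
        levels[0] += 1  # each flush adds one new small segment
        level = 0
        # Cascade: whenever a level is full, merge it into one bigger segment.
        while levels[level] == merge_factor:
            levels[level] = 0
            if level + 1 == len(levels):
                levels.append(0)
            levels[level + 1] += 1
            # The merge rewrites every document of the merged segments.
            docs_rewritten += flush_size_docs * merge_factor ** (level + 1)
            merges += 1
            level += 1

    return sum(levels), merges, docs_rewritten


if __name__ == "__main__":
    flushes = 1000
    for mf in (2, 10, 50):
        segments, merges, rewritten = simulate_log_merge(flushes, mf)
        print(f"merge_factor={mf}: segments={segments}, merges={merges}, "
              f"rewrites per doc={rewritten / flushes:.2f}")
```

In this model a higher merge factor lowers the number of times each document is rewritten, but leaves more segments for searches to visit, and the rewrite cost per document still grows with index size. That matches the observation that tuning mergeFactor may postpone the slowdown rather than remove it.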

