accumulo-user mailing list archives

From Josh Elser
Subject Running Continuous Ingest on small cluster
Date Thu, 20 Dec 2012 01:10:06 GMT
In playing around with the continuous ingest collection of code (ingest, 
walkers, batchwalkers, scanners and agitators), I found myself blindly 
guessing at how many of each of these processes I should use.

Are there some generic thoughts as to what might be an ideal saturation 
point for N tservers?

I initially split my hosts four ways and ran N/4 of each process (ingest, 
walkers, batchwalkers, and scanners), ratcheting down the number of 
threads for ingest and batchwalkers to avoid saturating CPU and memory. 
Should I try to balance (query threads * query clients) + (ingest 
threads * ingest clients) against the available threads per host, and 
size the BatchWriter send buffers similarly against the available memory?
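
To make the balance I am asking about concrete, here is a minimal sketch of the arithmetic in Python. It only restates the formula above; it is not an Accumulo API or a tuning recommendation. The names (ClientLoad, thread_demand, buffer_demand_bytes, fits) and all of the numbers are hypothetical, and it assumes each ingest client holds a single BatchWriter send buffer.

```python
from dataclasses import dataclass


@dataclass
class ClientLoad:
    """Client processes placed on one host (all fields are illustrative)."""
    query_clients: int
    query_threads: int      # threads per query client
    ingest_clients: int
    ingest_threads: int     # threads per ingest client
    send_buffer_bytes: int  # assumed: one BatchWriter send buffer per ingest client


def thread_demand(load: ClientLoad) -> int:
    # (query threads * query clients) + (ingest threads * ingest clients)
    return (load.query_threads * load.query_clients
            + load.ingest_threads * load.ingest_clients)


def buffer_demand_bytes(load: ClientLoad) -> int:
    # Total memory tied up in send buffers on this host.
    return load.ingest_clients * load.send_buffer_bytes


def fits(load: ClientLoad, threads_per_host: int, buffer_budget_bytes: int) -> bool:
    # True when both thread demand and buffer demand stay within the host's budget.
    return (thread_demand(load) <= threads_per_host
            and buffer_demand_bytes(load) <= buffer_budget_bytes)


# Hypothetical host: 16 usable threads, 256 MiB set aside for send buffers.
load = ClientLoad(query_clients=2, query_threads=4,
                  ingest_clients=1, ingest_threads=3,
                  send_buffer_bytes=50_000_000)
print(thread_demand(load), buffer_demand_bytes(load),
      fits(load, threads_per_host=16, buffer_budget_bytes=256 * 1024 * 1024))
```

Whether this kind of budget is the right thing to balance against, or whether tserver-side limits matter more, is exactly what I am unsure about.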

I appreciate anyone's insight.

- Josh
