accumulo-user mailing list archives

From Andrew Hulbert <ahulb...@ccri.com>
Subject Tweaking non-bulk Ingest Performance
Date Wed, 14 Oct 2015 16:56:42 GMT
Hi all,

I've been attempting to improve a streaming ingest client into Accumulo 
and have been playing with a few of the following settings:

tserver.memory.maps.max (and in tandem 
table.compaction.minor.logs.threshold and tserver.wal.blocksize)
tserver.mutation.queue.max
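
A rough sketch of why the memory-map and write-ahead-log settings would be changed in tandem. It assumes a tablet is forced to minor-compact once it references more than table.compaction.minor.logs.threshold write-ahead logs, so the in-memory map only gets to fill if the logs can hold at least as much data. The function names, the wal_size parameter, and the example values (threshold of 3, 1G per log) are illustrative and are not the settings used in the tests described here.

```python
# Rough sizing check. All names are illustrative; none are Accumulo APIs.

UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a size string such as '512M' or '4G' to bytes."""
    text = text.strip().upper()
    if text[-1] in UNITS:
        return int(text[:-1]) * UNITS[text[-1]]
    return int(text)


def wal_capacity(logs_threshold, wal_size):
    """Bytes of write-ahead log a tablet can reference before the
    log-count threshold is assumed to force a minor compaction."""
    return logs_threshold * parse_size(wal_size)


def map_can_fill(maps_max, logs_threshold, wal_size):
    """True if the assumed WAL capacity is at least the in-memory map size."""
    return wal_capacity(logs_threshold, wal_size) >= parse_size(maps_max)


if __name__ == "__main__":
    # Example values only: a threshold of 3 logs at 1G each.
    for maps_max in ("512M", "1G", "2G", "4G"):
        ok = map_can_fill(maps_max, logs_threshold=3, wal_size="1G")
        print(maps_max, "map can fill before log threshold:", ok)
```

Under these assumed values, a 4G map would be flushed by the log-count threshold before it fills, so raising tserver.memory.maps.max alone would not necessarily reduce the number of minor compactions.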

In one set of tests I stood up ~200 batch writers and wrote approximately
250M tweets into a couple of different index schemas. What I've noticed is
that increasing tserver.memory.maps.max from 1G to 2G or 4G actually
slows down my ingest rate. Cutting it to 512M forced lots of compactions
and high server load, but gave a faster ingest.

I attached a screenshot of the two ingests (the
tserver.mutation.queue.max=4G run is in green) (33 nodes, -Xmx26G, 8 CPU, 4 SSDs).

My question is whether anyone has done any performance tweaking for
non-bulk ingest on a cluster and understands why that'd be the case.
I've read through all the docs, etc., but haven't found a consistent
methodology for tweaking params, so I was wondering if anyone else had
attempted to tune a cluster like this.

Thanks for any ideas!

Andrew
