accumulo-user mailing list archives

From Peter Tillotson <slatem...@yahoo.co.uk>
Subject Re: WAL - rate limiting factor x4.67
Date Wed, 04 Dec 2013 16:35:42 GMT
I've 3 tables, each with a BatchWriter splitting 16M buffers across 8 threads. So up to 24
peak concurrent write threads, normally of the order of 10 actually concurrent. I'm not too worried
for the moment; increasing mutation.queue.max feels like an unsustainable workaround, so I'm
better off dumping my own walog in the app layer. It has other benefits too, in terms of easy
replay etc.
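
For reference, a minimal sketch of the kind of per-table writer setup I mean (the connector, table name, and latency value are illustrative assumptions, not the real app code):

    import java.util.concurrent.TimeUnit;
    import org.apache.accumulo.core.client.BatchWriter;
    import org.apache.accumulo.core.client.BatchWriterConfig;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.TableNotFoundException;

    public class WriterSetup {
        // One writer per table: a 16M client-side buffer split across 8 send threads.
        static BatchWriter createWriter(Connector connector, String table)
                throws TableNotFoundException {
            BatchWriterConfig cfg = new BatchWriterConfig()
                .setMaxMemory(16 * 1024 * 1024)        // 16M buffer, as described above
                .setMaxWriteThreads(8)                 // 8 send threads per writer
                .setMaxLatency(2, TimeUnit.MINUTES);   // latency is an illustrative value, not from the thread
            return connector.createBatchWriter(table, cfg);
        }
    }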

I don't know if it plateaued - if I get a chance I may take a look, but still, given the above,
I'm not a fan of that approach.

Re no compression sending data between nodes - I know, I was looking to see if I could make
the Snappy approach work nicely with ZFS dedup. I suspect it is a case of fine tuning io.file.buffer.size,
table.file.compress.blocksize, and the ZFS recordsize (128K): if the compressed block is
too big, the likelihood of dupes goes down and the benefit on merge vanishes; too small, and there is minimal
compression.
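
A hedged sketch of the table-side half of that tuning from the Java API (the table name and the 64K value are illustrative; the ZFS recordsize itself is set on the dataset, outside Accumulo):

    import org.apache.accumulo.core.client.AccumuloException;
    import org.apache.accumulo.core.client.AccumuloSecurityException;
    import org.apache.accumulo.core.client.Connector;

    public class CompressionTuning {
        // Shrink the RFile compression block so compressed blocks stay at or under the
        // ZFS recordsize (128K) and dedup has a chance of matching blocks on merge.
        static void tune(Connector connector, String table)
                throws AccumuloException, AccumuloSecurityException {
            connector.tableOperations().setProperty(table, "table.file.compress.type", "snappy");
            connector.tableOperations().setProperty(table, "table.file.compress.blocksize", "64K"); // illustrative value
        }
    }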

Thanks again all. 




On Wednesday, 4 December 2013, 16:09, Keith Turner <keith@deenlo.com> wrote:
 
How many concurrent writers do you have?  I made some other comments below inline.




On Wed, Dec 4, 2013 at 10:53 AM, Peter Tillotson <slatemine@yahoo.co.uk> wrote:

> Keith
>
> I tried tserver.mutation.queue.max=4M and it improved, but by nowhere near a significant
> difference. In my app, records get turned into multiple Accumulo rows.
>
> So in terms of my record write rate:
>
> wal=true  & mutation.queue.max = 256K   |   ~8K records/s
> wal=true  & mutation.queue.max = 4M     |   ~14K records/s

Do you know if it's plateaued?  If you increase this further (like 8M), is the rate the same?
 
> wal=false                               |   ~25K records/s
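
For reference, a sketch of bumping that setting from the client API; this assumes tserver.mutation.queue.max can be overridden at runtime via ZooKeeper - if it can't, it belongs in accumulo-site.xml and needs a tserver restart:

    import org.apache.accumulo.core.client.AccumuloException;
    import org.apache.accumulo.core.client.AccumuloSecurityException;
    import org.apache.accumulo.core.client.Connector;

    public class MutationQueueTuning {
        // Raise the per-tserver mutation queue from the 256K default toward 4M,
        // which was the change measured above (~8K -> ~14K records/s).
        static void raiseQueueMax(Connector connector)
                throws AccumuloException, AccumuloSecurityException {
            connector.instanceOperations().setProperty("tserver.mutation.queue.max", "4M");
        }
    }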
>
>
> Adam,
>
> It's one box so replication is off, good thought tnx.
>
> BTW - I've been playing around with ZFS compression vs Accumulo Snappy. What I've found
> was quite interesting. The idea was that with ZFS dedup and being in charge of compression
> I'd get a boost later on when blocks merge. What I've found is that after a while with ZFS
> LZ4 the CPU and disk all tail off, as though timeouts are elapsing somewhere, whereas Snappy
> maintains an average ~20k+.

W/ this strategy the data will not be compressed when going between the tserver and datanode
OR the datanode and OS.  
 

>
> Anyway tnx and if I get a chance I may try the 1.7 branch for the fix.

Nothing was done in 1.7 for this issue yet.
 
              
>
>
>
>
>On Wednesday, 4 December 2013, 14:56, Adam Fuchs <afuchs@apache.org> wrote:
> 
> One thing you can do is reduce the replication factor for the WAL. We have found that
> makes a pretty significant difference in write performance. That can be modified with the tserver.wal.replication
> property. Setting it to 2 instead of the default (probably 3) should give you some performance
> improvement, of course at some cost to durability.
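
A sketch of what that change might look like from the Java API (assuming tserver.wal.replication can be set as a system-wide override via InstanceOperations; the value 2 is just the suggestion above):

    import org.apache.accumulo.core.client.AccumuloException;
    import org.apache.accumulo.core.client.AccumuloSecurityException;
    import org.apache.accumulo.core.client.Connector;

    public class WalReplicationTuning {
        // Drop WAL replication from the HDFS default (typically 3) to 2,
        // trading a little durability for write throughput.
        static void reduceWalReplication(Connector connector)
                throws AccumuloException, AccumuloSecurityException {
            connector.instanceOperations().setProperty("tserver.wal.replication", "2");
        }
    }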
>
>
>Adam
>
>
>
>On Wed, Dec 4, 2013 at 5:14 AM, Peter Tillotson <slatemine@yahoo.co.uk> wrote:
>
>> I've been trying to get the most out of streaming data into Accumulo 1.5 (Hadoop Cloudera
>> CDH4). Having tried a number of settings, re-writing client code etc., I finally switched off
>> the Write Ahead Log (table.walog.enabled=false) and saw a huge leap in ingest performance.
>>
>>
>> Ingest with table.walog.enabled=true:   ~6 MB/s
>> Ingest with table.walog.enabled=false:  ~28 MB/s
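
For reference, a sketch of toggling the walog per table from the Java API (the table name is illustrative; with the walog off, unflushed mutations are lost if a tserver dies, so this only suits re-loadable data):

    import org.apache.accumulo.core.client.AccumuloException;
    import org.apache.accumulo.core.client.AccumuloSecurityException;
    import org.apache.accumulo.core.client.Connector;

    public class WalogToggle {
        // Disable the write-ahead log for one table; mutations not yet flushed to
        // RFiles are not recoverable after a tserver failure.
        static void disableWalog(Connector connector, String table)
                throws AccumuloException, AccumuloSecurityException {
            connector.tableOperations().setProperty(table, "table.walog.enabled", "false");
        }
    }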
>>
>>
>>
>>That is a factor of about x4.67 speed improvement. 
>>
>>
>> Now my use case could probably live without, or work around not having, a wal, but I
>> wondered if this was a known issue? (I didn't see anything in Jira.) The wal seems to be a
>> significant rate limiter; this is either endemic to Accumulo or an HDFS / setup issue. Though,
>> given everything is in HDFS these days and otherwise IO flies, it looks like the Accumulo WAL
>> is the most likely culprit.
>>
>>
>> I don't believe this to be an IO issue on the box: with wal off there is significantly
>> more IO (up to 80M/s reported by dstat) than with wal on (up to 12M/s reported by dstat). Testing
>> the box with FIO, sequential write is 160M/s.
>>
>>
>> Further info:
>> Hadoop 2.0.0 (Cloudera CDH4)
>> Accumulo 1.5.0
>> ZooKeeper (with Netty, minor improvement of <1MB/s)
>> Filesystem (HDFS is on ZFS, compression=on, dedup=on, otherwise ext4)
>>
>>
>> With large imports from scratch, I now start off CPU bound and, as more shuffling is
>> needed, this becomes disk bound later in the import, as expected. So I know pre-splitting would
>> probably sort it.
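
A sketch of pre-splitting from the Java API (the split points are made-up placeholders; real ones would come from the key design):

    import java.util.SortedSet;
    import java.util.TreeSet;
    import org.apache.accumulo.core.client.AccumuloException;
    import org.apache.accumulo.core.client.AccumuloSecurityException;
    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.TableNotFoundException;
    import org.apache.hadoop.io.Text;

    public class PreSplit {
        // Add split points up front so ingest spreads across tablets from the start,
        // instead of starting CPU bound on one tablet and going disk bound later.
        static void preSplit(Connector connector, String table)
                throws AccumuloException, AccumuloSecurityException, TableNotFoundException {
            SortedSet<Text> splits = new TreeSet<Text>();
            for (String prefix : new String[] {"2", "4", "6", "8", "a", "c", "e"}) {
                splits.add(new Text(prefix));  // hypothetical split points
            }
            connector.tableOperations().addSplits(table, splits);
        }
    }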
>>
>>
>>Tnx 
>>
>>
>>P
>
>
>