Subject: Re: WAL - rate limiting factor x4.67
From: Keith Turner
To: user@accumulo.apache.org, Peter Tillotson
Date: Wed, 4 Dec 2013 11:09:12 -0500

How many concurrent writers do you have?  I made some other comments below
inline.

On Wed, Dec 4, 2013 at 10:53 AM, Peter Tillotson wrote:

> Keith
>
> I tried tserver.mutation.queue.max=4M and it improved, but the difference
> was nowhere near significant. In my app, records get turned into multiple
> Accumulo rows.
>
> So in terms of my record write rate:
>
> wal=true  & mutation.queue.max = 256K   |   ~8K records/s
> wal=true  & mutation.queue.max = 4M     |   ~14K records/s

Do you know if it's plateaued?  If you increase this further (like 8M), is
the rate the same?
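(For readers following the thread: the client-side concurrency Keith is
asking about is configured through BatchWriterConfig. Below is a minimal,
illustrative sketch of a throughput-oriented writer; the instance name,
ZooKeeper host, credentials, table name, and buffer sizes are placeholders,
not values taken from this thread.)

import java.util.concurrent.TimeUnit;

import org.apache.accumulo.core.client.BatchWriter;
import org.apache.accumulo.core.client.BatchWriterConfig;
import org.apache.accumulo.core.client.Connector;
import org.apache.accumulo.core.client.ZooKeeperInstance;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.accumulo.core.data.Mutation;
import org.apache.accumulo.core.data.Value;
import org.apache.hadoop.io.Text;

public class IngestSketch {
  public static void main(String[] args) throws Exception {
    // Placeholder connection details -- substitute your own.
    Connector conn = new ZooKeeperInstance("myInstance", "zkhost:2181")
        .getConnector("ingestUser", new PasswordToken("secret"));

    // A larger client-side buffer and more send threads generally help
    // ingest throughput; these numbers are starting points, not tuned values.
    BatchWriterConfig cfg = new BatchWriterConfig()
        .setMaxMemory(64 * 1024 * 1024)        // 64 MB mutation buffer
        .setMaxLatency(2, TimeUnit.MINUTES)    // flush at least this often
        .setMaxWriteThreads(8);                // concurrent sends to tservers

    BatchWriter bw = conn.createBatchWriter("myTable", cfg);
    Mutation m = new Mutation(new Text("row_0001"));
    m.put(new Text("cf"), new Text("cq"), new Value("v".getBytes()));
    bw.addMutation(m);
    bw.close();  // flushes any remaining buffered mutations
  }
}

Running several such writers in separate threads or processes (or raising
setMaxWriteThreads on a single writer) is usually what "concurrent writers"
amounts to on the client side.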
> wal=false                              |   ~25K records/s
>
> Adam,
>
> It's one box so replication is off, good thought tnx.
>
> BTW - I've been playing around with ZFS compression vs Accumulo Snappy.
> What I've found was quite interesting. The idea was that with ZFS dedup
> and being in charge of compression I'd get a boost later on when blocks
> merge. What I've found is that after a while with ZFS LZ4 the CPU and
> disk all tail off, as though timeouts are elapsing somewhere, whereas
> Snappy maintains an average of ~20K+ records/s.

With this strategy the data will not be compressed when going between the
tserver and the datanode, or between the datanode and the OS.

> Anyway tnx, and if I get a chance I may try the 1.7 branch for the fix.

Nothing has been done in 1.7 for this issue yet.


> On Wednesday, 4 December 2013, 14:56, Adam Fuchs wrote:
>
> One thing you can do is reduce the replication factor for the WAL. We
> have found that makes a pretty significant difference in write
> performance. That can be modified with the tserver.wal.replication
> property. Setting it to 2 instead of the default (probably 3) should
> give you some performance improvement, of course at some cost to
> durability.
>
> Adam
>
>
> On Wed, Dec 4, 2013 at 5:14 AM, Peter Tillotson wrote:
>
> I've been trying to get the most out of streaming data into Accumulo 1.5
> (Hadoop Cloudera CDH4). Having tried a number of settings, rewriting
> client code, etc., I finally switched off the write-ahead log
> (table.walog.enabled=false) and saw a huge leap in ingest performance.
>
> Ingest with table.walog.enabled=true:   ~6 MB/s
> Ingest with table.walog.enabled=false:  ~28 MB/s
>
> That is a speed improvement of roughly x4.67.
>
> Now my use case could probably live without a WAL, or work around not
> having one, but I wondered if this was a known issue (I didn't see
> anything in JIRA). The WAL seems to be a significant rate limiter; this
> is either endemic to Accumulo or an HDFS / setup issue. Given that
> everything is in HDFS these days and IO otherwise flies, Accumulo's WAL
> looks like the most likely culprit.
>
> I don't believe this to be an IO issue on the box: with wal off there is
> significantly more IO (up to 80 MB/s reported by dstat); with wal on, up
> to 12 MB/s reported by dstat. Testing the box with FIO, sequential write
> is 160 MB/s.
>
> Further info:
> Hadoop 2.0.0 (Cloudera CDH4)
> Accumulo 1.5.0
> ZooKeeper (with Netty, minor improvement of <1 MB/s)
> Filesystem (HDFS on ZFS, compression=on, dedup=on; otherwise ext4)
>
> With large imports from scratch I now start off CPU bound, and as more
> shuffling is needed this becomes disk bound later in the import, as
> expected. So I know pre-splitting would probably sort it.
>
> Tnx
>
> P
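(The server-side settings mentioned in this thread can also be applied
programmatically. A hedged sketch follows, assuming an existing Connector
named conn and a table named "myTable", both placeholders; some tserver.*
properties may not be picked up at runtime and would instead go in
accumulo-site.xml followed by a tablet server restart, so treat the
instanceOperations() calls as illustrative only.)

import java.util.SortedSet;
import java.util.TreeSet;

import org.apache.accumulo.core.client.Connector;
import org.apache.hadoop.io.Text;

public class TuningSketch {
  static void applyTuning(Connector conn) throws Exception {
    String table = "myTable";  // placeholder table name

    // Disable the write-ahead log for this table: faster ingest, but data
    // buffered in memory is lost if a tablet server dies before a flush.
    conn.tableOperations().setProperty(table, "table.walog.enabled", "false");

    // Keep Snappy compression inside Accumulo even when the filesystem
    // compresses, so blocks stay compressed between the tserver and datanode.
    conn.tableOperations().setProperty(table, "table.file.compress.type", "snappy");

    // System-wide properties discussed above. If these are not picked up
    // dynamically in your version, set them in accumulo-site.xml and restart
    // the tablet servers instead.
    conn.instanceOperations().setProperty("tserver.wal.replication", "2");
    conn.instanceOperations().setProperty("tserver.mutation.queue.max", "4M");

    // Pre-split the table so ingest is spread across tablets from the start.
    SortedSet<Text> splits = new TreeSet<Text>();
    for (char c = 'a'; c <= 'z'; c++) {
      splits.add(new Text(String.valueOf(c)));
    }
    conn.tableOperations().addSplits(table, splits);
  }
}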