Subject: Re: batch_size_warn_threshold_in_kb
From: Ryan Svihla <rsvihla@datastax.com>
To: user@cassandra.apache.org
Date: Sat, 13 Dec 2014 08:12:15 -0600

Are batches to the same partition key (which results in a single mutation,
and obviously eliminates the primary problem)? Is your client network
and/or CPU bound?
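For concreteness, a same-partition unlogged batch with the DataStax Java
Driver looks roughly like the sketch below. The keyspace, table, and column
types are illustrative assumptions (the (aid, bckt) partition key is
borrowed from Eric's tests later in this thread), not code from anyone's
actual setup:

    import com.datastax.driver.core.{BatchStatement, Cluster}

    object SamePartitionBatch extends App {
      // Illustrative connection details; not from this thread.
      val cluster = Cluster.builder().addContactPoint("127.0.0.1").build()
      val session = cluster.connect("demo_ks")

      val insert = session.prepare(
        "INSERT INTO events (aid, bckt, end, proto) VALUES (?, ?, ?, ?)")

      // Every statement shares the partition key (aid, bckt), so the
      // coordinator can apply the whole batch as a single mutation.
      val batch = new BatchStatement(BatchStatement.Type.UNLOGGED)
      (1 to 100).foreach { end =>
        batch.add(insert.bind("a1", Int.box(1), Int.box(end), "p1"))
      }
      session.execute(batch)
      cluster.close()
    }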
Remember, the coordinator node is _just_ doing what your client is doing
with executeAsync, only now it's dealing with the heap pressure of
compaction and flush writers, while your client is busy writing code.

Not trying to be argumentative, but I talk to the driver writers almost
daily, and I've moved a lot of customers off batches, and every single one
of them sped things up substantially. That experience, plus the theory,
leads me to believe there is a bottleneck on your client.

Final point: the more you grow your cluster, the more the cost of losing
token awareness for all writes in the batch grows.

On Sat, Dec 13, 2014 at 7:32 AM, Eric Stevens <mightye@gmail.com> wrote:

> Jon,
>
> > The really important thing to really take away from Ryan's original
> > post is that batches are not there for performance.
> > tl;dr: you probably don't want batch, you most likely want many async
> > calls
>
> My own rudimentary testing does not bear this out - at least not if you
> mean to say that batches don't offer a performance advantage (vs. this
> just being a happy side effect). Unlogged batches provide a substantial
> improvement on performance for burst writes in my findings.
>
> My test setup:
>
> - Amazon i2.8xl instances in 3 AZs using EC2Snitch
> - Cluster size of 3, RF=3
> - DataStax Java Driver, with token-aware routing, using Prepared
>   Statements vs. Unlogged Batches of Prepared Statements
> - Test client on a separate machine in the same AZ as one of the server
>   nodes
> - Data size: 50,000 records
> - Test runs: 25 (unique data generated before each run)
> - Data written to 5 tables, one table at a time (all 50,000 records go
>   to each table)
> - Timing begins when the first record is written to a table and ends
>   when the last async call completes for that table. Timing is measured
>   independently for each strategy, table, and run.
> - To eliminate bias, order between tables is randomized on each run, and
>   order between single vs. batched execution is randomized on each run.
> - Asynchronicity is tested using three different typical Scala
>   parallelism strategies (a sketch of these appears below):
>   - "traverse" = Futures.traverse(statements).map(_.executeAsync()) -
>     let the Futures system schedule the parallelism it thinks is
>     appropriate
>   - "scatter" = Futures.sequence(statements.map(_.executeAsync())) -
>     create as many async calls as possible at a time, then let the
>     Futures system gather together the results
>   - "parallel" = statements.par.map(_.execute()) - using a parallel
>     collection to initiate as many blocking calls as possible within
>     the default thread pool
> - I kept an eye on compaction throughout, and we never went above 2
>   pending compaction tasks
>
> I know this test is fairly contrived, but it's difficult to dismiss
> throughput differences of this magnitude over several million data
> points. Times are in nanos.
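To make those three strategies concrete, here is a minimal Scala sketch of
how they might map onto the DataStax Java Driver. The bridge from the
driver's Guava ListenableFuture to a Scala Future and the session/statement
names are illustrative assumptions; Eric's actual harness is not shown in
the thread:

    import com.datastax.driver.core.{BoundStatement, ResultSet, Session}
    import com.google.common.util.concurrent.{FutureCallback, Futures, ListenableFuture}
    import scala.concurrent.ExecutionContext.Implicits.global
    import scala.concurrent.{Future, Promise}

    object AsyncStrategies {
      // Bridge the driver's Guava ListenableFuture into a Scala Future.
      def toScala[T](lf: ListenableFuture[T]): Future[T] = {
        val p = Promise[T]()
        Futures.addCallback(lf, new FutureCallback[T] {
          def onSuccess(result: T): Unit = p.success(result)
          def onFailure(t: Throwable): Unit = p.failure(t)
        })
        p.future
      }

      // "scatter": fire every executeAsync up front, then gather results.
      def scatter(session: Session, stmts: Seq[BoundStatement]): Future[Seq[ResultSet]] =
        Future.sequence(stmts.map(s => toScala(session.executeAsync(s))))

      // "traverse": let Future.traverse schedule the async calls.
      def traverse(session: Session, stmts: Seq[BoundStatement]): Future[Seq[ResultSet]] =
        Future.traverse(stmts)(s => toScala(session.executeAsync(s)))

      // "parallel": blocking execute() calls on a parallel collection,
      // bounded by the default fork/join pool.
      def parallel(session: Session, stmts: Seq[BoundStatement]): Seq[ResultSet] =
        stmts.par.map(session.execute(_)).seq
    }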
>
> ==== Execution Results for 25 runs of 50000 records =============
> 25 runs of 50,000 records (3 protos, 5 agents, ~15 per bucket) as single
> statements using strategy scatter
> Total Run Time
>         test3 ((aid, bckt), end, proto) reverse order        = 51,391,100,107
>         test1 ((aid, bckt), proto, end) reverse order        = 52,206,907,605
>         test4 ((aid, bckt), proto, end) no explicit ordering = 53,903,886,095
>         test2 ((aid, bckt), end)                             = 54,613,620,320
>         test5 ((aid, bckt, end))                             = 55,820,739,557
>
> ==== Execution Results for 25 runs of 50000 records =============
> 25 runs of 50,000 records (3 protos, 5 agents, ~15 per bucket) in batches
> of 100 using strategy scatter
> Total Run Time
>         test3 ((aid, bckt), end, proto) reverse order        = 9,199,579,182
>         test4 ((aid, bckt), proto, end) no explicit ordering = 11,661,638,491
>         test2 ((aid, bckt), end)                             = 12,059,853,548
>         test1 ((aid, bckt), proto, end) reverse order        = 12,957,113,345
>         test5 ((aid, bckt, end))                             = 31,166,071,275
>
> ==== Execution Results for 25 runs of 50000 records =============
> 25 runs of 50,000 records (3 protos, 5 agents, ~15 per bucket) as single
> statements using strategy traverse
> Total Run Time
>         test1 ((aid, bckt), proto, end) reverse order        = 52,368,815,408
>         test2 ((aid, bckt), end)                             = 52,676,830,110
>         test4 ((aid, bckt), proto, end) no explicit ordering = 54,096,838,258
>         test5 ((aid, bckt, end))                             = 54,657,464,976
>         test3 ((aid, bckt), end, proto) reverse order        = 55,668,202,827
>
> ==== Execution Results for 25 runs of 50000 records =============
> 25 runs of 50,000 records (3 protos, 5 agents, ~15 per bucket) in batches
> of 100 using strategy traverse
> Total Run Time
>         test3 ((aid, bckt), end, proto) reverse order        = 9,633,141,094
>         test4 ((aid, bckt), proto, end) no explicit ordering = 12,519,381,544
>         test2 ((aid, bckt), end)                             = 12,653,843,637
>         test1 ((aid, bckt), proto, end) reverse order        = 17,644,182,274
>         test5 ((aid, bckt, end))                             = 27,902,501,534
>
> ==== Execution Results for 25 runs of 50000 records =============
> 25 runs of 50,000 records (3 protos, 5 agents, ~15 per bucket) as single
> statements using strategy parallel
> Total Run Time
>         test1 ((aid, bckt), proto, end) reverse order        = 360,523,086,443
>         test3 ((aid, bckt), end, proto) reverse order        = 364,375,212,413
>         test4 ((aid, bckt), proto, end) no explicit ordering = 370,989,615,452
>         test2 ((aid, bckt), end)                             = 378,368,728,469
>         test5 ((aid, bckt, end))                             = 380,737,675,612
>
> ==== Execution Results for 25 runs of 50000 records =============
> 25 runs of 50,000 records (3 protos, 5 agents, ~15 per bucket) in batches
> of 100 using strategy parallel
> Total Run Time
>         test3 ((aid, bckt), end, proto) reverse order        = 20,971,045,814
>         test1 ((aid, bckt), proto, end) reverse order        = 21,379,583,690
>         test4 ((aid, bckt), proto, end) no explicit ordering = 21,505,965,087
>         test2 ((aid, bckt), end)                             = 24,433,580,144
>         test5 ((aid, bckt, end))                             = 37,346,062,553
>
>
> On Fri Dec 12 2014 at 11:00:12 AM Jonathan Haddad <jon@jonhaddad.com>
> wrote:
>
>> The really important thing to really take away from Ryan's original post
>> is that batches are not there for performance.
The only case I consider
>> batches to be useful for is when you absolutely need to know that several
>> tables all get a mutation (via logged batches). The use case for this is
>> when you've got multiple tables that are serving as different views for
>> the data. It is absolutely not going to help you if you're trying to lump
>> queries together to reduce network & server overhead - in fact it'll do
>> the opposite. If you're trying to do that, instead perform many async
>> queries. The overhead of batches in Cassandra is significant and you're
>> going to hit a lot of problems if you use them excessively (timeouts /
>> failures).
>>
>> tl;dr: you probably don't want batch, you most likely want many async
>> calls
>>
>>
>> On Thu Dec 11 2014 at 11:15:00 PM Mohammed Guller <mohammed@glassbeam.com>
>> wrote:
>>
>>> Ryan,
>>>
>>> Thanks for the quick response.
>>>
>>> I did see that jira before posting my question on this list. However, I
>>> didn't see any information about why 5kb+ data will cause instability.
>>> 5kb or even 50kb seems too small. For example, if each mutation is
>>> 1000+ bytes, then with just 5 mutations, you will hit that threshold.
>>>
>>> In addition, Patrick is saying that he does not recommend more than 100
>>> mutations per batch. So why not warn users just on the # of mutations
>>> in a batch?
>>>
>>> Mohammed
>>>
>>> *From:* Ryan Svihla [mailto:rsvihla@datastax.com]
>>> *Sent:* Thursday, December 11, 2014 12:56 PM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: batch_size_warn_threshold_in_kb
>>>
>>> Nothing magic, just put in there based on experience. You can find the
>>> story behind the original recommendation here:
>>>
>>> https://issues.apache.org/jira/browse/CASSANDRA-6487
>>>
>>> Key reasoning for the desire comes from Patrick McFadin:
>>>
>>> "Yes that was in bytes. Just in my own experience, I don't recommend
>>> more than ~100 mutations per batch. Doing some quick math I came up
>>> with 5k as 100 x 50 byte mutations.
>>>
>>> Totally up for debate."
>>>
>>> It's totally changeable; however, it's there in no small part because
>>> so many people confuse the BATCH keyword as a performance optimization.
>>> This helps flag those cases of misuse.
>>>
>>> On Thu, Dec 11, 2014 at 2:43 PM, Mohammed Guller <mohammed@glassbeam.com>
>>> wrote:
>>>
>>> Hi –
>>>
>>> The cassandra.yaml file has a property called
>>> *batch_size_warn_threshold_in_kb*.
>>>
>>> The default size is 5kb and, according to the comments in the yaml
>>> file, it is used to log a WARN on any batch size exceeding this value
>>> in kilobytes. It says caution should be taken on increasing the size
>>> of this threshold as it can lead to node instability.
>>>
>>> Does anybody know the significance of this magic number 5kb? Why would
>>> a higher number (say 10kb) lead to node instability?
>>>
>>> Mohammed
>>>
>>> --
>>> Ryan Svihla
>>> Solution Architect, DataStax

--
Ryan Svihla
Solution Architect
DataStax

DataStax is the fastest, most scalable distributed database technology,
delivering Apache Cassandra to the world's most innovative enterprises.
DataStax is built to be agile, always-on, and predictably scalable to any
size. With more than 500 customers in 45 countries, DataStax is the
database technology and transactional backbone of choice for the world's
most innovative companies such as Netflix, Adobe, Intuit, and eBay.
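For anyone who finds this thread later, the setting under discussion lives
in cassandra.yaml; a minimal excerpt is shown below with the 5 KB default
the thread refers to (the comment wording is a paraphrase of the discussion
above, not the file's exact text):

    # Log a WARN on any batch exceeding this size in kilobytes.
    # Raising it is possible, but it exists largely to flag misuse of
    # BATCH as a performance optimization, which can destabilize nodes.
    batch_size_warn_threshold_in_kb: 5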