From: Yogi Nerella
To: user@cassandra.apache.org
Subject: Re: Performance problem with large wide row inserts using CQL
Date: Wed, 19 Feb 2014 14:58:30 -0800

Rüdiger,

I have tried CQL only, and it was failing after 127 records were added. I have to check what is wrong.
I have the keyspace and table definition exactly as you do.

I am new to Scala, so I do not know how to do this. I may try this in the evening.

Yogi

On Wed, Feb 19, 2014 at 2:50 PM, Rüdiger Klaehn <rklaehn@gmail.com> wrote:
This must be something to do with server-side validation. If I define the table like this it does not happen:

cqlsh> CREATE KEYSPACE IF NOT EXISTS test1 WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
cqlsh> use test1;
cqlsh:test1> create TABLE employees2 (time blob, name blob, value blob, PRIMARY KEY (name, value)) WITH COMPACT STORAGE;

I should really update the AstClient benchmark so that it creates the table itself. But as I said, I am not very familiar with Astyanax (or with Cassandra in general, for that matter). It seems that I just happened to hit an issue the first time I was playing with it...

cheers,

Rüdiger

p.s. Did you also have a large performance difference between the Thrift and the CQL?

On Wed, Feb 19, 2014 at 10:57 PM, Yogi Nerella <ynerella999@gmail.com> wrote:
I have a two-node cluster. I tried with both 2.0.4 and 2.0.5.
I tried your code, and right after inserting 127 rows, the next insert fails.

10.566482102276002 123
2.7760618708015863 124
8.936212688296054 125
9.532923906962095 126
7.5081516753554505 127
java.lang.RuntimeException: failed to write data to C*
        at demo.AstClient.insert(AstClient.java:75)
        at demo.AstClient.loadData(AstClient.java:112)
        at demo.AstClient.main(AstClient.java:127)
Caused by: com.netflix.astyanax.connectionpool.exceptions.BadRequestException: BadRequestException: [host=127.0.0.1(127.0.0.1):9160, latency=22(22), attempts=1]InvalidRequestException(why:String didn't validate.)
        at com.netflix.astyanax.thrift.ThriftConverter.ToConnectionPoolException(ThriftConverter.java:159)
        at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:61)
        at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:28)
        at com.netflix.astyanax.thrift.ThriftSyncConnectionFactoryImpl$ThriftConnection.execute(ThriftSyncConnectionFactoryImpl.java:151)
        at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:69)
        at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:256)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.executeOperation(ThriftKeyspaceImpl.java:478)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl.access$000(ThriftKeyspaceImpl.java:73)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1.execute(ThriftKeyspaceImpl.java:116)
        at demo.AstClient.insert(AstClient.java:72)
        ... 2 more
Caused by: InvalidRequestException(why:String didn't validate.)
        at org.apache.cassandra.thrift.Cassandra$batch_mutate_result.read(Cassandra.java:20833)
        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
        at org.apache.cassandra.thrift.Cassandra$Client.recv_batch_mutate(Cassandra.java:964)
        at org.apache.cassandra.thrift.Cassandra$Client.batch_mutate(Cassandra.java:950)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:122)
        at com.netflix.astyanax.thrift.ThriftKeyspaceImpl$1$1.internalExecute(ThriftKeyspaceImpl.java:119)
        at com.netflix.astyanax.thrift.AbstractOperationImpl.execute(AbstractOperationImpl.java:56)
        ... 10 more
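
One possible reading of the failure right after row 127, offered as an assumption rather than something the thread confirms: if the benchmark writes the row counter as a single raw byte, and the column is validated server-side as UTF-8 text, then 128 (0x80) is the first value that is not well-formed UTF-8 on its own. The sketch below only demonstrates that UTF-8 boundary in plain Java; the class and method names are hypothetical, and it does not touch Cassandra or Astyanax.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: illustrates the UTF-8 boundary only,
// not the actual validation code path inside Cassandra.
final class Utf8BoundaryDemo {

    /** Returns true if the given bytes are well-formed UTF-8. */
    public static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    // Report errors instead of silently substituting U+FFFD.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /**
     * Returns the smallest value in [0, 255] whose single-byte encoding
     * is rejected as UTF-8, or -1 if every value is accepted.
     * Assumption: the counter is written as exactly one raw byte.
     */
    public static int firstRejectedSingleByte() {
        for (int i = 0; i < 256; i++) {
            if (!isValidUtf8(new byte[] {(byte) i})) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Values 0..127 are plain ASCII; 0x80 is a lone continuation byte.
        System.out.println("first rejected value: " + firstRejectedSingleByte());
    }
}
```

If that reading is right, it would be consistent with the error disappearing once the columns are declared as blob, as in the employees2 definition quoted earlier, since blob values are not checked as strings.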



On Wed, Feb 19, 2014 at 12:38 PM, Rüdiger Klaehn <rklaehn@gmail.com> wrote:
On Wed, Feb 19, 2014 at 7:49 PM, Sylvain Lebresne <sylvain@datastax.com> wrote:
On Wed, Feb 19, 2014 at 11:27 AM, Rüdiger Klaehn <rklaehn@gmail.com> wrote:
Am I doing something wrong, or is this a fundamental limitation of CQL?

Neither. I believe you are running into https://issues.apache.org/jira/browse/CASSANDRA-6737, which is a performance bug that we should and will fix. So thanks for the report. If you could give the patch on that issue a shot and check whether it helps, that would definitely be much appreciated.

Hi Sylvain,
Yes, this issue looks almost identical to the problem I have been experiencing. Great to hear that you are aware of the issue and that there is a fix.

I have cloned the cassandra repo, applied the patch, and built it. But when I try to run the benchmark I get an exception. See below. I tried with a non-managed dependency on cassandra-driver-core-2.0.0-rc3-SNAPSHOT-jar-with-dependencies.jar, which I compiled from source because I read that that might help. But that did not make a difference.

So currently I don't know how to give the patch a try. Any ideas?

cheers,

Rüdiger

Exception in thread "main" java.lang.IllegalArgumentException: replicate_on_write is not a column defined in this metadata
    at com.datastax.driver.core.ColumnDefinitions.getAllIdx(ColumnDefinitions.java:273)
    at com.datastax.driver.core.ColumnDefinitions.getFirstIdx(ColumnDefinitions.java:279)
    at com.datastax.driver.core.Row.getBool(Row.java:117)
    at com.datastax.driver.core.TableMetadata$Options.<init>(TableMetadata.java:474)
    at com.datastax.driver.core.TableMetadata.build(TableMetadata.java:107)
    at com.datastax.driver.core.Metadata.buildTableMetadata(Metadata.java:128)
    at com.datastax.driver.core.Metadata.rebuildSchema(Metadata.java:89)
    at com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:259)
    at com.datastax.driver.core.ControlConnection.tryConnect(ControlConnection.java:214)
    at com.datastax.driver.core.ControlConnection.reconnectInternal(ControlConnection.java:161)
    at com.datastax.driver.core.ControlConnection.connect(ControlConnection.java:77)
    at com.datastax.driver.core.Cluster$Manager.init(Cluster.java:890)
    at com.datastax.driver.core.Cluster$Manager.newSession(Cluster.java:910)
    at com.datastax.driver.core.Cluster$Manager.access$200(Cluster.java:806)
    at com.datastax.driver.core.Cluster.connect(Cluster.java:158)
    at cassandra.CassandraTestMinimized$delayedInit$body.apply(CassandraTestMinimized.scala:31)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.App$$anonfun$main$1.apply(App.scala:71)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
    at scala.App$class.main(App.scala:71)
    at cassandra.CassandraTestMinimized$.main(CassandraTestMinimized.scala:5)
    at cassandra.CassandraTestMinimized.main(CassandraTestMinimized.scala)

There is absolutely no fundamental reason why a CQL operation would be more than 10 times slower than its Thrift equivalent; such a dramatic difference is indicative of a bug, something obviously wrong.

--
Sylvain



