From: Robert Stupp
To: user@cassandra.apache.org
Subject: Re: Could ring cache really improve performance in Cassandra?
Date: Mon, 8 Dec 2014 12:59:13 +0100

> So the native protocol is an asynchronous protocol?
Yes.

> I have tried using the stress test tool. But it seems that this tool should run on the same node as one of the Cassandra nodes (or at least on a node that has Cassandra installed)? When I try to run this tool on a separate client instance, exceptions are thrown.

You should start with the "new" kind of stress testing (using CQL3, the native protocol, and prepared statements). Forget about Thrift ;)
Start with the example YAML stress file first to learn about it. It allows you to configure simultaneous writes and reads that match your workload.
And you do not need to run it on a C* node, but you should think about the network between the stress test tool and your cluster.

> The ring cache I found is here: <https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/client/RingCache.java>. I am trying to implement similar functionality in C++. My repo is here: <https://github.com/kongjialin/Cassandra>. My idea is that all requests go to the client-side "ring cache" and are sent to the target Cassandra node (each node is associated with a client pool) to avoid routing between nodes in the cluster.

You can save yourself a lot of work implementing it "right": just use the C++ driver. It knows about the native protocol and routes requests to the correct nodes. You can also go into the C++ driver code, look at how it works, improve it, etc. :)
I don't know anything about the C++ driver, but feel free to post to the driver mailing list and/or the #datastax-drivers IRC channel.



2014-12-08 16:42 GMT+08:00 Robert Stupp <snazy@snazy.de>:
cassandra-stress is a great tool to check whether the sizing of your cluster, in combination with your data model, will fit your production needs, i.e. without the application :) Removing the application removes any possible application bugs from the load test. Sure, it's a necessary step to test with your application, but I'd recommend starting with the stress test tool first.

Thrift is a deprecated API. I strongly recommend using the C++ driver (I'm pretty sure it supports the native protocol). The native protocol achieves approx. twice the performance of Thrift with far fewer TCP connections. (Thrift is RPC, which means connections usually waste system, application and server resources while waiting for something. The native protocol is a multiplexed protocol.) As Jonathan already said, all development effort is spent on CQL3 and the native protocol; Thrift is just "supported".
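
To make the RPC versus multiplexed distinction concrete, here is a toy Python sketch. It is not the real native protocol and not any driver's API: the two queues stand in for a single TCP connection, and `toy_server`, `Connection` and `demo` are hypothetical names made up for illustration. Each request carries a stream id, and the reader matches responses to waiting callers by that id, so a slow request does not hold up a fast one on the same connection.

```python
import asyncio


async def toy_server(inbox, outbox):
    """Hypothetical server: answers each frame after its own delay."""
    async def handle(stream_id, delay):
        await asyncio.sleep(delay)
        await outbox.put((stream_id, f"ok after {delay}s"))

    tasks = []
    while True:
        frame = await inbox.get()
        if frame is None:  # shutdown marker used only by this demo
            break
        tasks.append(asyncio.create_task(handle(*frame)))
    await asyncio.gather(*tasks)


class Connection:
    """One simulated connection carrying many in-flight requests."""

    def __init__(self):
        self.to_server = asyncio.Queue()
        self.from_server = asyncio.Queue()
        self._pending = {}  # stream id -> future awaited by the caller
        self._next_stream_id = 0

    async def request(self, delay):
        stream_id = self._next_stream_id
        self._next_stream_id += 1
        future = asyncio.get_running_loop().create_future()
        self._pending[stream_id] = future
        await self.to_server.put((stream_id, delay))
        return await future

    async def read_loop(self):
        # Responses may arrive in any order; the stream id says whose they are.
        while True:
            stream_id, body = await self.from_server.get()
            self._pending.pop(stream_id).set_result(body)


async def demo():
    conn = Connection()
    server = asyncio.create_task(toy_server(conn.to_server, conn.from_server))
    reader = asyncio.create_task(conn.read_loop())
    finished = []

    async def call(name, delay):
        await conn.request(delay)
        finished.append(name)

    # "slow" is sent first, but "fast" is free to complete before it.
    await asyncio.gather(call("slow", 0.1), call("fast", 0.01))
    await conn.to_server.put(None)
    await server
    reader.cancel()
    return finished


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

With a one-request-per-connection model, the same two calls would need two connections (or the fast call would wait behind the slow one), which is the resource waste described above.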

With CQL you can do everything that you can do with Thrift, plus more, new stuff.

I also recommend using prepared statements (they automagically work in a distributed cluster with the native protocol); they eliminate the effort of parsing the same CQL statement again and again.
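
As a rough illustration of why prepared statements save work, here is a small Python sketch. `ToyServer` and its methods are hypothetical stand-ins, not Cassandra or driver code, and the "parsing" is just a string split. The point is only the shape: an unprepared statement is parsed on every execution, while a prepared one is parsed once and then executed by id with bound values.

```python
import hashlib


class ToyServer:
    """Hypothetical server that counts how often it has to parse a statement."""

    def __init__(self):
        self.parse_count = 0
        self._prepared = {}  # statement id -> parsed form

    def _parse(self, cql):
        self.parse_count += 1
        return tuple(cql.split())  # stand-in for real CQL parsing

    def execute(self, cql):
        # Unprepared path: the full statement text is parsed every time.
        return self._parse(cql)

    def prepare(self, cql):
        # Parse once and hand back an id derived from the statement text.
        statement_id = hashlib.sha256(cql.encode("utf-8")).hexdigest()
        if statement_id not in self._prepared:
            self._prepared[statement_id] = self._parse(cql)
        return statement_id

    def execute_prepared(self, statement_id, values):
        # Prepared path: look up the parsed form, bind values, no parsing.
        return self._prepared[statement_id], tuple(values)


def run_unprepared(n):
    server = ToyServer()
    for i in range(n):
        server.execute(f"INSERT INTO t (id, v) VALUES ({i}, 'x')")
    return server.parse_count


def run_prepared(n):
    server = ToyServer()
    statement_id = server.prepare("INSERT INTO t (id, v) VALUES (?, ?)")
    for i in range(n):
        server.execute_prepared(statement_id, (i, "x"))
    return server.parse_count


if __name__ == "__main__":
    print(run_unprepared(1000), run_prepared(1000))
```

In this toy model the unprepared loop parses once per request and the prepared loop parses once in total.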


On 08.12.2014 at 09:26, 孔嘉林 <kongjialin92@gmail.com> wrote:

Thanks Jonathan. Actually, I'm wondering how CQL is implemented underneath: is it a different RPC mechanism? Why is it faster than Thrift? I know I'm wrong, but for now I just regard CQL as a query language. Could you please help explain this to me? I still feel puzzled after reading some docs about CQL. I create tables in CQL and use the cql3 API in Thrift. I don't know what else I can do with CQL. I am using C++ to write the client-side code. Currently I am not using the C++ driver and want to write some simple functionality myself.

Also, I didn't use the stress test tool provided in the Cassandra distribution because I also want to make sure that I can achieve the expected performance using my own client code. I know others have benchmarked Cassandra and got good results. But if I cannot reproduce satisfactory results, I cannot use it in my case.

I will create a repo and send a link later; I hope to get your kind help.

Thanks very much.

2014-12-08 14:28 GMT+08:00 Jonathan Haddad <jon@jonhaddad.com>:
I would really not recommend using Thrift for anything at this point, including your load tests. Take a look at CQL: all development is going there, and in 2.1 it has seen a massive performance boost over 2.0.

You may want to try the Cassandra stress tool included in 2.1; it can stress a table you've already built. That way you can rule out any bugs on the client side. If you're going to keep using your tool, however, it would be helpful if you sent out a link to the repo, since currently we have no way of knowing if you've got a client-side bug (data model or code) that's limiting your performance.


On Sun Dec 07 2014 at 7:55:16 PM 孔嘉林 <kongjialin92@gmail.com> wrote:
I found that under the src/client folder of the Cassandra 2.1.0 source code there is a RingCache.java file. It uses a Thrift client calling the describe_ring() API to get the token range of each Cassandra node. It is used on the client side. The client can combine it with the partitioner to find the target node. This way there is no need to route requests between Cassandra nodes, and the client can connect directly to the target node. So maybe it can save some routing time and improve performance.
Thank you very much.
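
The lookup described here (token ranges plus a partitioner give the target node) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in RingCache.java: `token_for` is a hypothetical hash, not Cassandra's real partitioner, the node addresses and tokens are invented, and replication is ignored. It assumes each node owns the range from the previous token (exclusive) up to its own token (inclusive), wrapping around at the end of the ring.

```python
import bisect
import hashlib


def token_for(key):
    """Hypothetical partitioner: hash a key to a signed 64-bit token."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class ToyRingCache:
    """Client-side map from token ranges to the node that owns them."""

    def __init__(self, ring):
        # ring: iterable of (end_token, node_address) pairs
        self._entries = sorted(ring)
        self._tokens = [token for token, _ in self._entries]

    def node_for_token(self, token):
        # First node whose end token is >= the given token owns it.
        index = bisect.bisect_left(self._tokens, token)
        if index == len(self._tokens):
            index = 0  # past the last token: wrap around to the first node
        return self._entries[index][1]

    def node_for_key(self, key):
        return self.node_for_token(token_for(key))


# Invented three-node ring with small tokens so the ranges are easy to read.
demo_ring = ToyRingCache([(-100, "node-a"), (0, "node-b"), (100, "node-c")])

if __name__ == "__main__":
    for token in (-100, -99, 0, 1, 100, 101):
        print(token, demo_ring.node_for_token(token))
```

A client would then pick a connection from the pool associated with the returned node instead of sending the request to an arbitrary coordinator.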

2014-12-08 1:28 GMT+08:00 Jonathan Haddad <jon@jonhaddad.com>:
What's a ring cache?

FYI, if you're using the DataStax CQL drivers, they will automatically route requests to the correct node.

On Sun Dec 07 2014 at 12:59:36 AM kong <kongjialin92@gmail.com> wrote:

Hi,

I'm doing a stress test on Cassandra. I have learned that using a ring cache can improve performance, because client requests can go directly to the target Cassandra server, so the coordinator node is also the desired target node. In this way, there is no need for the coordinator node to route client requests to the target node, and maybe we can get a linear performance increase.

However, in my stress test on an Amazon EC2 cluster, the test results are weird. It seems that there is no performance improvement after using the ring cache. Could anyone help me explain these results? (Also, I think the results of the test without the ring cache are weird, because there is no linear increase in QPS when new nodes are added. I need help explaining this, too.) The results are as follows:

 

INSERT (write):

Node count   Replication factor   QPS (no ring cache)   QPS (ring cache)
1            1                    18687                 20195
2            1                    20793                 26403
2            2                    22498                 21263
4            1                    28348                 30010
4            3                    28631                 24413

 

SELECT (read):

Node count   Replication factor   QPS (no ring cache)   QPS (ring cache)
1            1                    24498                 22802
2            1                    28219                 27030
2            2                    35383                 36674
4            1                    34648                 28347
4            3                    52932                 52590
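
To put numbers on "no linear increase", the figures above can be turned into ratios. The short Python sketch below uses only the posted QPS values; the helper names `scaling` and `ring_cache_ratio` are made up for this example. It reports each configuration's throughput relative to the single-node run, and the ring cache result relative to the run without it.

```python
# (node_count, replication_factor) -> (qps_without_ring_cache, qps_with_ring_cache)
WRITES = {
    (1, 1): (18687, 20195),
    (2, 1): (20793, 26403),
    (2, 2): (22498, 21263),
    (4, 1): (28348, 30010),
    (4, 3): (28631, 24413),
}
READS = {
    (1, 1): (24498, 22802),
    (2, 1): (28219, 27030),
    (2, 2): (35383, 36674),
    (4, 1): (34648, 28347),
    (4, 3): (52932, 52590),
}


def scaling(results, column=0):
    """Throughput of each configuration relative to the 1-node, RF 1 run."""
    baseline = results[(1, 1)][column]
    return {cfg: round(qps[column] / baseline, 2) for cfg, qps in results.items()}


def ring_cache_ratio(results):
    """QPS with the ring cache divided by QPS without it, per configuration."""
    return {cfg: round(with_rc / without, 2)
            for cfg, (without, with_rc) in results.items()}


if __name__ == "__main__":
    for name, results in (("writes", WRITES), ("reads", READS)):
        print(name, "scaling vs 1 node:", scaling(results))
        print(name, "ring cache / no ring cache:", ring_cache_ratio(results))
```

For example, four nodes at replication factor 1 reach about 1.52 times the single-node write QPS without the ring cache, where linear scaling would give 4 times.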

 

 

Thank you very much,

Joy





