cassandra-user mailing list archives

From: Ryan Svihla
Subject: Re: Drivers performance
Date: Fri, 19 Dec 2014 14:17:38 GMT
This is a better question for the java driver mailing list, but I see a number of
problems in your Datastax java driver code, and even without knowing how
Astyanax handles caching of prepared statements I can tell you:

   1. You're re-preparing a statement on _every_ iteration, and these are
   not cached by the driver. This is not only expensive, it is slower than
   just using non-prepared statements. This is a substantial slowdown.
   Drivers do not necessarily implement this the same way, so the code is
   not apples to apples. Change your code to prepare _once_ and I bet your
   numbers improve drastically.
   2. Your pooling options are CRAZY high, and I'm guessing you're running
   out of resources on the datastax driver. Again, the code is different, with
   different tradeoffs from Astyanax: a connection in thrift is not remotely
   the same as a connection in the modern remote protocol. Just use the
   default pooling options and I bet your numbers improve greatly (if not,
   there is something deeply off about your cluster and/or app servers).
   3. A lot of the speedup in the java driver is in the async support and
   how the native protocol handles async. Since you're doing synchronous
   requests, this is the best case for thrift performance; however, that still
   does not explain your gap (in most synchronous cases thrift is comparable
   at best, but usually not faster).
   4. I haven't been able to figure out which version of the Datastax
   driver you're on from looking at the code. This can change performance
   drastically, as there have been many improvements, especially for Cassandra
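
To illustrate point 1, here is a minimal sketch of the "prepare once, reuse
many times" pattern. It does not use the driver itself: StatementCache,
PrepareOnceSketch and the ks.tbl query are hypothetical names made up for this
example, and the lambda only stands in for a call such as
session.prepare(cql) so that the prepare count can be observed without a
cluster.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical helper: keeps one prepared statement per CQL string. */
final class StatementCache<S> {
    private final ConcurrentMap<String, S> cache = new ConcurrentHashMap<>();
    private final Function<String, S> preparer;

    StatementCache(Function<String, S> preparer) {
        this.preparer = preparer;
    }

    /** Returns the cached statement, calling the preparer on first use only. */
    S get(String cql) {
        return cache.computeIfAbsent(cql, preparer);
    }
}

final class PrepareOnceSketch {
    /** Runs `iterations` lookups through the cache and returns the number of prepare calls. */
    static int countPrepares(int iterations) {
        AtomicInteger prepares = new AtomicInteger();
        // Stand-in for session.prepare(cql): counts calls instead of contacting a cluster.
        StatementCache<String> cache = new StatementCache<>(cql -> {
            prepares.incrementAndGet();
            return "prepared:" + cql;
        });
        for (int i = 0; i < iterations; i++) {
            // Hypothetical query; in a benchmark the bind and execute would follow here.
            cache.get("SELECT * FROM ks.tbl WHERE key = ?");
        }
        return prepares.get();
    }

    public static void main(String[] args) {
        System.out.println("prepare calls: " + countPrepares(10_000));
    }
}
```

For a single fixed query, simply calling prepare before the loop and keeping
the result in a local variable or field achieves the same thing; the cache is
only useful when several different statements are involved.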

I suggest you reply to the java driver mailing list for more in depth
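
As a rough illustration of point 3, the sketch below keeps a bounded number
of requests in flight instead of waiting for each one before sending the
next. It is plain JDK code under stated assumptions: AsyncReadsSketch and
runAll are hypothetical names, and the CompletableFuture in main is only a
stand-in for an asynchronous driver call such as session.executeAsync(...),
whose future type would need adapting.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Semaphore;
import java.util.function.IntFunction;

final class AsyncReadsSketch {
    /**
     * Issues `count` asynchronous requests, keeping at most `maxInFlight`
     * outstanding, and returns the results in request order.
     */
    static <T> List<T> runAll(int count, int maxInFlight,
                              IntFunction<CompletableFuture<T>> request) {
        Semaphore permits = new Semaphore(maxInFlight);
        List<CompletableFuture<T>> futures = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            permits.acquireUninterruptibly();          // wait while too many requests are pending
            CompletableFuture<T> f = request.apply(i);
            f.whenComplete((result, error) -> permits.release());
            futures.add(f);
        }
        List<T> results = new ArrayList<>(count);
        for (CompletableFuture<T> f : futures) {
            results.add(f.join());                     // a failed request surfaces as CompletionException
        }
        return results;
    }

    public static void main(String[] args) {
        // Stand-in for an async read; a real benchmark would call the driver here.
        List<String> rows = runAll(10_000, 128,
                i -> CompletableFuture.supplyAsync(() -> "row-" + i));
        System.out.println("rows read: " + rows.size());
    }
}
```

The limit of 128 in flight is an arbitrary value chosen for the example, not
a recommendation.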

On Fri, Dec 19, 2014 at 7:26 AM, Svec, Michal wrote:

>  Hello,
> I am in the middle of evaluating whether we should switch from Astyanax to
> the datastax driver. I wrote a simple benchmark that loads the same row by
> key 10 000 times, and I was surprised by the slowness of the datastax
> driver. I uploaded it to github.
> It was tested against Cassandra 1.2 and 2.1. Testing conditions were naive
> (localhost, single node, …) but still the difference is huge.
> 10 000 iterations:
> · Astyanax: 2734 ms
> · Astyanax prepared: 1997 ms
> · Datastax: 10230 ms
> Is it really so slow, or am I missing something?
> Thank you for any advice.
> Michal


Ryan Svihla

Solution Architect
