cassandra-commits mailing list archives

From "Eric Evans (Commented) (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-3634) compare string vs. binary prepared statement parameters
Date Thu, 12 Jan 2012 02:43:39 GMT


Eric Evans commented on CASSANDRA-3634:

My reasoning is, there aren't a whole lot of places left to pick up an extra 10% performance...
Two years ago, or one, maybe 10% wasn't such a big deal since there was so much left to optimize.
That's no longer the case; I don't think we should knowingly lock our next-gen interface into
a lower-performing design. Once made, we're stuck with this decision, or at least with a really,
really high barrier to change it.

I think a custom protocol (planned for reasons unrelated to performance) could easily be worth
10%.  I take your point though, there isn't a lot of low hanging fruit left.

On the other hand, we have the downside of extra complexity for the driver authors. While
this is a valid point, it's a finite one – once a prepared statement api has been created
and debugged, binary vs strings isn't going to matter. It's a one-time fee in exchange for
better performance forever. Additionally, sample binary marshalling code already exists for
any language with a Thrift driver. So we're really talking about a relatively small amount
of work to build a binary-based PS api, over a String one.

I'm probably a little less optimistic about the amount of work or the potential for bugs.
A Pycassa bug that comes to mind caused integers to be mis-encoded for more than a year before
it was caught and fixed, and Pycassa is one of our most (_the_ most?) battle-tested libraries.
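
For illustration only (this sketch is not from the thread), here is roughly what the two
parameter styles ask of a driver for a single integer. The function names are hypothetical,
and the 8-byte big-endian signed layout is an assumption standing in for whatever the
binary protocol would actually specify.

```python
import struct


def encode_as_string(value: int) -> bytes:
    """String-style parameter: send decimal text and let the server parse it."""
    return str(value).encode("utf-8")


def encode_as_binary(value: int) -> bytes:
    """Binary-style parameter: the driver serializes the value itself.

    Assumes an 8-byte, big-endian, signed integer layout (an illustrative choice).
    """
    return struct.pack(">q", value)


def decode_binary(raw: bytes) -> int:
    """Inverse of encode_as_binary, as the receiving side would apply it."""
    return struct.unpack(">q", raw)[0]


if __name__ == "__main__":
    print(encode_as_string(-42))
    print(encode_as_binary(-42).hex())
    # A driver that picks the wrong byte order still produces 8 valid-looking
    # bytes, so the mistake only shows up as a wrong value after decoding.
    wrong = struct.pack("<q", 1)
    print(decode_binary(wrong))
```

The string form is hard to get wrong but costs a parse on every call; the binary form skips
the parse but moves byte order, width, and signedness into every driver.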

That said, I do understand all of your points.

Considering the _kind_ of trade-off we're talking about, I wanted this issue to be thoroughly
thought through/discussed, with any relevant data readily at hand.  The scale is obviously
quite different (I'm not citing a full swing of the pendulum here), but the arguments for/against
are basically the same ones that spawned CQL in the first place.  And, as you said, changing
later is prohibitively difficult; we're going to have to live with this decision.

I posted to client-dev@ earlier (I don't know why I didn't think of that a week ago).  They're
basically our front-line users in this regard, and I think it would be interesting to hear
from some of them (particularly if I'm carrying a mantle none of them care about :)).

> compare string vs. binary prepared statement parameters
> -------------------------------------------------------
>                 Key: CASSANDRA-3634
>                 URL:
>             Project: Cassandra
>          Issue Type: Sub-task
>          Components: API, Core
>            Reporter: Eric Evans
>            Assignee: Eric Evans
>            Priority: Minor
>              Labels: cql
>             Fix For: 1.1
> Perform benchmarks to compare the performance of string and pre-serialized binary parameters
> to prepared statements.
