incubator-cassandra-user mailing list archives

From Derek Williams <de...@fyrie.net>
Subject Re: hector or astyanax
Date Mon, 06 May 2013 19:03:42 GMT
Also have to keep in mind that it should be rare to only use a single
socket since you are usually making at least 1 connection per node in the
cluster (or local datacenter). There is also nothing enforcing that a
single client cannot open more than 1 connection to a node. In the end it
should come down to which protocol implementation is faster.
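
To make the request-id idea from this thread concrete, here is a minimal
Python sketch of multiplexing over one connection. It is only an
illustration under stated assumptions: MultiplexedConnection, request, and
_fake_server are hypothetical names, the "server" is simulated with asyncio
timers instead of a real socket, and none of this is the Cassandra binary
protocol or any driver's actual API. It shows one thing: because each
response carries the id of its request, a fast query can complete while a
slower one is still outstanding on the same connection.

```python
import asyncio
import itertools


class MultiplexedConnection:
    """Toy stand-in for ONE connection carrying many in-flight requests.

    Hypothetical class for illustration; not a real driver API.
    """

    def __init__(self):
        self._ids = itertools.count()   # source of request ids
        self._pending = {}              # request id -> Future for the response
        self._tasks = set()             # keep references to simulated server work

    async def request(self, query, server_delay):
        request_id = next(self._ids)
        future = asyncio.get_running_loop().create_future()
        self._pending[request_id] = future
        # "Send" the request; the simulated server replies after server_delay.
        task = asyncio.create_task(
            self._fake_server(request_id, query, server_delay)
        )
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return await future

    async def _fake_server(self, request_id, query, delay):
        # Assumption: the delay models the server-side work for this query.
        await asyncio.sleep(delay)
        self._on_response(request_id, f"result of {query}")

    def _on_response(self, request_id, payload):
        # The response is matched to its caller by id, not by arrival order.
        self._pending.pop(request_id).set_result(payload)


async def main():
    conn = MultiplexedConnection()
    completion_order = []

    async def run(query, delay):
        result = await conn.request(query, delay)
        completion_order.append(query)
        return result

    # Both requests are outstanding at once on the same "connection".
    results = await asyncio.gather(run("slow", 0.05), run("fast", 0.01))
    return results, completion_order


results, completion_order = asyncio.run(main())
print(results)           # results in the order the requests were issued
print(completion_order)  # order in which the responses actually arrived
```

This says nothing about raw throughput, which is the TCP question raised
further down; it only shows that a single connection does not force
responses to come back in request order.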


On Mon, May 6, 2013 at 11:58 AM, Aaron Turner <synfinatic@gmail.com> wrote:

> From my experience, your NIC buffers generally aren't the problem (or at
> least it's easy to tune them to fix).  It's TCP.  Simply put, your raw NIC
> throughput > single TCP socket throughput on most modern hardware/OS
> combinations.  This is especially true as latency increases between the two
> hosts.  This is why BitTorrent or "download accelerators" are often faster
> than just downloading a large file via your browser or ftp client: they're
> running multiple TCP connections in parallel compared to only one.
>
> TCP is great for reliable, bi-directional, stream based communication.
>  Not the best solution for high throughput though.  UDP is much better for
> that, but then you lose all the features that TCP gives you and so then
> people end up re-inventing the wheel (poorly I might add).
>
> So yeah, I think the answer to the question of "which is faster" is
> "it depends on your queries".
>
>
>
> On Mon, May 6, 2013 at 10:24 AM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
>
>> You have me thinking more.  I wonder in practice if 3 sockets is any
>> faster than 1 socket when doing nio.  If your buffer sizes were small,
>> maybe that would be the case.  Usually the nic buffers are big so when the
>> selector fires it is reading from 3 buffers for 3 sockets or 1 buffer for
>> one socket.  In both cases, all 3 requests are there in the buffers.  At
>> any rate, my belief is that you probably still get basically parallel
>> performance on one socket, though I have not tested my theory.  My theory
>> is that the real bottleneck on performance is the work cassandra has to do
>> on the reads and such.
>>
>> What about 20 sockets then (say someone has a pool)?  Will it be any
>> faster?  Not really sure, as in the end you are still held up by the real
>> bottleneck of reading from disk on the cassandra side.  We went to 20
>> threads in one case using 20 sockets with astyanax and saw no
>> performance improvement (the calls were synchronous, but more sockets did
>> not help).  I.e. it may be the case that 90% of the time, one socket is just
>> as fast as 10 or 20.  I would love to know the truth/answer to that though.
>>
>> Later,
>> Dean
>>
>>
>> From: Aaron Turner <synfinatic@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Monday, May 6, 2013 10:57 AM
>> To: cassandra users <user@cassandra.apache.org>
>> Subject: Re: hector or astyanax
>>
>> Just because you can batch queries or have the server process them out of
>> order doesn't make it fully "parallel".  You're still using a single TCP
>> connection which is by definition a serial data stream.  Basically, if you
>> send a bunch of queries which each return a large amount of data you've
>> effectively limited your query throughput to a single TCP connection.
>>  Using Thrift, each query result is returned in its own TCP stream in
>> *parallel*.
>>
>> Not saying the new API isn't great, doesn't have its place or may have
>> better performance in certain situations, but generally speaking I would
>> refrain from making general claims without actual benchmarks to back them
>> up.   I do completely agree that Async interfaces have their place and have
>> certain advantages over multi-threading models, but it's just another tool
>> to be used when appropriate.
>>
>> Just my .02. :)
>>
>>
>>
>> On Mon, May 6, 2013 at 5:08 AM, Hiller, Dean <Dean.Hiller@nrel.gov> wrote:
>> I was under the impression that it is multiple requests using a single
>> connection in PARALLEL, not serial, as they have request ids and the
>> responses do as well, so you can send a request while a previous request
>> has no response just yet.
>>
>> I think you do get a big speed advantage from the asynchronous nature as
>> you do not need to hold up so many threads in your webserver while you have
>> outstanding requests being processed.  The thrift async was not exactly
>> async in the way I suspect the new java driver is, though I have not
>> verified that (I hope it is).
>>
>> Dean
>>
>> From: Aaron Turner <synfinatic@gmail.com>
>> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
>> Date: Sunday, May 5, 2013 5:27 PM
>> To: cassandra users <user@cassandra.apache.org>
>> Subject: Re: hector or astyanax
>>
>>
>>
>> On Sun, May 5, 2013 at 1:09 PM, Derek Williams <derek@fyrie.net> wrote:
>> The binary protocol is able to multiplex multiple requests using a single
>> connection, which can lead to much better performance (similar to HTTP vs
>> SPDY). This is without comparing the performance of the thrift vs binary
>> protocol; I assume the binary protocol would be faster since it is
>> specialized for cassandra requests.
>>
>>
>> Curious why you think multiplexing multiple requests over a single
>> connection (serial) is faster than multiple requests over multiple
>> connections (parallel)?
>>
>> And isn't Thrift a binary protocol?
>>
>>
>> --
>> Aaron Turner
>> http://synfin.net/         Twitter: @synfinatic
>> http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix &
>> Windows
>> Those who would give up essential Liberty, to purchase a little temporary
>> Safety, deserve neither Liberty nor Safety.
>>     -- Benjamin Franklin
>> "carpe diem quam minimum credula postero"
>>
>>
>
>



-- 
Derek Williams
