cassandra-user mailing list archives

From Ben Bromhead <...@instaclustr.com>
Subject Re: High latencies for simple queries
Date Fri, 27 Mar 2015 23:13:40 GMT
One other thing to keep in mind / check: when you run these tests locally,
the Cassandra driver still connects over TCP through the network stack,
whereas Postgres supports local connections over a Unix domain socket (and
this is enabled by default).

Unix domain sockets are generally faster than TCP for local connections, as
there is no TCP/IP stack to traverse. I think any driver using libpq will
attempt to use the domain socket when connecting locally.
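
To get a rough feel for the size of that difference on a given machine, here
is a minimal sketch that times a tiny request/response over a Unix domain
socket and over TCP loopback, using only the Python standard library. It does
not touch Postgres or Cassandra; the helper names (echo_server,
mean_round_trip_ms) and the socket file name are made up for illustration,
and it assumes a platform with AF_UNIX support (Linux or macOS). With libpq
the equivalent switch is the host parameter: a value starting with a slash is
treated as a socket directory rather than a TCP host.

```python
import os
import socket
import tempfile
import threading
import time


def echo_server(listener):
    """Accept one connection and echo each message back until the client closes."""
    conn, _ = listener.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)


def mean_round_trip_ms(family, address, rounds=2000):
    """Return the mean round-trip time in milliseconds for a 4-byte ping."""
    listener = socket.socket(family, socket.SOCK_STREAM)
    listener.bind(address)
    listener.listen(1)
    server = threading.Thread(target=echo_server, args=(listener,), daemon=True)
    server.start()

    client = socket.socket(family, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    if family == socket.AF_INET:
        # Disable Nagle's algorithm so small writes go out immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

    start = time.perf_counter()
    for _ in range(rounds):
        client.sendall(b"ping")
        client.recv(64)
    elapsed = time.perf_counter() - start

    client.close()   # the server sees EOF and exits its loop
    server.join()
    listener.close()
    return elapsed / rounds * 1000.0


if __name__ == "__main__":
    sock_dir = tempfile.mkdtemp()
    unix_ms = mean_round_trip_ms(socket.AF_UNIX, os.path.join(sock_dir, "bench.sock"))
    tcp_ms = mean_round_trip_ms(socket.AF_INET, ("127.0.0.1", 0))
    print("unix domain socket: %.4f ms per round trip" % unix_ms)
    print("tcp loopback:       %.4f ms per round trip" % tcp_ms)
```

The absolute numbers depend heavily on the machine and kernel, so treat the
output as a relative comparison only, not as a prediction of database latency.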

That said, I'd hazard a guess that something else is going on with the
Cassandra connection, as I'm able to get 0.5 ms queries locally, and that's
even with tracing turned on.

Ben

On 27 March 2015 at 14:10, Laing, Michael <michael.laing@nytimes.com> wrote:

> Actually I am in the middle of setting up the same sort of thing for
> PostgreSQL using psycopg2 and pyev.
>
> I'll be using Cassandra and PostgreSQL in an IoT experiment as the backend
> for swarms of MQTT brokers at something in the 10-100M client range.
>
> ml
>
> On Fri, Mar 27, 2015 at 4:59 PM, Laing, Michael
> <michael.laing@nytimes.com> wrote:
>
>> I use callback chaining with the python driver and can confirm that it is
>> very fast.
>>
>> You can "chain the chains" together to perform sequential processing. I
>> do this when retrieving "metadata" and then the referenced "payload" for
>> example, when the metadata has been inverted and the payload is larger than
>> we want to invert. And you can be running multiple "chains of chains"
>> asynchronously - cascade state by employing the userdata of the future.
>>
>> We also multiprocess, for more parallelism, and we distribute work to
>> multiple multiprocessing instances using a message broker for yet more
>> parallel activity, as well as reliability.
>>
>> ml
>>
>> On Fri, Mar 27, 2015 at 4:28 PM, Tyler Hobbs <tyler@datastax.com> wrote:
>>
>>> Since you're executing queries sequentially, you may want to look into
>>> using callback chaining to avoid the cross-thread signaling that results in
>>> the 1ms latencies.  Basically, just use session.execute_async() and attach
>>> a callback to the returned future that will execute your next query.  The
>>> callback is executed on the event loop thread.  The main downsides to this
>>> are that you need to be careful to avoid blocking the event loop thread
>>> (including executing session.execute() or prepare()) and you need to ensure
>>> that all exceptions raised in the callback are handled by your application
>>> code.
>>>
>>> On Fri, Mar 27, 2015 at 3:11 PM, Artur Siekielski <artc@vhex.net> wrote:
>>>
>>>> I think that in your example Postgres spends most of its time waiting
>>>> for fsync() to complete. On Linux, for a battery-backed RAID controller,
>>>> it's safe to mount an ext4 filesystem with the "barrier=0" option, which
>>>> improves fsync() performance a lot. I have partitions mounted with this
>>>> option and I did a test from Python, using the psycopg2 driver, and I got
>>>> the following latencies, in milliseconds:
>>>> - INSERT without COMMIT: 0.04
>>>> - INSERT with COMMIT: 0.12
>>>> - SELECT: 0.05
>>>> I'm also repeating benchmark runs multiple times (I'm using Python's
>>>> "timeit" module).
>>>>
>>>>
>>>> On 03/27/2015 07:58 PM, Ben Bromhead wrote:
>>>>
>>>>> Latency can be so variable even when testing things locally. I quickly
>>>>> fired up postgres and did the following with psql:
>>>>>
>>>>> ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
>>>>> CREATE TABLE
>>>>> ben=# \timing
>>>>> Timing is on.
>>>>> ben=# INSERT INTO foo VALUES(2, 'yay');
>>>>> INSERT 0 1
>>>>> Time: 1.162 ms
>>>>> ben=# INSERT INTO foo VALUES(3, 'yay');
>>>>> INSERT 0 1
>>>>> Time: 1.108 ms
>>>>>
>>>>> I then fired up a local copy of Cassandra (2.0.12)
>>>>>
>>>>> cqlsh> CREATE KEYSPACE foo WITH replication = { 'class' :
>>>>> 'SimpleStrategy', 'replication_factor' : 1 };
>>>>> cqlsh> USE foo;
>>>>> cqlsh:foo> CREATE TABLE foo(i int PRIMARY KEY, j text);
>>>>> cqlsh:foo> TRACING ON;
>>>>> Now tracing requests.
>>>>> cqlsh:foo> INSERT INTO foo (i, j) VALUES (1, 'yay');
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Tyler Hobbs
>>> DataStax <http://datastax.com/>
>>>
>>
>>
>
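
For reference, the callback chaining Tyler describes in the quoted thread
could look roughly like the sketch below. QueryChain is a hypothetical helper,
not part of the driver; the only driver calls it relies on are
session.execute_async() and add_callbacks() on the returned future. It assumes
plain statement strings with no bind parameters and stops at the first error.
Because the callbacks run on the driver's event loop thread, every exception
is caught and stored, and nothing in them blocks.

```python
import threading


class QueryChain:
    """Run statements sequentially, issuing each one from the previous callback."""

    def __init__(self, session, statements):
        self.session = session
        self.pending = list(statements)
        self.results = []
        self.error = None
        self.done = threading.Event()  # set when the chain finishes or fails

    def start(self):
        self._next()
        return self

    def _next(self):
        if not self.pending:
            self.done.set()
            return
        statement = self.pending.pop(0)
        try:
            future = self.session.execute_async(statement)
            future.add_callbacks(self._on_success, self._on_error)
        except Exception as exc:
            self._on_error(exc)

    def _on_success(self, rows):
        # Runs on the event loop thread: never call session.execute() or
        # prepare() here, and never let an exception escape.
        try:
            self.results.append(rows)
            self._next()
        except Exception as exc:
            self._on_error(exc)

    def _on_error(self, exc):
        self.error = exc
        self.done.set()


# Usage against a real node (assumes the cassandra-driver package is installed
# and a node is listening on 127.0.0.1; keyspace "foo" is the one created in
# the cqlsh session quoted above):
#
#   from cassandra.cluster import Cluster
#   session = Cluster(["127.0.0.1"]).connect("foo")
#   chain = QueryChain(session, [
#       "INSERT INTO foo (i, j) VALUES (1, 'yay')",
#       "SELECT * FROM foo WHERE i = 1",
#   ]).start()
#   chain.done.wait()   # block the calling thread, not the event loop
#   print(chain.error or chain.results)
```

Only the first query is issued from the calling thread; each later one is
issued from the previous query's callback, which is what avoids the
cross-thread signaling per query.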


-- 

Ben Bromhead

Instaclustr | www.instaclustr.com | @instaclustr
<http://twitter.com/instaclustr> | (650) 284 9692
