incubator-cassandra-user mailing list archives

From Tyler Hobbs <ty...@datastax.com>
Subject Re: Recommended data size for Reads/Writes in Cassandra
Date Thu, 18 Jul 2013 23:26:01 GMT
Large writes can sometimes put a lot of heap/GC pressure on the node, which
can be an additional source of latency.  Use the query tracing in Cassandra
1.2+ to get a better picture of where the latency is.
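
As a rough illustration of the earlier suggestion to break large values into multiple columns, here is a minimal sketch in plain Python. The helper names (split_into_columns, join_columns), the 1 MB chunk size, and the "data_NNNNNN" column naming are all assumptions made for this example; they are not part of pycassa or Cassandra.

```python
def split_into_columns(data, chunk_size=1024 * 1024, prefix="data"):
    """Split a byte string into {column_name: chunk} pairs.

    Hypothetical helper: the naming scheme and the default chunk size
    are illustrative choices, not anything Cassandra requires.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    columns = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Zero-padded index so the column names sort in chunk order.
        columns["%s_%06d" % (prefix, index)] = data[offset:offset + chunk_size]
    return columns


def join_columns(columns):
    """Reassemble the original value from the chunked columns."""
    return b"".join(columns[name] for name in sorted(columns))
```

Each chunk could then be written with its own insert call (for example, one USER.insert('Ali', {name: chunk}) per column), so that no single request carries the whole value. Passing all chunks in one insert call would still send them as one large request.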


On Thu, Jul 18, 2013 at 6:18 PM, Mohammad Hajjat <hajjat@purdue.edu> wrote:

> Thanks Andrey and Tyler! That was useful :)
>
> Do you guys have any idea why the 10 MB writes took a lot of time in my
> case although I'm using Large VMs which have plenty of resources? Or do you
> think this latency is expected?
> I'm trying to see how much time is spent in the network versus processing
> CPU cycles of the nodes; any suggestion for a good profiling tool?
>
>
>
> On Thu, Jul 18, 2013 at 5:50 PM, Tyler Hobbs <tyler@datastax.com> wrote:
>
>> The default limit is 16 MB, but realistically you should try to keep
>> writes under 10 MB, breaking up large values into multiple columns/rows if
>> necessary.
>>
>>
>> On Thu, Jul 18, 2013 at 4:31 PM, Andrey Ilinykh <ailinykh@gmail.com> wrote:
>>
>>> There is a limit on the Thrift message size (thrift_max_message_length_in_mb);
>>> by default it is 64 MB, if I'm not mistaken. This is your limit.
>>>
>>>
>>> On Thu, Jul 18, 2013 at 2:03 PM, hajjat <hajjat@purdue.edu> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there a recommended data size for Reads/Writes in Cassandra? I tried
>>>> inserting 10 MB objects and the latency I got was pretty high. Also, I was
>>>> never able to insert larger objects (say 50 MB) since Cassandra kept
>>>> crashing when I tried that.
>>>>
>>>> Here is my experiment setup:
>>>> I used two Large VMs in EC2 within the same data-center. Inserts have ALL
>>>> consistency (strong consistency).  The latencies were as follows:
>>>> Data size:      10 MB           1 MB            100 Bytes
>>>> Latency:        250ms           50ms            8ms
>>>>
>>>> I've also done the same for two Large VMs across two data-centers. The
>>>> latencies were around:
>>>> Data size:      10 MB           1 MB            100 Bytes
>>>> Latency:        1200ms          800ms           80ms
>>>>
>>>> 1) Isn't the 10 MB latency extremely high?
>>>> 2) Is there a recommended data size to use with Cassandra (e.g., a few
>>>> bytes up to 1 MB)?
>>>> 3) Also, I tried inserting 50 MB data but Cassandra kept crashing. Does
>>>> anybody know why? I thought the max data size should be up to 2 GB?
>>>>
>>>> Thanks,
>>>> Mohammad
>>>>
>>>> PS. Here is my python code I use to insert into Cassandra. I put my
>>>> stopwatch timers around the insert statement:
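>>>>     import pycassa
>>>>     from pycassa.pool import ConnectionPool
>>>>     from pycassa.columnfamily import ColumnFamily
>>>>
>>>>     # TEST_FILE and keyspace are defined elsewhere in the script.
>>>>     fh = open(TEST_FILE, 'r')
>>>>     data = str(fh.read())
>>>>     fh.close()
>>>>
>>>>     POOL = ConnectionPool(keyspace, server_list=['localhost:9160'], timeout=None)
>>>>     USER = ColumnFamily(POOL, 'User')
>>>>     # Timed section: a single insert at consistency level ALL.
>>>>     USER.insert('Ali', {'data': data},
>>>>                 write_consistency_level=pycassa.cassandra.ttypes.ConsistencyLevel.ALL)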
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Recommended-data-size-for-Reads-Writes-in-Cassandra-tp7589141.html
>>>> Sent from the cassandra-user@incubator.apache.org mailing list archive
>>>> at Nabble.com.
>>>>
>>>
>>>
>>
>>
>> --
>> Tyler Hobbs
>> DataStax <http://datastax.com/>
>>
>
>
>
> --
> *Mohammad Hajjat*
> *Ph.D. Student*
> *Electrical and Computer Engineering*
> *Purdue University*
>



-- 
Tyler Hobbs
DataStax <http://datastax.com/>
