Thanks Andrey and Tyler! That was useful :)

Do you have any idea why the 10 MB writes took so long in my case, even though I'm using Large VMs with plenty of resources? Or do you think this latency is expected?
I'm trying to see how much time is spent on the network versus CPU processing on the nodes; any suggestions for a good profiling tool?



On Thu, Jul 18, 2013 at 5:50 PM, Tyler Hobbs <tyler@datastax.com> wrote:
The default limit is 16 MB, but realistically you should try to keep writes under 10 MB, breaking up large values into multiple columns/rows if necessary.
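
A minimal sketch of what breaking up a large value could look like on the client side. The helper names, the 1 MB chunk size, and the `chunk_NNNNNN` column-naming scheme are illustrative assumptions, not something Cassandra or pycassa prescribes:

```python
def split_into_columns(data, chunk_size=1024 * 1024, prefix="chunk_"):
    """Split a bytes blob into a dict mapping column name -> chunk.

    Hypothetical helper: the naming scheme and default chunk size are
    assumptions made for illustration.
    """
    columns = {}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Zero-padded index so that sorting the names restores chunk order.
        columns["%s%06d" % (prefix, index)] = data[offset:offset + chunk_size]
    return columns


def join_columns(columns):
    """Reassemble the original blob from the chunk columns."""
    return b"".join(columns[name] for name in sorted(columns))
```

Each chunk would then go out in its own insert call (for example, one `USER.insert('Ali', {name: chunk})` per column), so that no single Thrift message gets close to the limit; sending all chunks in one call would still produce one large message.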


On Thu, Jul 18, 2013 at 4:31 PM, Andrey Ilinykh <ailinykh@gmail.com> wrote:
There is a limit on the Thrift message size (thrift_max_message_length_in_mb); by default it is 64 MB, if I'm not mistaken. This is your limit.


On Thu, Jul 18, 2013 at 2:03 PM, hajjat <hajjat@purdue.edu> wrote:
Hi,

Is there a recommended data size for Reads/Writes in Cassandra? I tried
inserting 10 MB objects and the latency I got was pretty high. Also, I was
never able to insert larger objects (say 50 MB) since Cassandra kept
crashing when I tried that.

Here is my experiment setup:
I used two Large VMs in EC2 within the same data-center. Inserts use
consistency level ALL (strong consistency). The latencies were as follows:
Data size:      10 MB           1 MB            100 Bytes
Latency:        250ms           50ms            8ms

I've also done the same for two Large VMs across two data-centers. The
latencies were around:
Data size:      10 MB           1 MB            100 Bytes
Latency:        1200ms          800ms           80ms
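
As a rough way to read these numbers, the latencies can be converted into effective throughput. This is only a back-of-the-envelope sketch: the function name is hypothetical, and it ignores replication, protocol overhead, and fixed per-request costs:

```python
def throughput_mb_per_s(size_mb, latency_ms):
    """Effective throughput in MB/s for one write of size_mb taking latency_ms."""
    return size_mb / (latency_ms / 1000.0)


# Latencies reported above, keyed by object size in MB.
measurements = {
    "same data-center": {10: 250, 1: 50},
    "across data-centers": {10: 1200, 1: 800},
}

for setup, results in sorted(measurements.items()):
    for size_mb, latency_ms in sorted(results.items()):
        rate = throughput_mb_per_s(size_mb, latency_ms)
        print("%s: %d MB in %d ms = %.2f MB/s" % (setup, size_mb, latency_ms, rate))
```

By this measure the 10 MB write within one data-center moves about 40 MB/s, and the same write across data-centers about 8.3 MB/s.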

1) Isn't the 10 MB latency extremely high?
2) Is there a recommended data size to use with Cassandra (e.g., a few bytes
up to 1 MB)?
3) Also, I tried inserting 50 MB data but Cassandra kept crashing. Does
anybody know why? I thought the max data size should be up to 2 GB?

Thanks,
Mohammad

PS. Here is the Python code I use to insert into Cassandra. I put my
stopwatch timers around the insert statement:
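    import pycassa
    from pycassa.pool import ConnectionPool
    from pycassa.columnfamily import ColumnFamily

    # Read the whole test file into memory as raw bytes.
    with open(TEST_FILE, 'rb') as fh:
        data = fh.read()

    POOL = ConnectionPool(keyspace, server_list=['localhost:9160'], timeout=None)
    USER = ColumnFamily(POOL, 'User')
    # Write the blob as a single column at consistency level ALL.
    USER.insert('Ali', {'data': data},
                write_consistency_level=pycassa.cassandra.ttypes.ConsistencyLevel.ALL)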
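
The stopwatch around the insert could be sketched roughly as follows. The helper names `time_call` and `median` are hypothetical, and this measures only client-side wall-clock time, so it does not separate network time from server-side processing:

```python
from timeit import default_timer


def time_call(func, repeats=5):
    """Call func() `repeats` times; return each call's wall-clock time in ms."""
    durations_ms = []
    for _ in range(repeats):
        start = default_timer()
        func()
        durations_ms.append((default_timer() - start) * 1000.0)
    return durations_ms


def median(values):
    """Median of a non-empty list of numbers."""
    ordered = sorted(values)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2.0
```

Wrapping the insert in a zero-argument callable, e.g. `time_call(lambda: USER.insert('Ali', {'data': data}))`, and reporting the median over several runs would make a single slow outlier less likely to skew the result.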








--
Tyler Hobbs
DataStax



--
Mohammad Hajjat
Ph.D. Student
Electrical and Computer Engineering
Purdue University