Hi all

I'm working on a PHP/Cassandra application. Yesterday we ran into a strange situation when testing random reads. As background to the test, we inserted 10,000 rows with simple row keys. Each row has a random number of columns, between about 5 and 40. Insert speed via the PHP Thrift library was very fast.

Our random read script simply carries out a get_slice on a specific row key to fetch all the columns in the row (the slice count is 100, which is always enough to return everything).

The random read script either completes in a few ms, or it takes around 5s. The significance of 5s is that this is (temporarily) what we've set the socket read timeout to (TSocket::setRecvTimeout). There are a couple of interesting things to note:

1. Even when the socket seemingly times out, the operation succeeds, i.e. we do end up with a row returned.
2. The slow responses are *always* for rows with a larger number of columns; rows with up to around 9 columns are read quickly, and rows with more than that are slow
3. If we up the timeout to 10s, the reads take 10s instead of 5s, but still return successfully
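
One way to pin down the threshold in point 2 more precisely would be to record the column count and elapsed time of every read, then look at what fraction of reads at each column count ran into the timeout. Below is a minimal sketch in Python (not our actual PHP script; the function name and the 10% tolerance are made up for illustration, and the same timing could be taken in PHP with microtime(true)):

```python
from collections import defaultdict


def slow_fraction_by_columns(samples, timeout_s, tolerance=0.1):
    """Return {column_count: fraction of reads that ran into the timeout}.

    samples    -- iterable of (column_count, elapsed_seconds) pairs
    timeout_s  -- the socket receive timeout, in seconds
    tolerance  -- hypothetical margin: a read counts as timeout-bound if it
                  took at least (1 - tolerance) * timeout_s
    """
    threshold = timeout_s * (1.0 - tolerance)
    totals = defaultdict(int)
    slow = defaultdict(int)
    for column_count, elapsed in samples:
        totals[column_count] += 1
        if elapsed >= threshold:
            slow[column_count] += 1
    # Sorted by column count so the fast-to-slow transition is easy to spot.
    return {n: slow[n] / totals[n] for n in sorted(totals)}


# Made-up timings, only to show the call shape (not measured data).
example = [(5, 0.003), (9, 0.004), (10, 5.001), (14, 5.0)]
print(slow_fraction_by_columns(example, timeout_s=5.0))
```

If the fraction jumps from 0 to 1 at a single column count, that would suggest a hard size threshold rather than something intermittent.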

This is an example of data from a row that is read quickly (9 columns):

    col0: e0ce34010211030d91331feefc946c8c
    col1: 6d97b892ee7773d40b7bbff27ec5b34d
    col2: 6e2394dd48d5ca2df47eeceb72ca9de0
    col3: 43fca4716b865f24e30de67b2e10c1a8
    col4: c2e8de1541550e78829d312609acd237
    col5: e458447d8a2987bf05d65bee0a103be8
    col6: 0a8a86de4247b690e765aeca6615aef8
    col7: dc48b5e996da86b94d40d85292351c61
    col8: 3b95f9fc7c64d021ecc2c7a013f2e132
    key: 9bc905c5-fc62-58de-87f5-48eb1ebb4f03

This is an example of data from a row that is read slowly (14 columns):

    col0: 113b9cfe8eea8bf7eca71ce1ca1b0913
    col1: 428fe0bfadf687ef3b5c532e98e487ef
    col10: f1e80507626223358414130b1c7ecacd
    col11: cf7ada7ab098d2aeb9e5553808c89044
    col12: a93237313167c313d36d39779dcf23cd
    col13: 609595bbbb2b7058ad3f97f1ea0b7ebd
    col2: 27eca7dbff849eac82dc32e92b3fe977
    col3: 294dbf3107c351783a69450fedbefc61
    col4: 7fbd8f20d52731a10029e6f92874fae5
    col5: 3d06fc491c8f1669b144b798155578d4
    col6: 60d8f358cf07924912c8e19c60f45aac
    col7: 03297fd9576c1c96586bbbaaeaa1aa64
    col8: 6d6383fba84a6ec6811d96aee2b39102
    col9: c937171c3ad5d30b671d72741a777dc7
    key: 7318f337-529d-5408-a7c8-1283b750164d

Other points:

- we are reading with consistency level ONE
- we are using the PHP Thrift library directly
- we are using TBufferedTransport with buffer sizes of 1024, 1024

I have just discovered that changing the buffer sizes affects this issue: reducing the buffer size makes every request take 5s, while increasing the buffer size makes every request complete quickly.
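
To narrow down where the behaviour flips, the same row could be read repeatedly with a range of buffer sizes, timing each read. A rough sketch in Python follows; `read_row` is a hypothetical callable standing in for "open a connection with this buffer size and do the get_slice", and the 10% tolerance is an arbitrary choice:

```python
import time


def sweep_buffer_sizes(read_row, buffer_sizes, timeout_s, tolerance=0.1):
    """Time one read per buffer size.

    read_row      -- hypothetical callable taking a buffer size in bytes and
                     performing a single read with a transport of that size
    buffer_sizes  -- iterable of buffer sizes to try
    timeout_s     -- the socket receive timeout, in seconds

    Returns a list of (buffer_size, elapsed_seconds, hit_timeout) tuples,
    where hit_timeout is True if the read took roughly the full timeout.
    """
    threshold = timeout_s * (1.0 - tolerance)
    results = []
    for size in buffer_sizes:
        start = time.perf_counter()
        read_row(size)
        elapsed = time.perf_counter() - start
        results.append((size, elapsed, elapsed >= threshold))
    return results
```

Running this for a row known to be slow, with sizes such as 256, 512, 1024, 2048 and 4096, should show the smallest buffer size at which that row is read quickly.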

I will continue to debug this issue today, but thought that someone may be able to shed some light on the issue.