incubator-cassandra-user mailing list archives

From Yiming Sun <yiming....@gmail.com>
Subject Re: Cassandra read throughput with little/no caching.
Date Fri, 21 Dec 2012 16:27:12 GMT
James, with RandomPartitioner, rows are placed according to a hash of their
key, so their order is effectively random. When you request these rows in
"sequential" order (sorted by date?), Cassandra is not reading them
sequentially.
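
To illustrate the point, here is a rough sketch in Python. It assumes the
token is the absolute value of the MD5 digest of the key read as a signed
128-bit integer, which approximates what RandomPartitioner does; the
date-style keys are made up for the example and are not taken from James's
data.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """Approximate a RandomPartitioner token (assumption: abs of the
    MD5 digest interpreted as a signed 128-bit big-endian integer)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


# Hypothetical row keys in date order, as the client would request them.
keys = ["2012-12-%02d" % day for day in range(1, 11)]

# Order in which the same keys would sit on the token ring.
by_token = sorted(keys, key=lambda k: random_partitioner_token(k.encode("ascii")))

if __name__ == "__main__":
    for k in by_token:
        print(k, random_partitioner_token(k.encode("ascii")))
```

Keys that are adjacent by date end up at unrelated tokens, so a scan in key
order turns into a series of scattered lookups across the ring.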

The data sizes you give (200MB, 300MB, and 40MB): are these the size of each
column, or the total size of each column family? It wasn't clear to me. If
they are the total sizes of the column families, most of the data should fit
in memory, so you should enable the row cache.

I have run some performance tests of my own on Cassandra, mostly on reads,
and could only get a read rate of less than 6MB/sec out of a 6-node cluster
with RF 2 using a single-threaded client. It made a huge difference when I
changed the client to an asynchronous, multi-threaded structure.
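
The effect can be sketched in Python without a cluster. A client that blocks
on every read cannot exceed 1/latency requests per second, which is 250
reads/sec at the 4000 microsecond latency James reported. In the sketch
below, fetch_row is a hypothetical stand-in that only sleeps for that long;
it is not a Hector or Cassandra call, and the worker count of 8 is an
arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor

SIMULATED_LATENCY_S = 0.004  # ~4000 microseconds per read, as quoted in the thread


def fetch_row(key):
    """Hypothetical stand-in for one blocking read; it only sleeps."""
    time.sleep(SIMULATED_LATENCY_S)
    return key, "payload-for-" + key


def read_sequential(keys):
    # One request in flight at a time: total time is about len(keys) * latency.
    return [fetch_row(k) for k in keys]


def read_concurrent(keys, workers=8):
    # Several requests in flight at once; map() returns results in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_row, keys))


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    keys = ["row-%03d" % i for i in range(100)]
    _, t_seq = timed(read_sequential, keys)
    _, t_con = timed(read_concurrent, keys)
    print("sequential: %.3fs, concurrent: %.3fs" % (t_seq, t_con))
```

The waiting time of the individual reads overlaps in the concurrent version,
which is why the client structure matters more than the node count when each
request is dominated by latency.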




On Fri, Dec 21, 2012 at 10:36 AM, James Masson <james.masson@opigram.com> wrote:

>
> Hi,
>
> thanks for the reply
>
>
> On 21/12/12 14:36, Yiming Sun wrote:
>
>> I have a few questions for you, James,
>>
>> 1. how many nodes are in your Cassandra ring?
>>
>
> 2 or 3 - depending on environment - it doesn't seem to make a difference
> to throughput very much. What is a 30 minute task on a 2 node environment
> is a 30 minute task on a 3 node environment.
>
>
>> 2. what is the replication factor?
>>
>
> 1
>
>> 3. when you say sequentially, what do you mean?  what Partitioner do you
>> use?
>>
>
> The data is organised by date - the keys are read sequentially in order,
> only once.
>
> Random partitioner - the data is equally spread across the nodes to avoid
> hotspots.
>
>
>> 4. how many columns per row?  how much data per row?  per column?
>>
>
> varies - described in the schema.
>
> create keyspace mykeyspace
>   with placement_strategy = 'SimpleStrategy'
>   and strategy_options = {replication_factor : 1}
>   and durable_writes = true;
>
>
> create column family entities
>   with column_type = 'Standard'
>   and comparator = 'BytesType'
>   and default_validation_class = 'BytesType'
>   and key_validation_class = 'AsciiType'
>   and read_repair_chance = 0.0
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 0
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = false
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'NONE'
>   and column_metadata = [
>     {column_name : '64656c65746564',
>     validation_class : BytesType,
>     index_name : 'deleted_idx',
>     index_type : 0},
>     {column_name : '6576656e744964',
>     validation_class : TimeUUIDType,
>     index_name : 'eventId_idx',
>     index_type : 0},
>     {column_name : '7061796c6f6164',
>     validation_class : UTF8Type}];
>
> 2 columns per row here - about 200Mb of data in total
>
>
> create column family events
>   with column_type = 'Standard'
>   and comparator = 'BytesType'
>   and default_validation_class = 'BytesType'
>   and key_validation_class = 'TimeUUIDType'
>   and read_repair_chance = 0.0
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 0
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = false
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'NONE';
>
> 1 column per row - about 300Mb of data
>
> create column family intervals
>   with column_type = 'Standard'
>   and comparator = 'BytesType'
>   and default_validation_class = 'BytesType'
>   and key_validation_class = 'AsciiType'
>   and read_repair_chance = 0.0
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 0
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = false
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'NONE';
>
> variable columns per row - about 40Mb of data.
>
>
>
>> 5. what client library do you use to access Cassandra?  (Hector?).  Is
>> your client code single threaded?
>>
>
> Hector - yes, the processing side of the client is single threaded, but is
> largely waiting for cassandra responses and has plenty of CPU headroom.
>
>
> I guess what I'm most interested in is why the discrepancy in between
> read/write latency - although I understand the data volume is much larger
> in reads, even though the request rate is lower.
>
> Network usage on a cassandra box barely gets above 20Mbit, including
> inter-cluster comms. Averages 5mbit client<>cassandra
>
> There is near zero disk I/O, and what little there is is served sub 1ms.
> Storage is backed by a very fast SAN, but like I said earlier, the dataset
> just about fits in the Linux disk cache. 2Gb VM, 512Mb cassandra heap - GCs
> are nice and quick, no JVM memory problems, used heap oscillates between
> 280-350Mb.
>
> Basically, I'm just puzzled as cassandra doesn't behave as I would expect.
> Huge CPU use in cassandra for very little throughput. I'm struggling to
> find anything that's wrong with the environment, there's no bottleneck that
> I can see.
>
> thanks
>
> James M
>
>
>
>
>>
>> On Fri, Dec 21, 2012 at 7:27 AM, James Masson <james.masson@opigram.com> wrote:
>>
>>
>>     Hi list-users,
>>
>>     We have an application that has a relatively unusual access pattern
>>     in cassandra 1.1.6
>>
>>     Essentially we read an entire multi hundred megabyte column family
>>     sequentially (little chance of a cassandra cache hit), perform some
>>     operations on the data, and write the data back to another column
>>     family in the same keyspace.
>>
>>     We do about 250 writes/sec and 100 reads/sec during this process.
>>     Write request latency is about 900 microsecs, read request latency
>>     is about 4000 microsecs.
>>
>>     * First Question: Do these numbers make sense?
>>
>>     read-request latency seems a little high to me, cassandra hasn't had
>>     a chance to cache this data, but it's likely in the Linux disk
>>     cache, given the sizing of the node/data/jvm.
>>
>>     thanks
>>
>>     James M
>>
>>
>>
