cassandra-user mailing list archives

From: Duncan Sands <duncan.sa...@gmail.com>
Subject: Re: Really high read latency
Date: Mon, 23 Mar 2015 08:22:51 GMT
Hi Dave,

On 23/03/15 05:56, Dave Galbraith wrote:
> Hi! So I've got a table like this:
>
> CREATE TABLE "default".metrics (
>     row_time int, attrs varchar, offset int, value double,
>     PRIMARY KEY (row_time, attrs, offset)
> ) WITH COMPACT STORAGE
>   AND bloom_filter_fp_chance = 0.01 AND caching = 'KEYS_ONLY' AND comment = ''
>   AND dclocal_read_repair_chance = 0 AND gc_grace_seconds = 864000
>   AND index_interval = 128 AND read_repair_chance = 1
>   AND replicate_on_write = 'true' AND populate_io_cache_on_flush = 'false'
>   AND default_time_to_live = 0 AND speculative_retry = 'NONE'
>   AND memtable_flush_period_in_ms = 0
>   AND compaction = {'class': 'DateTieredCompactionStrategy', 'timestamp_resolution': 'MILLISECONDS'}
>   AND compression = {'sstable_compression': 'LZ4Compressor'};

Does it work better with
   PRIMARY KEY ((row_time, attrs), offset)
?
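
For context: with "row_time" alone as the partition key, every point written in a
given day, across all 100k series, lands in a single partition, which lines up with
the ~37 GB "Compacted partition maximum bytes" in the stats below. A composite
(row_time, attrs) partition key gives one partition per series per day instead. A
minimal sketch of the changed definition (most WITH options elided for brevity; the
PRIMARY KEY line is the only real change):

   CREATE TABLE "default".metrics (
       row_time int, attrs varchar, offset int, value double,
       -- composite partition key: one partition per (day, series) pair
       PRIMARY KEY ((row_time, attrs), offset)
   ) WITH COMPACT STORAGE
     AND compaction = {'class': 'DateTieredCompactionStrategy', 'timestamp_resolution': 'MILLISECONDS'}
     AND compression = {'sstable_compression': 'LZ4Compressor'};

The SELECT below wouldn't need to change: it already restricts both row_time and
attrs, so it still hits exactly one partition. A query that restricts row_time
alone would no longer work, though.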

Ciao, Duncan.

>
> and I'm running Cassandra on an EC2 m3.2xlarge out in the cloud, with 4 GB of
> heap space. It's timeseries data, so I increment "row_time" each day, "attrs" is
> additional identifying information about each series, and "offset" is the number
> of milliseconds into the day for each data point. For the past 5 days, I've been
> inserting 3k points/second distributed across 100k distinct "attrs"es. And now
> when I try to run queries on this data that look like
>
> "SELECT * FROM "default".metrics WHERE row_time = 5 AND attrs = 'potatoes_and_jam'"
>
> it takes an absurdly long time and sometimes just times out. I did "nodetool
> cfstats default" and here's what I get:
>
> Keyspace: default
>      Read Count: 59
>      Read Latency: 397.12523728813557 ms.
>      Write Count: 155128
>      Write Latency: 0.3675690719921613 ms.
>      Pending Flushes: 0
>          Table: metrics
>          SSTable count: 26
>          Space used (live): 35146349027
>          Space used (total): 35146349027
>          Space used by snapshots (total): 0
>          SSTable Compression Ratio: 0.10386468749216264
>          Memtable cell count: 141800
>          Memtable data size: 31071290
>          Memtable switch count: 41
>          Local read count: 59
>          Local read latency: 397.126 ms
>          Local write count: 155128
>          Local write latency: 0.368 ms
>          Pending flushes: 0
>          Bloom filter false positives: 0
>          Bloom filter false ratio: 0.00000
>          Bloom filter space used: 2856
>          Compacted partition minimum bytes: 104
>          Compacted partition maximum bytes: 36904729268
>          Compacted partition mean bytes: 986530969
>          Average live cells per slice (last five minutes): 501.66101694915255
>          Maximum live cells per slice (last five minutes): 502.0
>          Average tombstones per slice (last five minutes): 0.0
>          Maximum tombstones per slice (last five minutes): 0.0
>
> Ouch! 400ms of read latency, orders of magnitude higher than it has any right to
> be. How could this have happened? Is there something fundamentally broken about
> my data model? Thanks!
>
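
For scale: at 3k points/second, one day is roughly 3,000 × 86,400 ≈ 259 million
cells, all in one partition under the original schema, which matches the ~37 GB
"Compacted partition maximum bytes" above. With (row_time, attrs) as the partition
key, that same day splits across the 100k "attrs" values, i.e. on the order of
2,600 cells per partition per day.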

