incubator-cassandra-user mailing list archives

From Philippe <watche...@gmail.com>
Subject Re: Scalability question
Date Tue, 16 Aug 2011 14:15:12 GMT
Teijo,

Unfortunately my data set really does grow because it's a time series. I'm
going to add a trick to aggregate old data, but it will still grow.
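
A minimal sketch of what such an aggregation could look like, assuming the
series is a set of timestamped counters and that points older than some
cutoff can be summed into one value per day. The function name `roll_up`,
the 30-day retention window, and the in-memory dict are all hypothetical
and only illustrate the idea; they are not taken from the actual schema or
client code.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def roll_up(points, now, keep_raw_for=timedelta(days=30)):
    """Split counters into recent raw points and per-day sums of older ones.

    points: dict mapping a timezone-aware datetime to an integer count.
    Returns (recent, daily), where daily is keyed by midnight of each day.
    """
    cutoff = now - keep_raw_for
    recent = {}
    daily = defaultdict(int)
    for ts, count in points.items():
        if ts >= cutoff:
            # Recent data keeps its original resolution.
            recent[ts] = count
        else:
            # Older data is collapsed into one counter per day.
            day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
            daily[day] += count
    return recent, dict(daily)
```

Aggregating this way bounds the number of columns kept for old data, but
the daily totals still accumulate, so the data set keeps growing, only more
slowly.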

How often do you repair per day (or is it really continuous?)
I've been running experiments, and I wonder whether your decision to run
continuous repairs stems from the same thing I observe: I emptied a keyspace
and started loading data into it (about 18,000 mutations/s). Every time I
run a repair on that keyspace, I get out-of-sync ranges.
I just don't see how that is possible given that
 - none of the nodes are going down
 - tpstats shows only occasional backlog on the nodes (up to 2000 pending
max)

Even weirder: when not writing to the keyspace, it took 4 consecutive
repairs before there were no out-of-sync ranges left. Is repair
probabilistic?

My CFs are created from the following template:

create column family PUBLIC_MONTHLY_20
  with column_type = Super
  with comparator = UTF8Type
  with subcomparator = BytesType
  and min_compaction_threshold=2 and read_repair_chance=0
  and keys_cached = 20
  and rows_cached = 50
  and default_validation_class = CounterColumnType and replicate_on_write=true;

Philippe

2011/8/16 Teijo Holzer <tholzer@wetafx.co.nz>

> Hi,
>
> we have come across this as well. We continuously run rolling repairs
> followed by major compactions followed by a gc() (or node restart) to get
> rid of all these sstable files. Combined with aggressive ttls on most
> inserts, the cluster stays nice and lean.
>
> You don't want your working set to grow indefinitely.
>
> Cheers,
>
>        T.
>
>
>
> On 16/08/11 08:08, Philippe wrote:
>
>> Forgot to mention that stopping & restarting the server brought the data
>> directory down to 283GB in less than 1 minute.
>>
>> Philippe
>> 2011/8/15 Philippe <watcherfr@gmail.com>
>>
>>
>>        It's another reason to avoid major / manual compactions, which
>>        create a single big SSTable. Minor compactions keep things in
>>        buckets, which means newer SSTables can be compacted without
>>        needing to read the bigger, older tables.
>>
>>    I've never run a major/manual compaction on this ring.
>>    In my case running repair on a "big" keyspace results in SSTables
>>    piling up. My problematic node just filled up 483GB (yes, GB) of
>>    SSTables. Here are the biggest:
>>    ls -laSrh
>>    (...)
>>
>>    -rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:13 PUBLIC_MONTHLY_20-g-4581-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  2.7G 2011-08-15 14:52 PUBLIC_MONTHLY_20-g-4641-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  2.8G 2011-08-15 14:39 PUBLIC_MONTHLY_20-tmp-g-4878-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  2.9G 2011-08-15 15:00 PUBLIC_MONTHLY_20-g-4656-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 14:17 PUBLIC_MONTHLY_20-g-4599-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  3.0G 2011-08-15 15:11 PUBLIC_MONTHLY_20-g-4675-Data.db
>>    -rw-r--r-- 3 cassandra cassandra  3.1G 2011-08-13 10:34 PUBLIC_MONTHLY_18-g-3861-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  3.2G 2011-08-15 14:41 PUBLIC_MONTHLY_20-tmp-g-4884-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  3.6G 2011-08-15 14:44 PUBLIC_MONTHLY_20-tmp-g-4894-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:56 PUBLIC_MONTHLY_20-tmp-g-4934-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  3.8G 2011-08-15 14:46 PUBLIC_MONTHLY_20-tmp-g-4905-Data.db
>>    -rw-r--r-- 1 cassandra cassandra  4.0G 2011-08-15 14:57 PUBLIC_MONTHLY_20-tmp-g-4935-Data.db
>>    -rw-r--r-- 3 cassandra cassandra  5.9G 2011-08-13 12:53 PUBLIC_MONTHLY_19-g-4219-Data.db
>>    -rw-r--r-- 3 cassandra cassandra  6.0G 2011-08-13 13:57 PUBLIC_MONTHLY_20-g-4538-Data.db
>>    -rw-r--r-- 3 cassandra cassandra   12G 2011-08-13 09:27 PUBLIC_MONTHLY_20-g-4501-Data.db
>>
>>
>>    On the other nodes the same directory is around 69GB. Why are there so
>>    few large files there and so many big ones on the repairing node?
>>    -rw-r--r-- 1 cassandra cassandra 434M 2011-08-15 16:02 PUBLIC_MONTHLY_17-g-3525-Data.db
>>    -rw-r--r-- 1 cassandra cassandra 456M 2011-08-15 15:50 PUBLIC_MONTHLY_19-g-4253-Data.db
>>    -rw-r--r-- 1 cassandra cassandra 485M 2011-08-15 14:30 PUBLIC_MONTHLY_20-g-5280-Data.db
>>    -rw-r--r-- 1 cassandra cassandra 572M 2011-08-15 15:15 PUBLIC_MONTHLY_18-g-3774-Data.db
>>    -rw-r--r-- 2 cassandra cassandra 664M 2011-08-09 15:39 PUBLIC_MONTHLY_20-g-4893-Index.db
>>    -rw-r--r-- 2 cassandra cassandra 811M 2011-08-11 21:27 PUBLIC_MONTHLY_16-g-2597-Data.db
>>    -rw-r--r-- 2 cassandra cassandra 915M 2011-08-13 04:00 PUBLIC_MONTHLY_18-g-3695-Data.db
>>    -rw-r--r-- 1 cassandra cassandra 925M 2011-08-15 03:39 PUBLIC_MONTHLY_17-g-3454-Data.db
>>    -rw-r--r-- 1 cassandra cassandra 1.3G 2011-08-15 13:46 PUBLIC_MONTHLY_19-g-4199-Data.db
>>    -rw-r--r-- 2 cassandra cassandra 1.5G 2011-08-10 15:37 PUBLIC_MONTHLY_17-g-3218-Data.db
>>    -rw-r--r-- 1 cassandra cassandra 1.9G 2011-08-15 14:35 PUBLIC_MONTHLY_20-g-5281-Data.db
>>    -rw-r--r-- 2 cassandra cassandra 2.1G 2011-08-10 16:33 PUBLIC_MONTHLY_19-g-3946-Data.db
>>    -rw-r--r-- 2 cassandra cassandra 3.1G 2011-08-10 22:23 PUBLIC_MONTHLY_18-g-3509-Data.db
>>    -rw-r--r-- 2 cassandra cassandra 4.0G 2011-08-10 18:18 PUBLIC_MONTHLY_20-g-5024-Data.db
>>    -rw------- 2 cassandra cassandra 5.1G 2011-08-09 15:23 PUBLIC_MONTHLY_19-g-3847-Data.db
>>    -rw-r--r-- 2 cassandra cassandra 9.6G 2011-08-09 15:39 PUBLIC_MONTHLY_20-g-4893-Data.db
>>
>>    This whole compaction thing is getting me worried: how are sites in
>>    production dealing with SSTables becoming larger and larger and thus
>>    taking longer and longer to compact? Adding nodes every couple of weeks?
>>
>>    Philippe
>>
>>
>>
>
