incubator-cassandra-user mailing list archives

From Sylvain Lebresne <sylv...@datastax.com>
Subject Re: is it possible for light-traffic CF to hold down many commit logs?
Date Fri, 23 Sep 2011 07:10:18 GMT
In 1.0.0, you have this global setting in cassandra.yaml:

# Total space to use for commitlogs.
# If space gets above this value (it will round up to the next nearest
# segment multiple), Cassandra will flush every dirty CF in the oldest
# segment and remove it.
# commitlog_total_space_in_mb: 4096

In 0.8, you're supposed to set the memtableFlushAfterMins property
on each CF to avoid filling up your commit log partition. That is a
little more involved, which is why we improved it in 1.0.
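
As a rough illustration of the scenario Yang describes below, here is a toy
model in Python. It is only a sketch under simplifying assumptions, not
Cassandra's actual code: every write has unit size, the largest memtable is
flushed whenever the total reaches a threshold, and a segment can be deleted
once no CF has unflushed writes in it. The names peak_segments, workload,
segment_size, flush_threshold and max_segments are all made up for this
example; max_segments stands in for the commitlog_total_space_in_mb cap.

```python
from collections import defaultdict


def peak_segments(writes, segment_size, flush_threshold, max_segments=None):
    """Toy model: return the peak number of commit log segments retained.

    writes          -- iterable of CF names, one unit-sized write each
    segment_size    -- writes per commit log segment
    flush_threshold -- total unflushed writes that triggers a flush
    max_segments    -- optional cap on retained segments (must be >= 1)
    """
    segments = [set()]          # per segment: CFs with unflushed writes in it
    fill = 0                    # writes in the active (last) segment
    memtable = defaultdict(int)  # unflushed writes per CF
    peak = 1

    def drop_clean():
        # Delete every clean segment except the active one.
        segments[:] = [s for s in segments[:-1] if s] + [segments[-1]]

    def flush(cf):
        # Flushing a CF makes it clean in every segment.
        memtable.pop(cf, None)
        for seg in segments:
            seg.discard(cf)
        drop_clean()

    for cf in writes:
        if fill == segment_size:
            segments.append(set())  # roll over to a new segment
            fill = 0
            drop_clean()
        segments[-1].add(cf)
        fill += 1
        memtable[cf] += 1

        # Memory pressure: flush only the largest memtable.
        if sum(memtable.values()) >= flush_threshold:
            flush(max(memtable, key=memtable.get))

        # Space cap: flush every dirty CF in the oldest segment.
        if max_segments is not None:
            while len(segments) > max_segments:
                for dirty in list(segments[0]):
                    flush(dirty)

        peak = max(peak, len(segments))
    return peak


def workload(cycles):
    """999 writes to the busy CF followed by 1 write to the light CF."""
    for _ in range(cycles):
        for _ in range(999):
            yield "busy"
        yield "light"


if __name__ == "__main__":
    print("no cap:", peak_segments(workload(20), 100, 50))
    print("cap=4: ", peak_segments(workload(20), 100, 50, max_segments=4))
```

Without a cap, every segment that holds a write for the light CF stays
pinned until that CF is finally flushed. With the cap, the light CF is
flushed as soon as the oldest segment has to go, so the number of retained
segments stays bounded.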

--
Sylvain


On Fri, Sep 23, 2011 at 7:47 AM, Yang <teddyyyy123@gmail.com> wrote:
> Thanks for the input.
>
> If that's the case, I think the solution would be to sort the CFs to
> flush by a more complex criterion than just size. For example, the
> number of dirty commit logs that contain a CF could be factored into
> its score.
>
> Yang
>
> On Thu, Sep 22, 2011 at 10:40 PM, Philippe <watcherfr@gmail.com> wrote:
>> It sure looks like what I'm seeing on my cluster, where a 100G commit log
>> partition fills up in 12 hours (0.8.x).
>>
>> On 23 Sep 2011 03:45, "Yang" <teddyyyy123@gmail.com> wrote:
>>> In 1.0.0 we don't have memtable_throughput for each individual CF;
>>> instead, which memtable/CF to flush is determined by the largest
>>> getTotalMemtableLiveSize() (MeteredFlusher.java, line 81).
>>>
>>>
>>> What would happen in the following case? I have only 2 CFs, and the
>>> traffic for one CF is 1000 times that of the second CF, so the
>>> high-traffic CF constantly triggers the total memory threshold, and
>>> every time it is the busy CF that gets flushed.
>>>
>>> But the light-traffic CF is never flushed (well, not until we have
>>> flushed the busy CF about 1000 times). Now we are left with many
>>> commit logs, each containing a few entries for the light-traffic CF.
>>> We have to keep these commit logs because those entries have not
>>> been flushed to an sstable yet.
>>>
>>> Then there are 2 problems: 1) to persist the few records from the
>>> light-traffic CF, you have to keep 1000 times the commit logs
>>> necessary, taking up disk space; 2) when you recover on server
>>> restart, you'll have to read through all those commit logs.
>>>
>>> Does the above hypothesis sound right?
>>>
>>> Thanks
>>> Yang
>>
>
