incubator-cassandra-user mailing list archives

From Yang <teddyyyy...@gmail.com>
Subject Re: is it possible for light-traffic CF to hold down many commit logs?
Date Fri, 23 Sep 2011 07:18:12 GMT
Thanks Sylvain, this is exactly what I need.




On Fri, Sep 23, 2011 at 12:10 AM, Sylvain Lebresne <sylvain@datastax.com> wrote:
> In 1.0.0, you have:
>
> # Total space to use for commitlogs.
> # If space gets above this value (it will round up to the next nearest
> # segment multiple), Cassandra will flush every dirty CF in the oldest
> # segment and remove it.
> # commitlog_total_space_in_mb: 4096
>
> In 0.8, you're supposed to use the memtableFlushAfterMins property
> for each CF to avoid filling up your commit log partition. That is a
> little more involved, which is why we improved it in 1.0.
>
> --
> Sylvain
>
>
> On Fri, Sep 23, 2011 at 7:47 AM, Yang <teddyyyy123@gmail.com> wrote:
>> Thanks for the input.
>>
>> If that's the case, I think the solution would be to sort the CFs to
>> flush by a more complex criterion than just size. For example, the
>> number of dirty commit logs that contain this CF should be considered
>> as a score.
>>
>> Yang
>>
>> On Thu, Sep 22, 2011 at 10:40 PM, Philippe <watcherfr@gmail.com> wrote:
>>> It sure looks like what I'm seeing on my cluster, where a 100G commit log
>>> partition fills up in 12 hours (0.8.x).
>>>
>>> On 23 Sep 2011 03:45, "Yang" <teddyyyy123@gmail.com> wrote:
>>>> In 1.0.0 we don't have memtable_throughput for each individual CF;
>>>> instead, which memtable/CF to flush is determined by the "largest
>>>> getTotalMemtableLiveSize()"
>>>> (MeteredFlusher.java line 81).
>>>>
>>>>
>>>> What would happen in the following case? I have only 2 CFs, and the
>>>> traffic for one CF is 1000 times that of the second CF,
>>>> so the high-traffic CF constantly triggers the total memory threshold, and
>>>> every time, the busy CF is flushed.
>>>>
>>>> But the light-traffic CF is never flushed (well, not until we have
>>>> flushed the busy CF about 1000 times).
>>>> Now we are left with many commit logs, each of them containing a few
>>>> entries for the light-traffic CF. We have to keep these commit logs
>>>> because these entries have not been flushed to an sstable yet.
>>>>
>>>> Then there are 2 problems: 1) to persist the few records from the
>>>> light-traffic CF, you have to keep 1000 times the commit logs
>>>> necessary, taking up disk space; 2) when you recover on server
>>>> restart, you'll have to read through all those commit logs.
>>>>
>>>> Does the above hypothesis sound right?
>>>>
>>>> Thanks
>>>> Yang
>>>
>>
>
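
The scenario discussed in this thread can be illustrated with a toy model. The sketch below is not Cassandra code: the function name simulate, its parameters, and all the numbers are made up for illustration. It assumes only what the thread describes, namely that the CF with the largest memtable is flushed when a global memory threshold is hit, that a commit log segment can be dropped only once no CF has unflushed writes in it, and that an optional cap (loosely modelled on commitlog_total_space_in_mb) flushes every dirty CF in the oldest segment.

```python
def simulate(total_writes, busy_ratio, writes_per_segment,
             flush_threshold, max_segments=None):
    """Toy model: return the peak number of retained commit log segments."""
    segments = [set()]   # each segment = set of CFs with unflushed writes in it
    written = 0          # number of writes in the active (last) segment
    memtable = {"busy": 0, "light": 0}
    peak = 1

    def flush(cf):
        # Flushing a CF makes every segment clean with respect to that CF.
        memtable[cf] = 0
        for seg in segments:
            seg.discard(cf)
        # Keep the active segment; drop every older segment that is now clean.
        segments[:] = [s for s in segments[:-1] if s] + [segments[-1]]

    for i in range(total_writes):
        # One light-traffic write for every busy_ratio busy writes.
        cf = "light" if i % (busy_ratio + 1) == 0 else "busy"
        if written == writes_per_segment:
            segments.append(set())   # roll over to a new segment
            written = 0
        segments[-1].add(cf)
        written += 1
        memtable[cf] += 1

        # Global memory pressure: flush the CF with the largest memtable.
        if memtable["busy"] + memtable["light"] >= flush_threshold:
            flush(max(memtable, key=memtable.get))

        # Optional cap: flush every dirty CF in the oldest segment.
        if max_segments is not None and len(segments) > max_segments:
            for dirty_cf in sorted(segments[0]):
                flush(dirty_cf)

        peak = max(peak, len(segments))
    return peak


params = dict(total_writes=100_000, busy_ratio=1000,
              writes_per_segment=5000, flush_threshold=1000)
print("peak segments, no cap:  ", simulate(**params))
print("peak segments, cap of 4:", simulate(**params, max_segments=4))
```

With these made-up parameters, the uncapped run keeps all 20 segments alive because each one holds a few unflushed writes from the light-traffic CF, while the capped run never retains more than 4. The model ignores real segment sizes, flush timing, and replay positions, so it only shows the shape of the problem.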
