incubator-cassandra-user mailing list archives

From Andrew Rollins <and...@localytics.com>
Subject Re: Write assurance in Cassandra
Date Sun, 04 Jul 2010 08:11:06 GMT
Is your IO under heavy load? If it is, that may be the cause; otherwise I'm
not sure what would cause significant lag. On Linux I like to use
"iostat -tx 10" to check IO.

- Andrew


On Sun, Jul 4, 2010 at 4:04 AM, David Boxenhorn <david@lookin2.com> wrote:

> Thank you very much! I now understand things much better.
>
> However, my configuration is as follows:
>
>   <CommitLogSync>periodic</CommitLogSync>
>   <CommitLogSyncPeriodInMS>10000</CommitLogSyncPeriodInMS>
>
> So I should see my commit log change after 10,000 milliseconds = 10
> seconds? It seems to take much longer to show up.
>
> On Sun, Jul 4, 2010 at 10:52 AM, Andrew Rollins <andrew@localytics.com> wrote:
>
>> By default Cassandra syncs the commit log to disk periodically, so if you
>> are looking at file sizes, you won't see the most up to date numbers. This
>> is just like how if you tail a file that isn't flushing frequently, you
>> might wait a little while before you see the updates.
>>
>> In periodic mode, Cassandra acknowledges the write to the client
>> immediately (even before it is synced). You can run Cassandra in batch mode
>> instead, which basically means it writes in batches *and* it won't
>> acknowledge the writes to the client until it has actually synced. I'm still
>> somewhat new to this, but that's my understanding.
>>
>> Have a look at CommitLogSync in your storage-conf.xml for more info about
>> setting up syncing periods.
>>
>> As an aside, I'm not sure why the "ack immediately" or "ack after sync"
>> setting is piggybacked on the periodic vs batch setting. At first glance it
>> seems like the two concepts should be independent of one another.
>>
>> - Andrew
>>
>>
>> On Sun, Jul 4, 2010 at 3:34 AM, David Boxenhorn <david@lookin2.com> wrote:
>>
>>> As I understand it, when you write to Cassandra, you are assured that, if
>>> successful, the new data has been written to a log file - so that if there
>>> is a crash your data is safe. Is this correct?
>>>
>>> If the above is correct, there is something going on that I don't
>>> understand. Are the log files to which the data is first written the ones
>>> that look like /var/lib/cassandra/commitlog/CommitLog-1277998453387.log ?
>>> The reason I ask is that when I write a lot of data, nothing seems to change
>>> in the commitlog directory for a long time, then at some point the log files
>>> in this directory get updated. It looks to me like there's memory caching
>>> involved, and the new data is not being immediately written to disk. What is
>>> going on?
>>>
>>
>>
>
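
The effect described in the thread (data is accepted, but the file on disk
does not appear to change until a later sync) can be reproduced with any
buffered writer. The following is a minimal sketch of that general behaviour,
not of Cassandra's commit log code: the function name
sizes_before_and_after_sync, the temporary file, and the 1 MiB buffer size are
all assumptions made for illustration.

```python
import os
import tempfile
from typing import Tuple


def sizes_before_and_after_sync(payload: bytes) -> Tuple[int, int]:
    """Return the on-disk file size before and after an explicit flush + fsync.

    Hypothetical helper for illustration only; it is not part of Cassandra.
    """
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        # A 1 MiB user-space buffer stands in for writes that have been
        # accepted by the writer but not yet pushed out to the file.
        with open(path, "wb", buffering=1024 * 1024) as log:
            log.write(payload)
            before = os.path.getsize(path)  # payload is still in the buffer
            log.flush()                     # hand the bytes to the OS
            os.fsync(log.fileno())          # ask the OS to persist them
            after = os.path.getsize(path)
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    before, after = sizes_before_and_after_sync(b"x" * 100)
    print(f"before sync: {before} bytes, after sync: {after} bytes")
```

With a payload smaller than the buffer, the size observed before the flush is
0 and the size afterwards is the payload length. This is the same reason that
watching file sizes in the commitlog directory can lag behind the writes that
have already been acknowledged in periodic mode.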
