incubator-cassandra-user mailing list archives

From Chuan-Heng Hsiao <hsiao.chuanh...@gmail.com>
Subject Re: huge commitlog
Date Mon, 19 Nov 2012 03:21:25 GMT
I have RF = 3. Read and write consistency are both already set to TWO.

It did seem that the data were not consistent yet.
(There are some CFs that I expected to be empty after the operations, but they
still returned some data, and the amount of data returned decreased each time
I retried reading all data from that CF.)
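
For reference, a rough sketch of the usual replica-overlap arithmetic behind these settings: a read is guaranteed to touch at least one replica that acknowledged a given write when the read and write replica counts add up to more than RF. This is the general rule, not something stated in this thread, and the function name is made up for illustration.

```python
def replicas_overlap(replication_factor, read_replicas, write_replicas):
    """Return True when any read set must share a replica with any write set."""
    return read_replicas + write_replicas > replication_factor


# RF = 3 with read TWO and write TWO, as in the setup described above.
print(replicas_overlap(3, 2, 2))

# RF = 3 with read ONE and write ONE, for comparison.
print(replicas_overlap(3, 1, 1))
```

With RF = 3 and TWO for both reads and writes, 2 + 2 > 3 holds, so the consistency level alone is unlikely to explain the leftover data.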

Sincerely,
Hsiao


On Mon, Nov 19, 2012 at 11:14 AM, Tupshin Harper <tupshin@tupshin.com> wrote:
> What consistency level are you writing with? If you were writing with ANY,
> try writing with a higher consistency level.
>
> -Tupshin
>
> On Nov 18, 2012 9:05 PM, "Chuan-Heng Hsiao" <hsiao.chuanheng@gmail.com>
> wrote:
>>
>> Hi Aaron,
>>
>> Thank you very much for replying.
>>
>> The 700 CFs were created in the beginning (before any insertion.)
>>
>> I did not do anything with commitlog_archiving.properties, so I guess
>> I was not using commit log archiving.
>>
>> What I did was run a lot of insertions (and some deletions)
>> from another 4 machines with 32 processes in total.
>> (There are 4 nodes in my cluster, so there are 8 machines in total.)
>>
>> I did see huge logs in /var/log/cassandra after such a huge number of
>> insertions.
>> Right now I can't tell whether a single insertion also causes huge
>> logs.
>>
>> nodetool flush hung (maybe because of the 200G+ commitlog).
>>
>> Because these machines are not in production (no more insertions or
>> deletions are guaranteed), I ended up restarting Cassandra one node at a
>> time, and the commitlog shrank back to 4G. I am running repair on each
>> node now.
>>
>> I'll try to re-import and keep the logs if the commitlog grows that large
>> again.
>>
>> Sincerely,
>> Hsiao
>>
>>
>> On Mon, Nov 19, 2012 at 3:19 AM, aaron morton <aaron@thelastpickle.com>
>> wrote:
>> > I am wondering whether the huge commitlog size is the expected behavior
>> > or
>> > not?
>> >
>> > Nope.
>> >
>> > Did you notice the large log size during or after the inserts?
>> > If after, did the size settle?
>> > Are you using commit log archiving (in commitlog_archiving.properties)?
>> >
>> > and around 700 mini column family (around 10M in data_file_directories)
>> >
>> > Can you describe how you created the 700 CFs?
>> >
>> > and how can we reduce the size of commitlog?
>> >
>> > As a workaround, nodetool flush should checkpoint the log.
>> >
>> > Cheers
>> >
>> > -----------------
>> > Aaron Morton
>> > Freelance Cassandra Developer
>> > New Zealand
>> >
>> > @aaronmorton
>> > http://www.thelastpickle.com
>> >
>> > On 17/11/2012, at 2:30 PM, Chuan-Heng Hsiao <hsiao.chuanheng@gmail.com>
>> > wrote:
>> >
>> > Hi Cassandra Developers,
>> >
>> > I am seeing a huge commitlog (200+G) after inserting a huge
>> > amount of data.
>> > It is a 4-node cluster with RF = 3, and currently each node has a 200+G
>> > commit log (so there is around 1T of commit log in total).
>> >
>> > commitlog_total_space_in_mb is left at its default.
>> >
>> > I am using 1.1.6.
>> >
>> > I have not run nodetool cleanup or nodetool flush yet, but
>> > I did run nodetool repair -pr for each column family.
>> >
>> > There is 1 huge column family (around 68G in data_file_directories),
>> > 18 mid-sized column families (around 1G each in data_file_directories),
>> > and around 700 mini column families (around 10M each in data_file_directories).
>> >
>> > I am wondering whether the huge commitlog size is the expected behavior
>> > or
>> > not?
>> > and how can we reduce the size of commitlog?
>> >
>> > Sincerely,
>> > Hsiao
>> >
>> >
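
For anyone hitting the same symptom, here is a small sketch for comparing the on-disk commitlog size with the configured cap. The path /var/lib/cassandra/commitlog and the 4096 MB cap are assumptions for illustration only (check commitlog_directory and commitlog_total_space_in_mb in your own cassandra.yaml); neither value is stated in this thread, and the helper names are made up.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path (0 if path is missing)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # A segment may be recycled or deleted while we are walking.
                pass
    return total


def exceeds_cap(path, cap_mb):
    """Return True when the directory is larger than cap_mb megabytes."""
    return directory_size_bytes(path) > cap_mb * 1024 * 1024


if __name__ == "__main__":
    commitlog_dir = "/var/lib/cassandra/commitlog"  # assumed location
    cap_mb = 4096  # assumed cap, not taken from the thread
    size_mb = directory_size_bytes(commitlog_dir) / (1024 * 1024)
    print("commitlog size: %.1f MB, over cap: %s"
          % (size_mb, exceeds_cap(commitlog_dir, cap_mb)))
```

Running this from cron on each node during a bulk import would show whether the commitlog grows steadily during the inserts or only afterwards, which is the question asked earlier in the thread.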
