incubator-cassandra-user mailing list archives

From aaron morton <aa...@thelastpickle.com>
Subject Re: huge commitlog
Date Sun, 25 Nov 2012 19:52:00 GMT
> I checked the log, and found some ERROR about network problems,
> and some ERROR about "Keys must not be empty".
Do you have the full error stack ?
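For reference, one way to pull the full stack trace out of the Cassandra log. This is a sketch: the heredoc below is a stand-in for a real log file, the `at ...` frames are placeholders, and the default path `/var/log/cassandra/system.log` is an assumption about a package install.

```shell
# Sketch: extract the full stack trace for the "Keys must not be empty"
# error. The heredoc is a stand-in for a real system.log (on package
# installs the log usually lives at /var/log/cassandra/system.log);
# the "at ..." frames are placeholders, not real Cassandra output.
LOG=$(mktemp)
cat > "$LOG" <<'EOF'
ERROR 12:00:00,000 Keys must not be empty
        at ...
        at ...
EOF
# -A 20 prints the 20 lines after each match, which normally covers
# the whole stack trace
STACK=$(grep -A 20 "Keys must not be empty" "$LOG")
echo "$STACK"
rm -f "$LOG"
```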

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 25/11/2012, at 4:14 AM, Chuan-Heng Hsiao <hsiao.chuanheng@gmail.com> wrote:

> Hi Cassandra Devs,
> 
> After setting up the same configuration (and importing the same data)
> on 3 VMs on the same machine instead of 3 physical machines,
> I have so far been unable to replicate the exploded-commitlog situation.
> 
> On my 4-physical-machine setting, everything seems to be
> back to normal (commitlog size is less than the expected max setting)
> after restarting the nodes.
> 
> This time the commitlog size of one node is set to 4G, and the
> others are set to 8G.
> 
> A few days ago the 4G node's commitlog exploded to 5+G (the 8G nodes remained at 8G).
> I checked the log, and found some ERROR about network problems,
> and some ERROR about "Keys must not be empty".
> 
> I suspect that besides the network problems,
> the "Keys must not be empty" ERROR may be the main reason why
> the commitlog keeps growing.
> (I've already ensured that keys are not empty in my code,
> so the problem may arise during Cassandra's internal syncing.)
> 
> I restarted the 4G node as an 8G node. Because there has been no heavy
> traffic since then, I am not yet sure whether increasing the commitlog
> size will solve or reduce this problem.
> I'll keep you posted once the commitlog explodes again.
> 
> Sincerely,
> Hsiao
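For reference, the size cap discussed above is the `commitlog_total_space_in_mb` setting in cassandra.yaml. A minimal sketch matching the sizes in this thread (the 8192 value is an example, not a recommendation):

```yaml
# cassandra.yaml -- total disk space the commit log may use before
# Cassandra flushes the oldest dirty memtables and recycles segments.
# 8192 MB matches the 8G nodes described above; the 1.1 default is
# 4096 MB on a 64-bit JVM.
commitlog_total_space_in_mb: 8192
```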
> 
> 
> On Mon, Nov 19, 2012 at 11:21 AM, Chuan-Heng Hsiao
> <hsiao.chuanheng@gmail.com> wrote:
>> I have RF = 3. Read/Write consistency has already been set as TWO.
>> 
>> The data did indeed seem to be not yet consistent.
>> (Some CFs that I expected to be empty after the operations still
>> returned data, and the amount of data decreased each time I retried
>> fetching all the data from those CFs.)
>> 
>> Sincerely,
>> Hsiao
>> 
>> 
>> On Mon, Nov 19, 2012 at 11:14 AM, Tupshin Harper <tupshin@tupshin.com> wrote:
>>> What consistency level are you writing with? If you were writing with ANY,
>>> try writing with a higher consistency level.
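As a back-of-the-envelope check of why the level matters: with RF = 3, a read and a write are guaranteed to overlap on at least one replica whenever R + W > RF, while ANY makes no such guarantee (a hinted write may initially land on no replica at all). A small sketch of that arithmetic:

```shell
# Rule of thumb for strongly consistent reads in Cassandra:
# read replicas + write replicas must exceed the replication factor.
rf=3
check() {  # check <r> <w> -> prints whether the two levels overlap
  if [ $(( $1 + $2 )) -gt "$rf" ]; then
    echo "R=$1 W=$2: overlapping (consistent reads)"
  else
    echo "R=$1 W=$2: no overlap (stale reads possible)"
  fi
}
check 2 2   # TWO/TWO, as used in this thread
check 1 1   # ONE/ONE
```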
>>> 
>>> -Tupshin
>>> 
>>> On Nov 18, 2012 9:05 PM, "Chuan-Heng Hsiao" <hsiao.chuanheng@gmail.com>
>>> wrote:
>>>> 
>>>> Hi Aaron,
>>>> 
>>>> Thank you very much for the reply.
>>>> 
>>>> The 700 CFs were created in the beginning (before any insertions).
>>>> 
>>>> I did not do anything with commitlog_archiving.properties, so I guess
>>>> I was not using commit log archiving.
>>>> 
>>>> What I did was perform a lot of insertions (and some deletions)
>>>> using another 4 machines with 32 processes in total.
>>>> (There are 4 nodes in my setting, so there are 8 machines in total.)
>>>> 
>>>> I did see huge logs in /var/log/cassandra after that huge amount of
>>>> insertions.
>>>> Right now I can't tell whether a single insertion also causes huge
>>>> logs.
>>>> 
>>>> nodetool flush hung (maybe because of the 200G+ commitlog)
>>>> 
>>>> Because these machines are not in production (guaranteed no more
>>>> insertions/deletions), I ended up restarting Cassandra one node at a
>>>> time, and the commitlog shrank back to 4G. I am running repair on
>>>> each node now.
>>>> 
>>>> I'll try to re-import and keep the logs if the commitlog grows
>>>> insanely again.
>>>> 
>>>> Sincerely,
>>>> Hsiao
>>>> 
>>>> 
>>>> On Mon, Nov 19, 2012 at 3:19 AM, aaron morton <aaron@thelastpickle.com>
>>>> wrote:
>>>>> I am wondering whether the huge commitlog size is the expected
>>>>> behavior or not?
>>>>> 
>>>>> Nope.
>>>>> 
>>>>> Did you notice the large log size during or after the inserts ?
>>>>> If after did the size settle ?
>>>>> Are you using commit log archiving ? (in commitlog_archiving.properties)
>>>>> 
>>>>> and around 700 mini column families (around 10M in data_file_directories)
>>>>> 
>>>>> Can you describe how you created the 700 CF's ?
>>>>> 
>>>>> and how can we reduce the size of commitlog?
>>>>> 
>>>>> As a work around nodetool flush should checkpoint the log.
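A sketch of that workaround (the host, the commitlog path, and the guard are assumptions from a default install; `nodetool flush` writes all memtables to disk, which lets the commit log mark its segments clean and recycle them):

```shell
# Flush memtables so commit log segments can be marked clean and
# recycled, then check how much space the commit log still uses.
# Host and path are defaults; adjust for your install. Guarded so
# the sketch degrades gracefully where nodetool is not on the PATH.
if command -v nodetool >/dev/null 2>&1; then
  nodetool -h 127.0.0.1 flush
  STATUS="flushed: $(du -sh /var/lib/cassandra/commitlog)"
else
  STATUS="nodetool not found; run this on a Cassandra node"
fi
echo "$STATUS"
```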
>>>>> 
>>>>> Cheers
>>>>> 
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Developer
>>>>> New Zealand
>>>>> 
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>> 
>>>>> On 17/11/2012, at 2:30 PM, Chuan-Heng Hsiao <hsiao.chuanheng@gmail.com>
>>>>> wrote:
>>>>> 
>>>>> Hi Cassandra Developers,
>>>>> 
>>>>> I am experiencing a huge commitlog size (200+G) after inserting a huge
>>>>> amount of data.
>>>>> It is a 4-node cluster with RF = 3, and currently each node has 200+G
>>>>> of commit log (so there is around 1T of commit log in total).
>>>>> 
>>>>> The setting of commitlog_total_space_in_mb is default.
>>>>> 
>>>>> I am using 1.1.6.
>>>>> 
>>>>> I did not do nodetool cleanup and nodetool flush yet, but
>>>>> I did nodetool repair -pr for each column family.
>>>>> 
>>>>> There is 1 huge column family (around 68G in data_file_directories),
>>>>> 18 mid-sized column families (around 1G in data_file_directories),
>>>>> and around 700 mini column families (around 10M in data_file_directories).
>>>>> 
>>>>> I am wondering whether the huge commitlog size is the expected
>>>>> behavior or not?
>>>>> and how can we reduce the size of commitlog?
>>>>> 
>>>>> Sincerely,
>>>>> Hsiao
>>>>> 
>>>>> 

