cassandra-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ivan Georgiev <yngw...@bk.ru>
Subject Re: 0.7.2 slow memtables flushing
Date Mon, 21 Feb 2011 12:07:06 GMT
I did some very rough measurements in a desperate attempt to see if I 
can find the issue (if there is an issue).
Since I dont know the code base well enough i chose 
BufferedRandomAccessFile as my suspect, since it was rewritten from 
0.7.0 to 0.7.1

I did rough measurements on how many times write() and rebuffer() were 
called, and how much time was spent in them. The measurements
were printed out when close() on the file was called. Time measurements 
are rounded in seconds. I have included the measurements only for the
large files that were flushed to disk.

0.7.0
Total time in reBuffer : 0 seconds.
Total times called reBuffer : 2
Total time in write : 0 seconds.
Total times called write : 579374

0.7.2
Time spent in reBuffer : 67 seconds.
Times called reBuffer : 30888
Time spent in write : 0 seconds.
Times called write : 1884107

Seems like rebuffer is being called a lot more in 0.7.2 and thats where 
the performance degradation is coming from.

rebuffer can end up being called from write itself, but since the time 
spent in write is insignificant, that means this excessive
calling of rebuffer in 0.7.2 is not happening from the write path.

Ivan


On 20.2.2011 г. 05:21 ч., Ivan Georgiev wrote:
> Is it possible that the changes to the BufferedRandomAccessFile.java 
> could be causing the issue ?
> I think the most notable change there is using ByteBuffer instead of a 
> byte[] for the buffer.
>
> I will run more tests and see what comes out of it.
>
> Ivan
>
> On 20.2.2011 г. 05:03 ч., Jonathan Ellis wrote:
>> I guess you'll need to binary search through the 0.7.1 changes to find
>> what made the difference.  I can't think of any obvious candidates.
>>
>> On Sat, Feb 19, 2011 at 10:32 AM, Ivan Georgiev<yngwiie@bk.ru>  wrote:
>>> On 19.2.2011 г. 16:43 ч., Jonathan Ellis wrote:
>>>> Flush code didn't change between 0.7.0 and 0.7.2. There must be some
>>>> other variable here. Memory pressure maybe?
>>> Cloud you please elaborate on that one ?
>>> The conditions are exactly the same for the test with 0.7.0 and 0.7.2.
>>> By the way, 0.7.1 tests are similar to 0.7.2, while 0.7 early betas 
>>> to 0.7.0
>>> is fine.
>>>
>>> Ivan
>>>
>>> On 19.2.2011 г. 16:43 ч., Jonathan Ellis wrote:
>>>> Flush code didn't change between 0.7.0 and 0.7.2. There must be some
>>>> other variable here. Memory pressure maybe?
>>>>
>>>> On Sat, Feb 19, 2011 at 8:16 AM, Ivan Georgiev<yngwiie@bk.ru>    
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> I am testing 0.7.2 on a Windows 2003 x64 system(one node) and I am 
>>>>> having
>>>>> the following problem.
>>>>> My insertion speed is relatively slow, so the memtables do not get 
>>>>> full
>>>>> and
>>>>> the actual flushing is triggered by memtable_flush_after_mins, this
>>>>> happens
>>>>> on the hour mark. My problem with 0.7.2 is that when that happens, 
>>>>> the
>>>>> cpu
>>>>> spikes to 25% overall usage (x4 Xeon) and the operation takes 
>>>>> anywhere
>>>>> from
>>>>> 2 to 4 minutes, leaving the node not responding during that time. 
>>>>> This
>>>>> has
>>>>> forced to me to increase the rpc timeout option to beyond what i feel
>>>>> comfortable with.
>>>>> I have run multiple tests with 0.7.0 and 0.7.2 with the same 
>>>>> dataset and
>>>>> the
>>>>> results are consistent. During the same operation 0.7.0 takes 
>>>>> about 10
>>>>> seconds to complete vs. 2 to 4 minutes for 0.7.2. I am attaching a 
>>>>> log
>>>>> with
>>>>> the timestamps from one such flushing of 0.7.2. Please let me know if
>>>>> there
>>>>> is anything i can do to speed up and get results similar to 0.7.0.
>>>>>
>>>>> Regards:
>>>>> Ivan
>>>>>
>>>>> This is the log of the operations which took most time during the 
>>>>> flush
>>>>> operation. Using 0.7.0, with the same number of operations the 
>>>>> flushing
>>>>> takes less than 10 seconds.
>>>>>
>>>>> INFO 01:36:44,906 Enqueuing flush of
>>>>> Memtable-ArchiveFiles.6f776e65724944@1225856921(1619949 bytes, 34467
>>>>> operations)
>>>>> INFO 01:37:47,375 Completed flushing
>>>>> C:\Storage\data\Storage\ArchiveFiles.68617368-f-3-Data.db (5549187 
>>>>> bytes)
>>>>> INFO 01:37:47,375 Writing
>>>>> Memtable-ArchiveFiles.6e616d65@978152661(1619949
>>>>> bytes, 34467 operations)
>>>>> INFO 01:37:47,375 Enqueuing flush of
>>>>> Memtable-ArchiveFiles.706172656e74466f6c6465724944@2097700961(1619949
>>>>> bytes,
>>>>> 34467 operations)
>>>>> INFO 01:38:51,343 Completed flushing
>>>>> C:\Storage\data\Storage\ArchiveFiles.6e616d65-f-3-Data.db (3820265 
>>>>> bytes)
>>>>> INFO 01:38:51,343 Writing
>>>>> Memtable-ArchiveFiles.6f776e65724944@1225856921(1619949 bytes, 34467
>>>>> operations)
>>>>>
>>>>
>>>
>>
>>
>
>


Mime
View raw message