incubator-cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: 0.7.2 slow memtables flushing
Date Tue, 22 Feb 2011 18:24:04 GMT
Absolutely right.  (So, it's really a write-time slowdown, not read-time.)

Created https://issues.apache.org/jira/browse/CASSANDRA-2218 for the fix.

Thanks a lot for tracking that down!

2011/2/22 Ivan Georgiev <yngwiie@bk.ru>:
> Hi, yes, you are absolutely right, I overlooked that.
> I am sending this directly as I don't want to pollute the mailing list with my
> guesses.
>
> The problem is that my dataset is exactly the same for 0.7.0 and 0.7.2, and my rows
> are not large at all.
> So in reality, reBuffer() should only read when a row sits on a boundary
> and spans two buffers.
> And the rows are small enough that more than 15 of them fit in a buffer. So I am
> sure that flushing a 50 MB file taking more than a minute
> means something is not right.
>
> I made a single change in BRAF.reBuffer(), but I am not sure if it is correct.
> Basically, I changed:
>
>  if (bufferOffset > channel.size())
>  {
>      buffer.rewind();
>      bufferEnd = bufferOffset;
>      hitEOF = true;
>
>      return 0;
>  }
>
> to
>
>  if (bufferOffset >= channel.size())
>  {
>      buffer.rewind();
>      bufferEnd = bufferOffset;
>      hitEOF = true;
>
>      return 0;
>  }
>
> This gives me the same performance as in 0.7.0. The flushes in 0.7.2 now
> take about 6 seconds
> which is consistent with 0.7.0.
>
> I made that change because I noticed that bytesRead was repeatedly coming back
> as -1 at the end of the method, which meant
> it had hit the end of file even on the first read. So I extended the test to
> include that case and avoid unnecessary reads when the buffer
> is at the end of file (it looks like those reads were the expensive ones).
> Please let me know whether that is correct or whether it breaks something.
> Again, sorry if I am wasting your time.
>
> Thanks in advance.
>
> Regards:
> Ivan
>
> On 21.2.2011 22:33, Jonathan Ellis wrote:
>>
>> If you look in that code, the bounds are checked on each write, and
>> reBuffer is called from there instead of from seek.
>>
>> On Mon, Feb 21, 2011 at 2:21 PM, Ivan Georgiev<yngwiie@bk.ru>  wrote:
>>>
>>> I meant what was tagged as 0.7.0; at least that is what I used in my
>>> 0.7.0 tests:
>>>
>>> http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.7.0/
>>>
>>> Ivan
>>>
>>> On 21.2.2011 22:12, ruslan usifov wrote:
>>>
>>> 2011/2/21 Ivan Georgiev<yngwiie@bk.ru>
>>>>
>>>> That is strange. In 0.7.0 I see this for seek:
>>>>
>>>> public void seek(long pos) throws IOException
>>>> {
>>>>     this.curr_ = pos;
>>>> }
>>>
>>> You are not looking at the 0.7.0 version; you are looking at a version from before
>>> cassandra/branches/cassandra-0.7@1052531 (2010-12-24 16:57:07 +0000 (8
>>> weeks ago))
>>>
>>>
>>
>>
>
>
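
For reference, the effect of the boundary change quoted above can be reproduced with a small standalone sketch. This is not the BufferedRandomAccessFile code: the class ReBufferSketch, its fields, and the helper readsAtEndOfFile are hypothetical names made up for illustration. It only counts how often channel.read() is actually issued when reBuffer() is called with the offset exactly at the end of the file, which is where an appending writer sits. It does not measure how expensive those reads are.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical, simplified stand-in for the reBuffer() logic discussed in the
// thread. It is NOT the real BufferedRandomAccessFile implementation.
class ReBufferSketch {
    final FileChannel channel;
    final ByteBuffer buffer = ByteBuffer.allocate(64);
    final boolean inclusiveCheck; // true: ">=" (patched), false: ">" (original)
    long bufferOffset;
    long bufferEnd;
    boolean hitEOF;
    int channelReads;             // how many times channel.read() was called

    ReBufferSketch(FileChannel channel, boolean inclusiveCheck) {
        this.channel = channel;
        this.inclusiveCheck = inclusiveCheck;
    }

    int reBuffer(long offset) throws IOException {
        bufferOffset = offset;
        long size = channel.size();
        boolean atEnd = inclusiveCheck ? bufferOffset >= size : bufferOffset > size;
        if (atEnd) {
            // Nothing to read: skip the channel entirely.
            buffer.rewind();
            bufferEnd = bufferOffset;
            hitEOF = true;
            return 0;
        }
        buffer.clear();
        channelReads++;
        // FileChannel.read(dst, position) returns -1 when position >= size.
        int bytesRead = channel.read(buffer, bufferOffset);
        hitEOF = bytesRead < 0;
        bufferEnd = bufferOffset + Math.max(bytesRead, 0);
        return Math.max(bytesRead, 0);
    }

    // Writes 100 bytes to a temp file, then calls reBuffer() with the offset
    // exactly at the end of the file and reports how many reads were issued.
    static int readsAtEndOfFile(boolean inclusiveCheck) {
        try {
            Path tmp = Files.createTempFile("rebuffer", ".bin");
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                ch.write(ByteBuffer.wrap(new byte[100]));
                ReBufferSketch sketch = new ReBufferSketch(ch, inclusiveCheck);
                sketch.reBuffer(ch.size());
                return sketch.channelReads;
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("reads with '>'  check: " + readsAtEndOfFile(false));
        System.out.println("reads with '>=' check: " + readsAtEndOfFile(true));
    }
}
```

With the original '>' test the sketch issues one read, which comes back as -1; with '>=' it issues none. That matches the observation in the quoted message that bytesRead was -1 even on the first read.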



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
