cassandra-commits mailing list archives

From "Matthew F. Dennis (JIRA)" <j...@apache.org>
Subject [jira] Updated: (CASSANDRA-1179) split commitlog into header + mutations files
Date Wed, 16 Jun 2010 19:20:26 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-1179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthew F. Dennis updated CASSANDRA-1179:
-----------------------------------------

    Attachment: trunk-1179-v4.txt

{quote}
made some minor changes, primarily using BRAF in writeCommitLogHeader (you don't get buffering
w/ raw FileOutputStream, and BRAF is simpler than doing the FOS/BufferedOutputStream/FileChannel
dance).
{quote}

FOS doesn't sync on flush/close, and as headers are "optional" now there is no reason to waste
the IO.  Just to be sure I was remembering this correctly, I just now tested it.  It provides
an 80+% improvement over BRAF, even more on a heavily loaded system.  This was clearly a failure
on my part to document it as such.  The header is so small (56 bytes I think) that the OS will
cache it just fine, and not using buffered output avoids both the memcopies and the GC from
the buffers.
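
To illustrate the unbuffered approach, here is a minimal sketch (not the code from the attached patch). The class name, method name and the HEADER_SIZE constant are hypothetical; the 56-byte figure is only the estimate given above. It writes the whole header with one call to a raw FileOutputStream and never asks for a sync.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;

public class HeaderWriteSketch {
    // Hypothetical header size, taken from the "56 bytes I think" estimate.
    static final int HEADER_SIZE = 56;

    // Overwrites the header file with a single unbuffered write.
    // No fsync is requested; the OS page cache absorbs the small write.
    static void writeHeader(File headerFile, byte[] header) throws IOException {
        try (FileOutputStream out = new FileOutputStream(headerFile)) {
            out.write(header);
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("commitlog", ".header");
        f.deleteOnExit();
        // Placeholder header contents; the real layout is not shown here.
        byte[] header = ByteBuffer.allocate(HEADER_SIZE).putLong(42L).array();
        writeHeader(f, header);
        System.out.println("header bytes on disk: " + Files.size(f.toPath()));
    }
}
```

Because the header is handed over in one write call, an intermediate buffer has nothing to coalesce, which is the reasoning behind skipping it.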

{quote}
todo: still needs to delete the .headers after a successful replay as well as the .log.
{quote}

thank you, I hadn't realized there were two places the logs were getting removed.  Done.

{quote}
I'd rather fix BRAF to generate correct EOFExceptions in case other code runs into this. (And
by removing the EOFException check, we introduce a new bug that if the size int is incomplete,
we die again.)

(actually BRAF.read should be returning -1, so that RAF.readFully throws EOFException) 
{quote}

It was not at EOF; the buffer the data was supposed to be written into was zero length.  There
was data in the file, but nowhere to write it in the buffer (because the size read was 0,
new byte[size] resulted in a zero length array that was then supposed to be filled by BRAF.readFully).

I've added tests to catch this problem (as well as other related ones) and also changed BRAF
to throw a more reasonable exception (but not EOF).  I believe BRAF.readFully will already
throw EOF if it is at the end of the file.

The size of the log entry is now CRCed on its own.  While testing with random garbage at
the end of a commit log, I had written a really large int to the size field which resulted
in recover() trying to allocate a massive byte[] and getting OOM.
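
As a rough sketch of the idea (not the patch itself), the size field can carry its own checksum so that a garbage size is rejected before any buffer is allocated. The on-disk layout, class name and method names below are assumptions made for illustration, as is the choice to treat a non-positive size as invalid.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

public class SizeChecksumSketch {
    // CRC32 over the four big-endian bytes of the size field.
    static long checksumOf(int size) {
        CRC32 crc = new CRC32();
        crc.update(ByteBuffer.allocate(4).putInt(size).array());
        return crc.getValue();
    }

    // Assumed layout: [int size][long crc-of-size][payload bytes].
    static void writeEntry(DataOutputStream out, byte[] payload) throws IOException {
        out.writeInt(payload.length);
        out.writeLong(checksumOf(payload.length));
        out.write(payload);
    }

    // Returns the payload, or null when the size field cannot be trusted.
    // The size is validated before new byte[size], so a garbage value
    // cannot trigger a huge allocation.
    static byte[] readEntry(DataInputStream in) throws IOException {
        int size = in.readInt();
        long storedCrc = in.readLong();
        if (size <= 0 || checksumOf(size) != storedCrc) {
            return null; // assumption: treat as the end of the usable log
        }
        byte[] payload = new byte[size];
        in.readFully(payload);
        return payload;
    }
}
```

A reader built this way stops at the first entry whose size fails its checksum instead of trusting whatever four bytes happen to be at the tail of the file.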

{quote}
by removing the EOFException check, we introduce a new bug that if the size int is incomplete,
we die again.
{quote}

good catch.  I have no idea WTF I was thinking, there was even a comment that warned about
it that got removed when the try/catch was removed.  I was probably trying to test something
and removed it so it'd spew but forgot to put it back.
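
For reference, a minimal sketch of the kind of check being discussed, using only java.io rather than BRAF. The class and method names are hypothetical and the entry layout (an int size followed by the payload) is a simplification. An EOFException while reading the size int, or while filling the payload, is treated as the end of the replayable log rather than as a fatal error.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;

public class TruncatedTailSketch {
    // Counts the complete entries in a log image. Checksum validation of
    // the size field is deliberately left out to keep the sketch short.
    static int countCompleteEntries(byte[] log) throws IOException {
        int count = 0;
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(log))) {
            while (true) {
                int size = in.readInt();    // throws EOFException on a partial int
                if (size <= 0) {
                    break;                  // assumption: non-positive size ends replay
                }
                byte[] payload = new byte[size];
                in.readFully(payload);      // throws EOFException on a partial payload
                count++;
            }
        } catch (EOFException e) {
            // Clean end of file or a partially written tail: stop replaying.
        }
        return count;
    }
}
```

With this shape, a log that ends two bytes into a size int yields the entries before it and then stops quietly.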


> split commitlog into header + mutations files
> ---------------------------------------------
>
>                 Key: CASSANDRA-1179
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-1179
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Jonathan Ellis
>            Assignee: Matthew F. Dennis
>             Fix For: 0.7
>
>         Attachments: 1179-v2.txt, trunk-1179-v3.txt, trunk-1179-v4.txt, trunk-1179.txt
>
>
> As mentioned in CASSANDRA-1119, it seems possible that a commitlog header could be corrupted
by a power loss during update of the header, post-flush.  We could try to make it more robust
(by writing the size of the commitlogheader first, and skipping to the end if we encounter
corruption) but it seems to me that the most foolproof method would be to split the log into
two files: the header, which we'll overwrite, and the data, which is truly append only.
If the header is corrupt on replay, we just replay the data from the beginning; the header allows
us to avoid replaying data redundantly, but it's strictly an optimization and not required
for correctness.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

