cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jonathan Ellis (JIRA)" <j...@apache.org>
Subject [jira] Commented: (CASSANDRA-2128) Corrupted Commit logs
Date Mon, 07 Feb 2011 19:59:03 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991556#comment-12991556
] 

Jonathan Ellis commented on CASSANDRA-2128:
-------------------------------------------

0.6 uses checksums too, for this area.  (The one place it doesn't is the Header, which I posted
a patch for in CASSANDRA-1815, but that doesn't seem to actually matter in practice.)

Here's the code in question:

{code}
                    long claimedCRC32;
                    byte[] bytes;
                    try
                    {
                        bytes = new byte[(int) reader.readLong()]; // readlong can throw EOFException
too
                        reader.readFully(bytes);
                        claimedCRC32 = reader.readLong();
                    }
                    catch (EOFException e)
                    {
                        // last CL entry didn't get completely written.  that's ok.
                        break;
                    }

                    ByteArrayInputStream bufIn = new ByteArrayInputStream(bytes);
                    Checksum checksum = new CRC32();
                    checksum.update(bytes, 0, bytes.length);
                    if (claimedCRC32 != checksum.getValue())
                    {
                        // this part of the log must not have been fsynced.  probably the
rest is bad too,
                        // but just in case there is no harm in trying them.
                        continue;
                    }

                    /* deserialize the commit log entry */
                    final RowMutation rm = RowMutation.serializer().deserialize(new DataInputStream(bufIn));
{code}

In other words, first we read the mutation into a buffer and checksum it.  If it passes, we
try to deserialize that into a mutation.

It's expected that read-into-buffer can throw an EOF (which we expect and catch) but once
it passes checksum it's not expected that mutation deserialize should fail.

(Yes, checksums aren't perfect, especially not 32bit ones, but for the checksum to generate
a false positive on the last entry in the commitlog, on two different machines, is not a coincidence
I'm buying.)

At first glance, the most likely explanation is a bug in RowMutation serializer.  But it's
also possible that I'm wrong and it really is erroring out due to some unflushed data somehow.
 If you turn on debug logging, it will include the offset in the commitlog of the mutation
being replayed -- if it's very very close to the end of the file, then that would increase
that likelihood.

bq. An ideal resolution would be if EOF is hit early, log something, but don't stop the startup.
Instead process everything that we have done so far, and keep going.

Disagreed.  This is a serious enough bug that we should require human intervention before
ignoring it and starting up anyway.

> Corrupted Commit logs
> ---------------------
>
>                 Key: CASSANDRA-2128
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2128
>             Project: Cassandra
>          Issue Type: Bug
>    Affects Versions: 0.6.11
>         Environment: cassandra-0.6 @ r1064246 (0.6.11)
> Ubuntu 9.10
> Rackspace Cloud
>            Reporter: Paul Querna
>            Assignee: Gary Dusbabek
>
> Two of our nodes had a hard failure.
> They both came up with a corrupted commit log.
> On startup we get this:
> {quote}
> 011-02-07_19:34:03.95124 INFO - Finished reading /var/lib/cassandra/commitlog/CommitLog-1297099954252.log
> 2011-02-07_19:34:03.95400 ERROR - Exception encountered during startup.
> 2011-02-07_19:34:03.95403 java.io.EOFException
> 2011-02-07_19:34:03.95403 	at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
> 2011-02-07_19:34:03.95404 	at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> 2011-02-07_19:34:03.95405 	at java.io.DataInputStream.readUTF(DataInputStream.java:547)
> 2011-02-07_19:34:03.95406 	at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:363)
> 2011-02-07_19:34:03.95407 	at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:318)
> 2011-02-07_19:34:03.95408 	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:240)
> 2011-02-07_19:34:03.95409 	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:172)
> 2011-02-07_19:34:03.95409 	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:115)
> 2011-02-07_19:34:03.95410 	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)
> 2011-02-07_19:34:03.95422 Exception encountered during startup.
> 2011-02-07_19:34:03.95436 java.io.EOFException
> 2011-02-07_19:34:03.95447 	at java.io.DataInputStream.readUnsignedShort(DataInputStream.java:323)
> 2011-02-07_19:34:03.95458 	at java.io.DataInputStream.readUTF(DataInputStream.java:572)
> 2011-02-07_19:34:03.95468 	at java.io.DataInputStream.readUTF(DataInputStream.java:547)
> 2011-02-07_19:34:03.95478 	at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:363)
> 2011-02-07_19:34:03.95489 	at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:318)
> 2011-02-07_19:34:03.95499 	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:240)
> 2011-02-07_19:34:03.95510 	at org.apache.cassandra.db.commitlog.CommitLog.recover(CommitLog.java:172)
> 2011-02-07_19:34:03.95521 	at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:115)
> 2011-02-07_19:34:03.95531 	at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:224)
> {quote}
> On node A, the commit log in question is 100mb.
> On node B, the commit log in question is 60mb.
> An ideal resolution would be if EOF is hit early, log something, but don't stop the startup.
 Instead process everything that we have done so far, and keep going.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message