cassandra-commits mailing list archives

From "Jeff Jirsa (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-11995) Commitlog replaced with all NULs
Date Fri, 17 Mar 2017 21:26:41 GMT


Jeff Jirsa updated CASSANDRA-11995:
    Status: Patch Available  (was: In Progress)

|| branch || utests || dtests ||
| [3.0|] | [testall|] | [dtest|] |
| [3.11|] | [testall|] | [dtest|] |
| [trunk|] | [testall|] | [dtest|] |

Note to reviewer: [~aweisberg] and I talked about this offline a bit, and one of the things
worth questioning is "how do we even get in this position". It seems like there may be a window
after [CommitLogDescriptor.writeHeader|]
is called where we don't actually sync. Short of a system reboot, we should still have
the data in memory and the kernel should keep it consistent. However, if we're crashing for
some other reason, we could certainly have an all-0 file, which will fail to replay. We may
want to open up a subsequent JIRA to address that particular problem, but we see it as
distinct from the replay problem.

This patch, then, is only dealing with the problem of replaying the final all-0 file, which
we consider to be a change in behavior from 2.x. Continuing to replay a "corrupt" all-null
file is the 2.x behavior, and presumably should only be allowed if we're the last segment, which
we already explicitly tolerate in the rest of that segment via the {{tolerateTruncation}} flag.
This patch just makes {{tolerateTruncation}} also tolerate truncation of the header without
interrupting replay and startup.
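
As a rough illustration of the intent (not the patch itself), the decision could be sketched as below. The class {{HeaderTruncationSketch}}, the {{Action}} enum, the {{descriptorParsed}} parameter, and the explicit all-zero check are hypothetical and assumed only for this example; just the {{tolerateTruncation}} flag comes from the replayer.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch; this is NOT the actual CommitLogReplayer code.
final class HeaderTruncationSketch {

    enum Action { REPLAY, SKIP, FAIL }

    /** Returns true if every remaining byte in the buffer is zero (the buffer position is left untouched). */
    static boolean isAllZero(ByteBuffer header) {
        ByteBuffer view = header.duplicate(); // independent position, shared content
        while (view.hasRemaining()) {
            if (view.get() != 0) {
                return false;
            }
        }
        return true;
    }

    /**
     * Decides what to do with a segment once its header bytes have been read.
     *
     * @param header             raw header bytes of the segment
     * @param descriptorParsed   assumed to be true when a valid descriptor was read
     * @param tolerateTruncation assumed to be true only for the last segment
     */
    static Action onHeaderRead(ByteBuffer header, boolean descriptorParsed, boolean tolerateTruncation) {
        if (descriptorParsed) {
            return Action.REPLAY;
        }
        // A zeroed header on the last segment is treated like any other
        // truncation: skip the segment instead of aborting startup.
        if (tolerateTruncation && isAllZero(header)) {
            return Action.SKIP;
        }
        return Action.FAIL;
    }
}
```

In this sketch a zeroed header on any segment other than the last still fails, which matches the idea that only the final segment is allowed to be truncated.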

> Commitlog replaced with all NULs
> --------------------------------
>                 Key: CASSANDRA-11995
>                 URL:
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Core
>         Environment: Windows 10 Enterprise 1511
> DataStax Cassandra Community Server 2.2.3
>            Reporter: James Howe
>            Assignee: Jeff Jirsa
> I noticed this morning that Cassandra was failing to start, after being shut down on
> {code}
> ERROR 09:13:37 Exiting due to error while processing commit log during initialization.
> org.apache.cassandra.db.commitlog.CommitLogReplayer$CommitLogReplayException: Could not read commit log descriptor in file C:\Program Files\DataStax Community\data\commitlog\CommitLog-5-1465571056722.log
> 	at org.apache.cassandra.db.commitlog.CommitLogReplayer.handleReplayError(
> 	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(
> 	at org.apache.cassandra.db.commitlog.CommitLogReplayer.recover(
> 	at org.apache.cassandra.db.commitlog.CommitLog.recover( [apache-cassandra-2.2.3.jar:2.2.3]
> 	at org.apache.cassandra.db.commitlog.CommitLog.recover( [apache-cassandra-2.2.3.jar:2.2.3]
> 	at org.apache.cassandra.service.CassandraDaemon.setup( [apache-cassandra-2.2.3.jar:2.2.3]
> 	at org.apache.cassandra.service.CassandraDaemon.activate( [apache-cassandra-2.2.3.jar:2.2.3]
> 	at org.apache.cassandra.service.CassandraDaemon.main( [apache-cassandra-2.2.3.jar:2.2.3]
> {code}
> Checking the referenced file reveals it comprises 33,554,432 (32 * 1024 * 1024) NUL bytes.
> No logs (stdout, stderr, prunsrv) from the shutdown show any other issues and appear exactly as normal.
> Cassandra is installed as a service via DataStax's distribution.
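
For anyone wanting to repeat the check described in the report, a minimal sketch of scanning a segment file for non-NUL bytes follows. The class name {{NulFileCheck}} and its method are hypothetical and not part of Cassandra; note that an empty file also counts as all-NUL in this sketch.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of Cassandra.
final class NulFileCheck {

    /** Returns true if the file contains no byte other than 0x00. */
    static boolean isAllNul(Path file) throws IOException {
        try (InputStream in = new BufferedInputStream(Files.newInputStream(file))) {
            byte[] buf = new byte[64 * 1024];
            int n;
            // Read in chunks and stop at the first non-zero byte.
            while ((n = in.read(buf)) != -1) {
                for (int i = 0; i < n; i++) {
                    if (buf[i] != 0) {
                        return false;
                    }
                }
            }
        }
        return true;
    }
}
```

It could be called as {{NulFileCheck.isAllNul(Paths.get(...))}} with the path of the segment named in the error message.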

This message was sent by Atlassian JIRA
