cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10159) Incorrect last update time causes dtest to fail due to unexpected errors
Date Tue, 25 Aug 2015 07:52:45 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14710821#comment-14710821
] 

Stefania commented on CASSANDRA-10159:
--------------------------------------

I've attached the log file of the node that caused the problem on Jenkins. In this specific
case, the node failed to complete the transaction on restart because it checked the last update
time even when no files existed, which is what we fixed here. This highlights another problem,
however: if for any reason we cannot complete a transaction, chances are we won't be able
to list temporary files for this table either, regardless of the number of attempts. So my
idea of increasing MAX_ATTEMPTS is actually not required. This observation increases the importance
of CASSANDRA-10112. If we decide to carry on, corrupt log files should be stashed or removed.
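
For illustration only, here is a minimal sketch of the kind of check described above. It is not
the actual patch: the class and method names (LastUpdateCheck, lastUpdateTime, matchesRecord) are
hypothetical, and it assumes a record carries the expected update time in epoch milliseconds. It
relies on the documented behaviour of java.io.File.lastModified(), which returns 0 (i.e. Thu Jan
01 00:00:00 UTC 1970) when the file does not exist, matching the timestamp in the reported error.

```java
import java.io.File;
import java.util.List;

// Hypothetical helper, not taken from TransactionLog.java.
final class LastUpdateCheck {

    // Newest modification time among the files that exist, or 0 if none exist.
    static long lastUpdateTime(List<File> files) {
        long latest = 0L;
        for (File f : files) {
            if (f.exists()) {
                latest = Math.max(latest, f.lastModified());
            }
        }
        return latest;
    }

    // Assumption: a record is consistent if either no files are left on disk
    // (nothing to compare against) or the newest existing file matches the
    // update time stored in the record.
    static boolean matchesRecord(List<File> files, long recordedMillis) {
        boolean anyExists = false;
        for (File f : files) {
            if (f.exists()) {
                anyExists = true;
                break;
            }
        }
        if (!anyExists) {
            // Skip the timestamp comparison: lastModified() would return 0
            // for every missing file and look like disk corruption.
            return true;
        }
        return lastUpdateTime(files) == recordedMillis;
    }
}
```

With a check of this shape, a record such as REMOVE:[ma-2-big,1440329048000,8] whose files have
already been deleted would presumably no longer be reported as possible disk corruption.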

> Incorrect last update time causes dtest to fail due to unexpected errors
> ------------------------------------------------------------------------
>
>                 Key: CASSANDRA-10159
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10159
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Stefania
>            Assignee: Stefania
>             Fix For: 3.0.0 rc1
>
>         Attachments: node2.log
>
>
> Some dtests are failing as follows:
> http://cassci.datastax.com/job/cassandra-3.0_dtest/96/testReport/counter_tests/TestCounters/upgrade_test/
> {code}
> Unexpected error in node2 node log: ['ERROR [main] 2015-08-23 11:25:52,701 TransactionLog.java:246
- Possible disk corruption detected for sstable [ma-2-big], record [REMOVE:[ma-2-big,1440329048000,8]]:
last update time [Thu Jan 01 00:00:00 UTC 1970] should have been [Sun Aug 23 11:24:08 UTC
2015] ERROR [main] 2015-08-23 11:25:52,709 TransactionLog.java:992 - Possible disk corruption:
failed to read transaction log /mnt/tmp/dtest-E0OvQC/test/node2/data/system/local-7ad54392bcdd35a684174e047860b377/ma_txn_compaction_90eda9f0-4989-11e5-86bd-f32569933441.log
org.apache.cassandra.db.lifecycle.TransactionLog$CorruptTransactionLogException: Failed to
verify transaction 90eda9f0-4989-11e5-86bd-f32569933441 record [REMOVE:[ma-2-big,1440329048000,8]]:
possible disk corruption, aborting \tat org.apache.cassandra.db.lifecycle.TransactionLog$TransactionFile.readRecords(TransactionLog.java:349)
~[main/:na] \tat org.apache.cassandra.db.lifecycle.TransactionLog$TransactionData.readLogFile(TransactionLog.java:574)
~[main/:na] \tat org.apache.cassandra.db.lifecycle.TransactionLog.removeUnfinishedLeftovers(TransactionLog.java:988)
~[main/:na] \tat org.apache.cassandra.db.lifecycle.LifecycleTransaction.removeUnfinishedLeftovers(LifecycleTransaction.java:548)
[main/:na] \tat org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:584)
[main/:na] \tat org.apache.cassandra.service.StartupChecks$7.execute(StartupChecks.java:274)
[main/:na] \tat org.apache.cassandra.service.StartupChecks.verify(StartupChecks.java:103)
[main/:na] \tat org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:166)
[main/:na] \tat org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516)
[main/:na] \tat org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:622)
[main/:na] ERROR [main] 2015-08-23 11:25:52,710 TransactionLog.java:998 - Failed to remove
unfinished transaction leftovers org.apache.cassandra.db.lifecycle.TransactionLog$CorruptTransactionLogException:
Failed to verify transaction 90eda9f0-4989-11e5-86bd-f32569933441 record [REMOVE:[ma-2-big,1440329048000,8]]:
possible disk corruption, aborting \tat org.apache.cassandra.db.lifecycle.TransactionLog$TransactionFile.readRecords(TransactionLog.java:349)
~[main/:na] \tat org.apache.cassandra.db.lifecycle.TransactionLog$TransactionData.readLogFile(TransactionLog.java:574)
~[main/:na] \tat org.apache.cassandra.db.lifecycle.TransactionLog.removeUnfinishedLeftovers(TransactionLog.java:988)
~[main/:na] \tat org.apache.cassandra.db.lifecycle.LifecycleTransaction.removeUnfinishedLeftovers(LifecycleTransaction.java:548)
[main/:na] \tat org.apache.cassandra.db.ColumnFamilyStore.scrubDataDirectories(ColumnFamilyStore.java:584)
[main/:na] \tat org.apache.cassandra.service.StartupChecks$7.execute(StartupChecks.java:274)
[main/:na] \tat org.apache.cassandra.service.StartupChecks.verify(StartupChecks.java:103)
[main/:na] \tat org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:166)
[main/:na] \tat org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:516)
[main/:na] \tat 
> {code}
> My best guess is that, before reading the update time, we should check that the file actually exists.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
