cassandra-commits mailing list archives

From "Garvit Juniwal (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-12728) Handling partially written hint files
Date Thu, 23 Mar 2017 18:05:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15938918#comment-15938918
] 

Garvit Juniwal commented on CASSANDRA-12728:
--------------------------------------------

[~jjirsa] In my patch, the only exception that is ignored is the EOF error, so there is no
possibility of missing additional hints by ignoring it. From a cursory reading of https://github.com/apache/cassandra/blob/cassandra-3.9/src/java/org/apache/cassandra/db/commitlog/CommitLogReplayer.java#L390-L413,
it seems the commit log replayer already ignores errors caused by incomplete flushes (quoting:
"Ignoring commit log replay error likely due to incomplete flush to disk") regardless of any
operator policy. That is the right thing to do IMO, and it is what I am trying to achieve in
the patch as well. Let me know if I have misunderstood something.
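
To illustrate the idea of tolerating only EOF, here is a minimal sketch. It is not the attached patch and not Cassandra's HintsReader: the class name `TruncatedTailReader` and the record layout (a 4-byte length followed by the payload) are assumptions made up for this example. An `EOFException` is treated as the end of a partially written file, while any other `IOException` still propagates.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Cassandra's HintsReader or the attached patch.
final class TruncatedTailReader {
    // Assumed layout for this sketch: 4-byte big-endian length, then that many payload bytes.
    static List<byte[]> readRecords(InputStream in) throws IOException {
        List<byte[]> records = new ArrayList<>();
        DataInputStream data = new DataInputStream(in);
        while (true) {
            try {
                int length = data.readInt();
                if (length < 0) {
                    // Not an EOF condition, so it is reported rather than ignored.
                    throw new IOException("negative record length: " + length);
                }
                byte[] payload = new byte[length];
                data.readFully(payload);
                records.add(payload);
            } catch (EOFException e) {
                // The stream ended at a record boundary or in the middle of a record.
                // Every complete record before this point has already been collected,
                // so stopping here drops only the partially written tail.
                return records;
            }
        }
    }
}
```

Because the catch clause names `EOFException` specifically, other read failures are not swallowed; only the incomplete final record is skipped.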

> Handling partially written hint files
> -------------------------------------
>
>                 Key: CASSANDRA-12728
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-12728
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Sharvanath Pathak
>              Labels: lhf
>         Attachments: CASSANDRA-12728.patch
>
>
> {noformat}
> ERROR [HintsDispatcher:1] 2016-09-28 17:44:43,397 HintsDispatchExecutor.java:225 - Failed to dispatch hints file d5d7257c-9f81-49b2-8633-6f9bda6e3dea-1474892654160-1.hints: file is corrupted ({})
> org.apache.cassandra.io.FSReadError: java.io.EOFException
>         at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:282) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:252) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatcher.sendHints(HintsDispatcher.java:156) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatcher.sendHintsAndAwait(HintsDispatcher.java:137) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:119) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatcher.dispatch(HintsDispatcher.java:91) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.deliver(HintsDispatchExecutor.java:259) [apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:242) [apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.dispatch(HintsDispatchExecutor.java:220) [apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsDispatchExecutor$DispatchHintsTask.run(HintsDispatchExecutor.java:199) [apache-cassandra-3.0.6.jar:3.0.6]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_77]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_77]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_77]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_77]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_77]
> Caused by: java.io.EOFException: null
>         at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:68) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.ChecksummedDataInput.readFully(ChecksummedDataInput.java:126) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsReader$BuffersIterator.readBuffer(HintsReader.java:310) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNextInternal(HintsReader.java:301) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         at org.apache.cassandra.hints.HintsReader$BuffersIterator.computeNext(HintsReader.java:278) ~[apache-cassandra-3.0.6.jar:3.0.6]
>         ... 15 common frames omitted
> {noformat}
> We found that the hint file was truncated because there was a hard reboot around the time
> of the last write to the file. I think we basically need to handle partially written hint
> files. Also, the CRC file does not exist in this case (probably because it crashed while
> writing the hints file). Maybe ignoring and cleaning up such partially written hint files
> would be a way to fix this?
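
As a rough sketch of the cleanup idea in the description, the following lists hints files that have no companion checksum file in the same directory. The class name `OrphanHintFiles` is hypothetical, and the checksum file extension is passed in as a parameter because the naming of the CRC file is an assumption here, not something stated in this issue. The sketch only reports candidates; it deletes nothing.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Cassandra.
final class OrphanHintFiles {
    private static final String HINTS_EXT = ".hints";

    // Returns every *.hints file in dir that lacks a sibling with the same base name
    // and the given checksum extension (an assumed naming convention, e.g. ".crc32").
    static List<Path> withoutChecksum(Path dir, String checksumExt) throws IOException {
        List<Path> orphans = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "*" + HINTS_EXT)) {
            for (Path hints : stream) {
                String name = hints.getFileName().toString();
                String base = name.substring(0, name.length() - HINTS_EXT.length());
                if (!Files.exists(dir.resolve(base + checksumExt))) {
                    orphans.add(hints);
                }
            }
        }
        return orphans;
    }
}
```

Whether a file found this way should be skipped, dispatched up to its last complete hint, or removed is a policy question that this sketch leaves open.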



