hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-11620) Record the class name of Writer in WAL header so that only proper Reader can open the WAL file
Date Fri, 01 Aug 2014 16:53:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-11620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082486#comment-14082486 ]

Andrew Purtell commented on HBASE-11620:
----------------------------------------

+1, patch v6

Please update or remove the comments in testSecureHLogReaderOnHLog on commit, and fix the
assert messages. The meaning of all the checks is reversed, but the text hasn't been updated
to reflect that.


> Record the class name of Writer in WAL header so that only proper Reader can open the WAL file
> ----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-11620
>                 URL: https://issues.apache.org/jira/browse/HBASE-11620
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.98.4
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.99.0, 0.98.5, 2.0.0
>
>         Attachments: 11620-0.98-v6.txt, 11620-v1.txt, 11620-v2.txt, 11620-v3.txt, 11620-v4.txt, 11620-v5.txt, 11620-v6.txt, 11620-v6.txt
>
>
> Reported by Kiran in this thread: "HBase file encryption, inconsistencies observed and data loss"
> After step 4 (i.e., disabling WAL encryption, removing SecureProtobufReader/Writer, and restarting), reading the encrypted WAL fails, mainly due to an EOF exception in BaseDecoder. This is not treated as an error, and these WALs are moved to /oldWALs.
> Following is observed in log files:
> {code}
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Splitting hlog: hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017, length=172
> 2014-07-30 19:44:29,254 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: DistributedLogReplay = false
> 2014-07-30 19:44:29,313 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: Recovering lease on dfs file hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017
> 2014-07-30 19:44:29,315 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 after 1ms
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-0,5,main]: starting
> 2014-07-30 19:44:29,429 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-1,5,main]: starting
> 2014-07-30 19:44:29,430 DEBUG [RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2] wal.HLogSplitter: Writer thread Thread[RS_LOG_REPLAY_OPS-HOST-16:15264-1-Writer-2,5,main]: starting
> 2014-07-30 19:44:29,591 ERROR [RS_LOG_REPLAY_OPS-HOST-16:15264-1] codec.BaseDecoder: Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Finishing writing output logs and closing down.
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Waiting for split writer threads to finish
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Split writers finished
> 2014-07-30 19:44:29,592 INFO  [RS_LOG_REPLAY_OPS-HOST-16:15264-1] wal.HLogSplitter: Processed 0 edits across 0 regions; log file=hdfs://HOST-16:18020/hbase/WALs/HOST-16,15264,1406725441997-splitting/HOST-16%2C15264%2C1406725441997.1406725444017 is corrupted = false progress failed = false
> {code}
> To fix this, we need to propagate the EOF exception to HLogSplitter. Any suggestions on the fix?
> -------- (end of quote from Kiran)
> In BaseDecoder#rethrowEofException() :
> {code}
>     if (!isEof) throw ioEx;
>     LOG.error("Partial cell read caused by EOF: " + ioEx);
>     EOFException eofEx = new EOFException("Partial cell read");
>     eofEx.initCause(ioEx);
>     throw eofEx;
> {code}
> Throwing EOFException does not propagate the "Partial cell read" condition to HLogSplitter, which doesn't treat EOFException as an error.
> I think IOException should be thrown above instead; HLogSplitter#getNextLogLine() would then translate the IOException to CorruptedLogFileException.
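
To illustrate the point in the description above, here is a minimal, self-contained sketch of why the exception type matters. It does not use any HBase classes: `EofPropagationSketch`, `Outcome`, `EntryReader`, `wrapAsEof`, `wrapAsIo`, and `readOne` are hypothetical names invented for this example, and `readOne` only assumes, as the description states, that the splitter treats EOFException as a normal end of file and translates any other IOException into a corruption report.

```java
import java.io.EOFException;
import java.io.IOException;

class EofPropagationSketch {
    /** Hypothetical outcome a splitter-like caller records for one read. */
    enum Outcome { CLEAN_END, CORRUPTED }

    /** Hypothetical reader: returns the next entry or throws. */
    interface EntryReader {
        String next() throws IOException;
    }

    /** Mirrors the quoted snippet: the partial read is rethrown as EOFException. */
    static IOException wrapAsEof(IOException ioEx) {
        EOFException eofEx = new EOFException("Partial cell read");
        eofEx.initCause(ioEx);
        return eofEx;
    }

    /** The alternative suggested in the issue: rethrow as a plain IOException. */
    static IOException wrapAsIo(IOException ioEx) {
        return new IOException("Partial cell read", ioEx);
    }

    /**
     * Assumed caller behavior (a simplification, not HLogSplitter itself):
     * EOFException means "end of file reached", anything else means corruption.
     */
    static Outcome readOne(EntryReader reader) {
        try {
            reader.next();
            return Outcome.CLEAN_END;
        } catch (EOFException eof) {
            return Outcome.CLEAN_END;   // partial read is silently accepted
        } catch (IOException ioe) {
            return Outcome.CORRUPTED;   // partial read is surfaced as an error
        }
    }

    public static void main(String[] args) {
        IOException cause = new IOException("Premature EOF from inputStream");
        System.out.println(readOne(() -> { throw wrapAsEof(cause); })); // CLEAN_END
        System.out.println(readOne(() -> { throw wrapAsIo(cause); }));  // CORRUPTED
    }
}
```

Under these assumptions, the same underlying "Premature EOF" failure is either swallowed or reported depending only on which exception type the decoder chooses to throw, which matches the "is corrupted = false" line in the log above.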



--
This message was sent by Atlassian JIRA
(v6.2#6252)
