hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Todd Lipcon (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-3004) Implement Recovery Mode
Date Wed, 28 Mar 2012 06:02:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-3004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13240215#comment-13240215
] 

Todd Lipcon commented on HDFS-3004:
-----------------------------------

{code}
+    } catch (IOException e) {
+      return null;
{code}
This shows up a few places and makes me nervous. We should always at least LOG.warn an exception.

----
{code}
+            if (op.getTransactionId() > expectedTxId) { 
+              askOperator("There appears to be a gap in the edit log.  " +
+                  "We expected txid " + expectedTxId + ", but got txid " +
+                  op.getTransactionId() + ".", recovery, "skipping bad edit");
+            } else if (op.getTransactionId() < expectedTxId) { 
+              askOperator("There appears to be an out-of-order edit in " +
+                  "the edit log.  We expected txid " + expectedTxId +
+                  ", but got txid " + op.getTransactionId() + ".", recovery,
+                  "applying edits");
+              continue;
{code}
Am I reading this wrong, or is the logic backwards? It seems like the first prompt allows
you to say "continue, skipping bad edit", but doesn't call continue (thus applying the bad
edit). The second one gives you the choice "continue, applying edits" but *does* skip the
bad one.

----
{code}
+          catch (Throwable e) {
+            askOperator("Failed to apply edit log operation " +
+              op.getTransactionIdStr() + ": error " + e.getMessage(),
+              recovery, "applying edits");
{code}
Here we no longer output the stack trace of the exception anywhere. That's often a big problem
- we really need the exception trace to understand the root cause and to know whether we might
be able to skip it.

- I would also recommend printing the op.toString() itself in all of these cases, not just
the txids. That gives the operator something to base their decision on. I think we have a
reasonable stringification of all ops nowdays.

----
{code}
-          LOG.info(String.format("Log begins at txid %d, but requested start "
-              + "txid is %d. Skipping %d edits.", elf.getFirstTxId(), fromTxId,
-              transactionsToSkip));
-          elfis.skipTransactions(transactionsToSkip);
+        // TODO: add recovery mode override here as part of edit log failover
+        if (elfis.skipUntil(fromTxId) == false) {
{code}
Please leave the log message in place here. Also, in the failure log message here, I think
you should (a) throw EditLogInputException, and (b) include the file path here


----
{code}
+          /* Now that the operation has been successfully decoded and
+           * applied, update our bookkeeping. */
{code}
style: use // comments inline in code, not /* */ generally. There are a couple other places
where you should make the same fix.

----
- There are still a few spurious whitespace changes unrelated to your code here. Not a big
deal, but best to remove those hunks from your patch.

----

{code}
+   * Get the next valid operation from the stream storage
+   * 
+   * This is exactly like nextOp, except that we attempt to skip over damaged
+   * parts of the edit log
{code}
Please add '.'s at the end of the sentences in the docs. It's annoying, but the way JavaDoc
gets formatted, all the sentences will run together without line breaks in many cases, so
the punctuation's actually important for readability.
----

{code}
+   * @return        Returns true if we found a transaction ID greater than
+   *                'txid' in the log.
+   */
+  public boolean skipUntil(long txid) throws IOException {
{code}
The javadoc should say "greater than or equal to", right? Also, I think the doc should specify
explicitly: "the next call to readOp() will usually return the transaction with the specified
txid, unless the log contains a gap, in which case it may return a higher one."

----
{code}
+  private ThreadLocal <OpInstanceCache> cache = new ThreadLocal<OpInstanceCache>()
{
{code}
Style: no space between "ThreadLocal" and '<'. Also the line's a bit long (we try to keep
to 80 characters unless it's really hard not to)

----
{code}
+            if (op == null)
               break;
{code}
Style: we use {} braces even for one-line if statements. There are a few other examples of
this too.

{code}
+          }
+          catch (Throwable e) {
{code}
catch clause goes on same line as '}'

----
{code}
+    if (txid == HdfsConstants.INVALID_TXID) {
+      throw new RuntimeException("operation has no txid");
{code}
Use {{Preconditions.checkState()}} here for better readability

----
{code}
+        LOG.error("automatically choosing " + firstChoice);
{code}
this should probably be INFO or WARN
----

Style: please add a blank line between functions in RecoveryContext.java

----
TestNameNodeRecovery:
- do we need data nodes in these tests? if it's just metadata ops, 0 DNs should be fine
- move the EditLogTestSetup interface down lower in the file, just above where you define
the implementations
- extra space between {{interface}} and {{EditLogTestSetup}}
- please add blank lines between functions in that interface, and expand the javadoc out so
the {{/**}} line is on its own, like the rest of the codebase
- removeFromArray -- really this is {{markTxnIdFound}} where {{-1}} is used as a mark, right?
I think it might be much clearer to just use Set<Integer> here and remove the ints from
the set -- that's essentially what you're doing?
- some funky indentation in a few places
- rather than catching, logging, and calling fail(), you can just let the Throwable pass out
of the test case - that'll also fail the test case, with the advantage that the stack trace
of the failure becomes clickable in IDEs
- tip: you can use StringUtils.stirngifyException to get an exception's stack trace into a
String without StringWriter
- instead of the if (stream != null) checks, you can use IOUtils.closeStream, or IOUtils.cleanup,
which takes care of that for you

                
> Implement Recovery Mode
> -----------------------
>
>                 Key: HDFS-3004
>                 URL: https://issues.apache.org/jira/browse/HDFS-3004
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: tools
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>         Attachments: HDFS-3004.010.patch, HDFS-3004.011.patch, HDFS-3004.012.patch, HDFS-3004.013.patch,
HDFS-3004.015.patch, HDFS-3004.016.patch, HDFS-3004.017.patch, HDFS-3004.018.patch, HDFS-3004.019.patch,
HDFS-3004.020.patch, HDFS-3004.022.patch, HDFS-3004.023.patch, HDFS-3004.024.patch, HDFS-3004.026.patch,
HDFS-3004.027.patch, HDFS-3004.029.patch, HDFS-3004.030.patch, HDFS-3004.031.patch, HDFS-3004.032.patch,
HDFS-3004.033.patch, HDFS-3004__namenode_recovery_tool.txt
>
>
> When the NameNode metadata is corrupt for some reason, we want to be able to fix it.
 Obviously, we would prefer never to get in this case.  In a perfect world, we never would.
 However, bad data on disk can happen from time to time, because of hardware errors or misconfigurations.
 In the past we have had to correct it manually, which is time-consuming and which can result
in downtime.
> Recovery mode is initialized by the system administrator.  When the NameNode starts up
in Recovery Mode, it will try to load the FSImage file, apply all the edits from the edits
log, and then write out a new image.  Then it will shut down.
> Unlike in the normal startup process, the recovery mode startup process will be interactive.
 When the NameNode finds something that is inconsistent, it will prompt the operator as to
what it should do.   The operator can also choose to take the first option for all prompts
by starting up with the '-f' flag, or typing 'a' at one of the prompts.
> I have reused as much code as possible from the NameNode in this tool.  Hopefully, the
effort that was spent developing this will also make the NameNode editLog and image processing
even more robust than it already is.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message