hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4451) hdfs balancer exits 1 on success, it should exit 0
Date Wed, 30 Jan 2013 07:13:14 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566244#comment-13566244
] 

Chris Nauroth commented on HDFS-4451:
-------------------------------------

Thanks, Joshua.  I applied the patch to trunk and ran these test suites successfully: TestBalancer,
TestBalancerWithEncryptedTransfer, TestBalancerWithHANameNodes, TestBalancerWithMultipleNameNodes,
TestBalancerWithNodeGroup, TestBalancerBandwidth.  Here are a few comments.

1. It appears that it is not possible for the tool to exit with IN_PROGRESS.  We have this
code in the static {{Balancer#run}} method:

{code}
boolean done = false;
for(int iteration = 0; !done; iteration++) {
  done = true;
  Collections.shuffle(connectors);
  for(NameNodeConnector nnc : connectors) {
    final Balancer b = new Balancer(nnc, p, conf);
    final ReturnStatus r = b.run(iteration, formatter, conf);
    // clean all lists
    b.resetData(conf);
    if (r == ReturnStatus.IN_PROGRESS) {
      done = false;
    } else if (r != ReturnStatus.SUCCESS) {
      //must be an error statue, return.
      return r.code;
    }
  }

  if (!done) {
    Thread.sleep(sleeptime);
  }
}
{code}

This loop will not terminate while return status is IN_PROGRESS, so I expect we would never
exit the CLI with IN_PROGRESS.  If that's the case, then we wouldn't need to map IN_PROGRESS
to a successful exit, and {{toExitCode}} could simplify to something like code == 1 ? 0 :
1.

2. Actually, I wonder if the simpler change is just to swap the codes in the enum, so that
it's SUCCESS(0) and IN_PROGRESS(1).  Then, we wouldn't need {{toExitCode}}.  If callers never
really see the IN_PROGRESS exit code (for the reasons stated above) and if we're already declaring
this change backwards-incompatible, then this would be simpler.

3. Do you want to add a test that calls {{twoNodeTest(conf, true)}}?  I see that the {{useTool}}
flag has been introduced in the method signature, but I didn't find any tests that set the
flag to true.

{code}
-  private void twoNodeTest(Configuration conf) throws Exception {
+  private void twoNodeTest(Configuration conf, boolean useTool) throws Exception {
{code}

                
> hdfs balancer exits 1 on success, it should exit 0
> --------------------------------------------------
>
>                 Key: HDFS-4451
>                 URL: https://issues.apache.org/jira/browse/HDFS-4451
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 2.0.2-alpha
>         Environment: Centos 6.3, JDK 1.6.0_25
>            Reporter: Joshua Blatt
>         Attachments: HDFS-4451.patch
>
>
> Though the org.apache.hadoop.util.Tool interface javadocs indicate  implementations should
return 0 on success, the org.apache.hadoop.hdfs.server.balance.Balancer.Cli implementation
returns the int values of this enum instead:
>   // Exit status
>   enum ReturnStatus {
>     SUCCESS(1),
>     IN_PROGRESS(0),
>     ALREADY_RUNNING(-1),
>     NO_MOVE_BLOCK(-2),
>     NO_MOVE_PROGRESS(-3),
>     IO_EXCEPTION(-4),
>     ILLEGAL_ARGS(-5),
>     INTERRUPTED(-6);
> This created an issue for us when we tried to run the hdfs balancer as a cron job.  
Cron sends emails whenever a executable it runs exits non-zero.   We'd either have to disable
all emails and miss real issues or fix this bug.
> I think both SUCCESS and IN_PROGRESS ReturnStatuses should lead to exit 0.
> Marking this change as incompatible because existing scripts which interpret exit 1 as
success will be broken (unless they defensively/liberally interpret both exit 1 and exit 0
as success).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message