hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-4047) BPServiceActor has nested shouldRun loops
Date Fri, 16 Nov 2012 21:45:13 GMT

    [ https://issues.apache.org/jira/browse/HDFS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13499163#comment-13499163 ]

Eli Collins commented on HDFS-4047:
-----------------------------------

That's correct. I captured that in the comments in the patch but not in the jira, sorry.
I should have called that out explicitly here.

{code}
-   * No matter what kind of exception we get, keep retrying to offerService().
-   * That's the loop that connects to the NameNode and provides basic DataNode
-   * functionality.
...
+   * Main loop for each BP thread. It retries on IOExceptions, only
+   * stops when "shouldRun" or "shouldServiceRun" are false, ie
+   * on shutdown or refreshNamenodes (or non-IOE).
{code}

My thinking from HDFS-2882 and HDFS-4201 is that we shouldn't soldier on in the case of an
RTE, e.g. an NPE due to a BP failing to initialize, as this likely indicates a host configuration
error. I could also see the point of view that the DN shouldn't stop running because one BP
failed, since the other may be alive and well. What do you think?
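
To make the trade-off concrete, here is a minimal, self-contained sketch of the behaviour described in the new javadoc: a single outer loop that retries on IOException, exits when the run flag is cleared, and stops on any non-IOE. It is not the code from the patch. The class name BpActorLoopSketch, the Service interface, and the AtomicBoolean flag are hypothetical stand-ins for BPServiceActor, offerService() and shouldRun().

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class BpActorLoopSketch {
  /** Hypothetical stand-in for one pass of offerService(). */
  interface Service {
    void offerService() throws IOException;
  }

  /**
   * Single outer loop (assumed shape, not the actual patch): retries on
   * IOException, returns once shouldRun is cleared, and lets any
   * RuntimeException propagate so this actor stops.
   * Returns the number of IOException retries, for illustration only.
   */
  static int run(AtomicBoolean shouldRun, Service service) {
    int retries = 0;
    while (shouldRun.get()) {
      try {
        service.offerService();
      } catch (IOException ioe) {
        retries++; // transient failure: go around the loop again
      }
      // RuntimeException is deliberately not caught here.
    }
    return retries;
  }

  public static void main(String[] args) {
    AtomicBoolean shouldRun = new AtomicBoolean(true);
    AtomicInteger calls = new AtomicInteger();

    // Two transient IOExceptions, then a simulated shutdown.
    int retries = run(shouldRun, () -> {
      int n = calls.incrementAndGet();
      if (n <= 2) {
        throw new IOException("transient failure " + n);
      }
      shouldRun.set(false);
    });
    System.out.println("retries=" + retries + " calls=" + calls.get());

    // A RuntimeException (e.g. an NPE from a failed init) ends the loop.
    shouldRun.set(true);
    try {
      run(shouldRun, () -> {
        throw new NullPointerException("BP failed to initialize");
      });
    } catch (RuntimeException e) {
      System.out.println("actor stopped: " + e.getMessage());
    }
  }
}
```

The alternative view, keeping the DN up when one BP fails, would amount to catching the RuntimeException per actor so that only that actor exits while the others keep running.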

                
> BPServiceActor has nested shouldRun loops
> -----------------------------------------
>
>                 Key: HDFS-4047
>                 URL: https://issues.apache.org/jira/browse/HDFS-4047
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 2.0.0-alpha
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>            Priority: Minor
>         Attachments: HADOOP-4047.patch, HDFS-4047.patch, hdfs-4047.txt, hdfs-4047.txt
>
>
> BPServiceActor#run and offerService both have while shouldRun loops. We only need the
> outer one, ie we can hoist the info log from offerService out to run and remove the while
> loop.
> {code}
> BPServiceActor#run:
> while (shouldRun()) {
>   try {
>     offerService();
>   } catch (Exception ex) {
> ...
> offerService:
> while (shouldRun()) {
>   try {
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
