hadoop-hdfs-issues mailing list archives

From "Chao Sun (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13898) Throw retriable exception for getBlockLocations when ObserverNameNode is in safemode
Date Thu, 20 Sep 2018 16:52:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622334#comment-16622334 ]

Chao Sun commented on HDFS-13898:
---------------------------------

Thanks [~xkrogen] for the explanation. I like the idea of having a particular exception
trigger ORPP to go directly to the active - this could be useful for cases like HDFS-13924.
For this particular scenario (observer in safemode), though, I think it's fine, since I assume
safemode normally only happens while the observer is starting up, which should not be very
common. Also, safemode could last for quite a while, and in my experience the chance of
RPCs hitting this error is quite high, so it might be better to have all clients redirect to
a different observer anyway.
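
To illustrate the redirect behavior discussed above, here is a minimal, self-contained sketch. It is not the actual ORPP ({{ObserverReadProxyProvider}}) code: the {{NameNode}} interface, the {{RetriableException}} stand-in, and the {{read}} helper are all hypothetical names made up for this example, and the final fallback to the active is an assumption.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class ObserverRedirectSketch {
  /** Hypothetical stand-in for a retriable server-side error. */
  static class RetriableException extends IOException {
    RetriableException(String msg) { super(msg); }
  }

  /** Hypothetical, simplified view of a NameNode that can serve a read. */
  interface NameNode {
    String getBlockLocations(String src) throws IOException;
  }

  /**
   * Tries each observer in order. A retriable failure (e.g. the observer is
   * in safemode) moves on to the next observer; if every observer fails that
   * way, the read falls back to the active (an assumption of this sketch).
   */
  static String read(List<NameNode> observers, NameNode active, String src)
      throws IOException {
    for (NameNode observer : observers) {
      try {
        return observer.getBlockLocations(src);
      } catch (RetriableException e) {
        // This observer cannot serve the read right now; try the next one.
      }
    }
    return active.getBlockLocations(src);
  }

  public static void main(String[] args) throws IOException {
    NameNode inSafeMode = src -> {
      throw new RetriableException("observer in safemode");
    };
    NameNode healthy = src -> "locations of " + src + " from healthy observer";
    NameNode active = src -> "locations of " + src + " from active";
    System.out.println(
        read(Arrays.asList(inSafeMode, healthy), active, "/tmp/file"));
  }
}
```

In this sketch a non-retriable exception from an observer propagates to the caller unchanged, which is why the server needs to wrap the safemode error for the redirect to happen at all.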

Regarding the v002 patch, how about changing the test name to {{testObserverNodeSafeModeWithBlockLocations}}?
{{testObserverNodeSafeModeWithoutBlockLocations}} seems a little confusing to me, since we
are testing the safe mode case with {{getBlockLocations}} calls. About the {{HAState}} change,
there was no particular reason except that I wanted to make the lines shorter :) I'm perfectly
fine with changing it back.

Will fix the style issues too.

> Throw retriable exception for getBlockLocations when ObserverNameNode is in safemode
> ------------------------------------------------------------------------------------
>
>                 Key: HDFS-13898
>                 URL: https://issues.apache.org/jira/browse/HDFS-13898
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>            Priority: Major
>         Attachments: HDFS-13898-HDFS-12943.000.patch, HDFS-13898-HDFS-12943.001.patch, HDFS-13898-HDFS-12943.002.patch
>
>
> When ObserverNameNode is in safe mode, {{getBlockLocations}} may throw a safe mode exception if a block of the given file doesn't have any locations yet.
> {code}
>     try {
>       checkOperation(OperationCategory.READ);
>       res = FSDirStatAndListingOp.getBlockLocations(
>           dir, pc, srcArg, offset, length, true);
>       if (isInSafeMode()) {
>         for (LocatedBlock b : res.blocks.getLocatedBlocks()) {
>           // if safemode & no block locations yet then throw safemodeException
>           if ((b.getLocations() == null) || (b.getLocations().length == 0)) {
>             SafeModeException se = newSafemodeException(
>                 "Zero blocklocations for " + srcArg);
>             if (haEnabled && haContext != null &&
>                 haContext.getState().getServiceState() == HAServiceState.ACTIVE) {
>               throw new RetriableException(se);
>             } else {
>               throw se;
>             }
>           }
>         }
>       }
> {code}
> It only throws {{RetriableException}} for the active NN, so requests on an observer may just fail.
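
One way to picture the change asked for in the quoted description is to treat the observer state like the active state when deciding whether to wrap the safe mode error. The sketch below is only an illustration under that assumption, not the contents of any attached patch: {{ServiceState}}, {{SafeModeException}}, {{RetriableException}}, and {{safeModeFailure}} are hypothetical stand-ins defined locally, not the Hadoop classes of the same names.

```java
import java.io.IOException;

public class SafeModeRetrySketch {
  /** Hypothetical stand-in for the HA service state of a NameNode. */
  enum ServiceState { ACTIVE, STANDBY, OBSERVER }

  /** Hypothetical stand-in for the safe mode error. */
  static class SafeModeException extends IOException {
    SafeModeException(String msg) { super(msg); }
  }

  /** Hypothetical stand-in: wraps a cause to signal that the client may retry. */
  static class RetriableException extends IOException {
    RetriableException(Exception cause) { super(cause); }
  }

  /**
   * Builds the exception to throw when a block has no locations in safe mode.
   * Assumption of this sketch: both ACTIVE and OBSERVER get a retriable
   * wrapper, while any other state gets the plain safe mode error.
   */
  static IOException safeModeFailure(
      boolean haEnabled, ServiceState state, String src) {
    SafeModeException se =
        new SafeModeException("Zero blocklocations for " + src);
    boolean retriable = haEnabled
        && (state == ServiceState.ACTIVE || state == ServiceState.OBSERVER);
    return retriable ? new RetriableException(se) : se;
  }

  public static void main(String[] args) {
    // Print which exception type each state would produce with HA enabled.
    for (ServiceState s : ServiceState.values()) {
      IOException e = safeModeFailure(true, s, "/tmp/file");
      System.out.println(s + " -> " + e.getClass().getSimpleName());
    }
  }
}
```

Returning the exception instead of throwing it keeps the decision easy to test in isolation; the caller would simply {{throw}} the result.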



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

