hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5496) Make replication queue initialization asynchronous
Date Tue, 10 Dec 2013 22:18:08 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844737#comment-13844737 ]

Jing Zhao commented on HDFS-5496:
---------------------------------

Another question: currently we call processMisReplicateBlocks when 1) starting the active
service, 2) leaving safemode, and 3) before leaving safemode, if blockReplQueueThreshold is
met. Specifically, with or without an HA setup, we call processMisReplicateBlocks in the
following cases:

# NN is not in safemode
# NN is in safemode, but we have not populated replication queue yet
# NN is in safemode, and we have already started populating the replication queue. We will
restart the processing here.

So for case 3, in a non-HA setup, I think we may not need to restart the processing, since
there should not be any pending editlog for the NN to process in startActiveService? In an HA
setup, since we can always run processMisReplicateBlocks in startActiveService, do we actually
need to populate the replication queue while still in safemode? If we are able to make these
two changes, then for the current patch we do not need to worry about an already-running
replication queue initialization thread.
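
To make the three cases concrete, here is a minimal sketch of the decision as I read it.
This is not code from the patch: ReplQueueInitDecision, Action, and the boolean parameters
are hypothetical names used only for illustration, and the HA branch of proposed() is a
defensive assumption on my part, since under the second proposal that state should not
arise.

```java
/** Hypothetical sketch only; none of these names exist in the HDFS code base. */
final class ReplQueueInitDecision {
    enum Action { START, RESTART, SKIP }

    /** Behaviour described above: process in all three cases, restarting in case 3. */
    static Action current(boolean inSafeMode, boolean queuesPopulating) {
        if (!inSafeMode) {
            return Action.START;      // case 1: NN is not in safemode
        }
        if (!queuesPopulating) {
            return Action.START;      // case 2: in safemode, queues not populated yet
        }
        return Action.RESTART;        // case 3: in safemode, population already started
    }

    /** The two suggested changes, expressed as a decision for the same inputs. */
    static Action proposed(boolean haEnabled, boolean inSafeMode, boolean queuesPopulating) {
        if (inSafeMode && queuesPopulating) {
            // Non-HA: no pending edits are expected, so let the running scan finish.
            // HA: if queues are never populated in safemode this state should not occur;
            // restarting here is only a defensive fallback (an assumption of this sketch).
            return haEnabled ? Action.RESTART : Action.SKIP;
        }
        return Action.START;
    }
}
```

With these assumptions, the only input where the two methods differ is the non-HA case 3,
which is exactly the case where an already-running initialization thread would otherwise
have to be stopped and restarted.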

> Make replication queue initialization asynchronous
> --------------------------------------------------
>
>                 Key: HDFS-5496
>                 URL: https://issues.apache.org/jira/browse/HDFS-5496
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Kihwal Lee
>         Attachments: HDFS-5496.patch
>
>
> Today, initialization of replication queues blocks safe mode exit and certain HA state
> transitions. For a big name space, this can take hundreds of seconds with the FSNamesystem
> write lock held.  During this time, important requests (e.g. initial block reports,
> heartbeats, etc.) are blocked.
> The effect of delaying the initialization would be that replication does not start right
> away, but I think the benefit outweighs the cost. If we make it asynchronous, the work per
> iteration should be limited, so that the lock duration is capped.
> If full/incremental block reports and any other requests that modify block state properly
> perform replication checks while the blocks are scanned and the queues populated in the
> background, every block will be processed. (Some may be processed twice.) The replication
> monitor should run even before all blocks are processed.
> This will allow the namenode to exit safe mode and start serving immediately even with a
> big name space. It will also reduce the HA failover latency.
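
The "limited work per iteration" idea in the description can be sketched roughly as
follows. This is an illustration under stated assumptions, not the attached patch:
ChunkedReplQueueInitializer, blocksPerIteration, and needsReplication are hypothetical
names, a plain ReentrantReadWriteLock stands in for the FSNamesystem lock, and the
iterator is assumed to tolerate changes made between chunks (a snapshot is used here).

```java
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongPredicate;

/** Hypothetical sketch only; not the HDFS implementation. */
final class ChunkedReplQueueInitializer {
    // Stand-in for the namesystem write lock (assumption of this sketch).
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // A set makes it harmless if a block is queued twice.
    private final Set<Long> queued = new LinkedHashSet<>();
    private int lockAcquisitions = 0;

    /**
     * Scans the blocks in chunks of at most blocksPerIteration. The write lock is
     * taken once per chunk and released in between, so the time it is held is
     * bounded by the chunk size rather than by the size of the name space.
     */
    void scan(Iterator<Long> blocks, int blocksPerIteration, LongPredicate needsReplication) {
        if (blocksPerIteration <= 0) {
            throw new IllegalArgumentException("blocksPerIteration must be positive");
        }
        while (blocks.hasNext()) {
            lock.writeLock().lock();
            try {
                lockAcquisitions++;
                for (int i = 0; i < blocksPerIteration && blocks.hasNext(); i++) {
                    long blockId = blocks.next();
                    if (needsReplication.test(blockId)) {
                        queued.add(blockId);
                    }
                }
            } finally {
                lock.writeLock().unlock();
            }
            // Between chunks, other threads (block reports, heartbeats) can take the lock.
            Thread.yield();
        }
    }

    Set<Long> queued() {
        return queued;
    }

    int lockAcquisitions() {
        return lockAcquisitions;
    }
}
```

For example, scanning ten blocks with blocksPerIteration set to four takes the lock three
times (chunks of four, four, and two). A real implementation would also need an iterator
over the block map that stays valid while other threads modify it between chunks, which
this sketch does not address.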



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
