hadoop-hdfs-issues mailing list archives

From "Chen Zhang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-14657) Refine NameSystem lock usage during processing FBR
Date Wed, 31 Jul 2019 08:32:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-14657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896917#comment-16896917 ]

Chen Zhang commented on HDFS-14657:

Thanks [~shv], but sorry, I can't see any problem with this change on version 2.6.
{quote}I believe when you release the lock while iterating over the storage blocks, the iterator
may find itself in an isolated chain of the list after reacquiring the lock{quote}
That won't happen, because processReport doesn't iterate over the storage blocks in 2.6. The whole
FBR procedure (for each storage) can be simplified like this:

 # Insert a delimiter at the head of the block list of this storage (the triplets, which are actually a doubly linked list, so I'll refer to it as the block list for simplicity).
 # Start a loop and iterate through the block report:
 ## Get a block from the report
 ## Use the block to get the stored BlockInfo object from the BlockMap
 ## Check the status of the block, and add the block to the corresponding set (toAdd, toUc, toInvalidate,
 ## Move the block to the head of the block list (which places the block before the delimiter)
 # Start a loop to iterate through the block list, find the blocks after the delimiter, and add them to the toRemove set.
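
The steps above can be sketched with a toy model. This is only an illustration under stated assumptions, not the HDFS implementation: the class name `FbrDelimiterSketch`, the `DELIMITER` sentinel, and the use of `LinkedList<Long>` in place of the triplets structure are all hypothetical, and the BlockMap lookup and status checks (steps 2.2 and 2.3) are left out.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Toy model of the delimiter-based report processing (not HDFS code). */
final class FbrDelimiterSketch {
    // Hypothetical sentinel; block IDs in this toy model are assumed to be non-negative.
    static final long DELIMITER = -1L;

    /**
     * Processes one report against one storage's block list and returns the
     * blocks that were not reported (the toRemove set). The returned blocks
     * are left in the list; removing them is a separate step in this sketch.
     */
    static List<Long> processReport(LinkedList<Long> blockList, List<Long> report) {
        // Step 1: insert the delimiter at the head of the block list.
        blockList.addFirst(DELIMITER);

        // Step 2: move every reported block to the head, in front of the delimiter.
        for (Long block : report) {
            // Steps 2.2 and 2.3 (lookup and status checks) are omitted here.
            // Simplification: a reported block that is not in the list yet is simply added.
            blockList.remove(block);   // remove(Object): unlink the block if present
            blockList.addFirst(block); // step 2.4: the block now sits before the delimiter
        }

        // Step 3: everything after the delimiter was not reported.
        int pos = blockList.indexOf(DELIMITER);
        List<Long> toRemove = new ArrayList<>(blockList.subList(pos + 1, blockList.size()));
        blockList.remove(pos); // remove(int): drop the delimiter itself
        return toRemove;
    }
}
```

For a block list holding 1, 2, 3, 4 and a report containing 2 and 4, the sketch returns 1 and 3 as the blocks to remove.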

My proposal in this Jira is to release and re-acquire the NN lock between steps 2.3 and 2.4, that is,
after checking the block's status and before moving it to the head of the block list. This
won't affect the correctness of the block report procedure, for the following reasons:
 # All reported blocks will end up before the delimiter.
 # If any other thread acquires the NN lock before 2.4 and adds some new blocks, they will be added at the head of the list.
 # If any other thread acquires the NN lock before 2.4 and removes some blocks, it won't affect the loop in step 2. (Please note that the delimiter can't be removed by other threads.)
 # All the blocks after the delimiter should be removed.
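
The reasoning above can be exercised in a small single-threaded simulation. This is a rough sketch under assumptions, not the patch itself: `InterleavedFbrSketch`, `DELIMITER`, and the `whileLockReleased` hook are hypothetical names, and the hook merely stands in for whatever another thread might do while the NN lock is released before step 2.4.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

/** Toy simulation of other threads mutating the block list mid-report (not HDFS code). */
final class InterleavedFbrSketch {
    // Hypothetical sentinel; other threads are assumed never to remove it.
    static final long DELIMITER = -1L;

    static List<Long> processReport(LinkedList<Long> blockList, List<Long> report,
                                    Runnable whileLockReleased) {
        blockList.addFirst(DELIMITER);
        for (Long block : report) {
            // Stand-in for "release and re-acquire the lock": another thread
            // may add blocks at the head or remove blocks at this point.
            whileLockReleased.run();
            blockList.remove(block);   // remove(Object)
            blockList.addFirst(block); // reported block ends up before the delimiter
        }
        // Only blocks still located after the delimiter are collected.
        int pos = blockList.indexOf(DELIMITER);
        List<Long> toRemove = new ArrayList<>(blockList.subList(pos + 1, blockList.size()));
        blockList.remove(pos); // remove(int): drop the delimiter
        return toRemove;
    }
}
```

In this model, a block added at the head by the hook stays before the delimiter and is therefore not collected, and a block removed by the hook simply no longer appears after the delimiter. The sketch does not cover the case where the hook removes a block that is itself part of the report.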

For the reasons described above, the following problem you mentioned also won't happen:
{quote}you may remove replicas that were not supposed to be removed{quote}

I agree with you that things are tricky here, but this change is quite simple and I
think we can still make its impact clear.

> Refine NameSystem lock usage during processing FBR
> --------------------------------------------------
>                 Key: HDFS-14657
>                 URL: https://issues.apache.org/jira/browse/HDFS-14657
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>            Priority: Major
>         Attachments: HDFS-14657-001.patch, HDFS-14657.002.patch
> Disks with 12TB capacity are very common today, which means the FBR size is much larger
> than before. The NameNode holds the NameSystem lock while processing the block report for each
> storage, which might take quite a long time.
> In our production environment, processing a large FBR usually causes a longer RPC queue
> time, which impacts client latency, so we did some simple work on refining the lock usage,
> which improved the p99 latency significantly.
> In our solution, BlockManager releases the NameSystem write lock and requests it again
> after every 5000 blocks (by default) while processing an FBR. With the fair lock, all the RPC
> requests can be processed before BlockManager re-acquires the write lock.
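
The batching idea in the description can be illustrated with a minimal sketch. It assumes a plain `java.util.concurrent.locks.ReentrantReadWriteLock` in fair mode rather than the actual NameSystem lock, and `BatchedLockSketch`, `batchSize`, and `processBlocks` are hypothetical names that do not come from the patch.

```java
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongConsumer;

/** Sketch of yielding a fair write lock after every batch of blocks (not HDFS code). */
final class BatchedLockSketch {
    // Fair mode: the longest-waiting thread is favored when the lock is released.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock(true);
    // Hypothetical parameter; the description mentions 5000 blocks by default.
    private final int batchSize;

    BatchedLockSketch(int batchSize) {
        this.batchSize = batchSize;
    }

    /** Handles every block and returns how many times the write lock was yielded. */
    int processBlocks(List<Long> report, LongConsumer handler) {
        int yields = 0;
        lock.writeLock().lock();
        try {
            int inBatch = 0;
            for (long block : report) {
                handler.accept(block);
                if (++inBatch == batchSize) {
                    // Release and immediately request the lock again so that
                    // threads already waiting get a chance to run first.
                    lock.writeLock().unlock();
                    lock.writeLock().lock();
                    inBatch = 0;
                    yields++;
                }
            }
        } finally {
            lock.writeLock().unlock();
        }
        return yields;
    }
}
```

With a batch size of 2 and a report of five blocks, the lock is yielded twice. The fairness setting is what makes the yield useful here, since a non-fair lock would allow the releasing thread to re-acquire the lock ahead of waiting threads.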

This message was sent by Atlassian JIRA

