hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HDFS-9543) DiskBalancer : Add Data mover
Date Thu, 28 Apr 2016 18:51:13 GMT

     [ https://issues.apache.org/jira/browse/HDFS-9543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Anu Engineer updated HDFS-9543:
    Attachment: HDFS-9543-HDFS-1312.003.patch

bq. In copyBlocks, the openPoolIters call should be inside the while loop, just before you
open the try block.

Just wanted to make sure that I understand this point clearly before I do any fixes. copyBlocks
takes a source volume and a destination volume. openPoolIters opens all block pools on the source
volume, and then we try to fetch a block from those pools in a round-robin fashion.

I do agree that openPoolIters should be protected by a try -- and then {{finally}} should
call into closePoolIters. In fact, I wrote it that way, but I thought the nesting of code
was getting out of hand. If you are saying that I need to switch to having a try before openPoolIters,
I will fix that in the next update.

Moving it down into the while loop might be inefficient, in the sense that we would have to
close the iterators and reopen them between each block fetch.
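
To make the trade-off concrete, here is a minimal, self-contained sketch of the shape being discussed: the iterators are opened once inside a try, blocks are fetched round-robin inside the while loop, and {{finally}} closes whatever was opened. This is not the code from the patch. PoolIter, the string block names, and the signatures of openPoolIters, closePoolIters, and copyBlocks below are hypothetical stand-ins chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CopyLoopSketch {
    /** Hypothetical stand-in for a per-block-pool iterator that must be closed. */
    static final class PoolIter {
        private final Iterator<String> blocks;
        boolean closed = false;
        PoolIter(List<String> blocks) { this.blocks = blocks.iterator(); }
        boolean hasNext() { return blocks.hasNext(); }
        String next() { return blocks.next(); }
        void close() { closed = true; }
    }

    /** Opens one iterator per pool, appending to 'out' so a partial open can still be cleaned up. */
    static void openPoolIters(List<List<String>> pools, List<PoolIter> out) {
        for (List<String> pool : pools) {
            out.add(new PoolIter(pool));
        }
    }

    /** Closes every iterator that was opened. */
    static void closePoolIters(List<PoolIter> iters) {
        for (PoolIter it : iters) {
            it.close();
        }
    }

    /** Fetches up to maxBlocks blocks, taking one block from each pool in turn. */
    public static List<String> copyBlocks(List<List<String>> pools, int maxBlocks) {
        List<String> copied = new ArrayList<>();
        List<PoolIter> iters = new ArrayList<>();
        try {
            // Opened once, before the loop, so we do not reopen per block.
            openPoolIters(pools, iters);
            int idx = 0;
            int emptyInARow = 0; // consecutive pools that had nothing left
            while (copied.size() < maxBlocks
                    && !iters.isEmpty()
                    && emptyInARow < iters.size()) {
                PoolIter it = iters.get(idx);
                idx = (idx + 1) % iters.size();
                if (it.hasNext()) {
                    copied.add(it.next()); // a real mover would copy the block here
                    emptyInARow = 0;
                } else {
                    emptyInARow++;
                }
            }
        } finally {
            // Runs even if opening or copying throws.
            closePoolIters(iters);
        }
        return copied;
    }

    public static void main(String[] args) {
        List<List<String>> pools = Arrays.asList(
                Arrays.asList("a1", "a2", "a3"),
                Arrays.asList("b1"));
        System.out.println(copyBlocks(pools, 10)); // prints [a1, b1, a2, a3]
    }
}
```

Opening inside the try but outside the loop adds only one level of nesting, and the iterators stay open across block fetches.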

bq. I didn't understand the new computeDelay calculation. Your original calculation (v1 patch)
looked correct, all that remained was to subtract timeUsed and check for negative result.

That is exactly what I was attempting to do. Here is what the old and new calculations
do. In the old calculation we look at a copy, let us say a 50 MB block was moved, and say
“50 MB moved, we are supposed to do 10 MB/s so let us sleep for 5 seconds”. As you
pointed out, that was clearly wrong since it did not account for the time spent doing the copy. The
new calculation says “we copied 50 MB in 3 seconds, we are supposed to copy 50 MB in
5 seconds, so let us compute 5 - 3 and sleep for 2 seconds”.
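
The arithmetic above can be sketched as follows. This is only an illustration of the calculation as described in this comment, not the computeDelay from the patch: the method name computeDelayMs, its parameters, and the choice of 1 MB = 1024 * 1024 bytes are assumptions.

```java
class ThrottleSketch {
    /**
     * Returns how long to sleep, in milliseconds, after copying bytesCopied
     * bytes in timeUsedMs milliseconds, so that the average rate stays at or
     * below bandwidthMBps. Never returns a negative value.
     */
    public static long computeDelayMs(long bytesCopied, long timeUsedMs, long bandwidthMBps) {
        if (bandwidthMBps <= 0) {
            return 0L; // assumption: a non-positive limit means "no throttling"
        }
        long bytesPerSec = bandwidthMBps * 1024L * 1024L;
        // Time the copy should have taken at the configured bandwidth.
        long expectedMs = bytesCopied * 1000L / bytesPerSec;
        // Subtract the time already spent copying; clamp at zero if we were slow.
        return Math.max(expectedMs - timeUsedMs, 0L);
    }

    public static void main(String[] args) {
        long fiftyMB = 50L * 1024L * 1024L;
        System.out.println(computeDelayMs(fiftyMB, 3000L, 10L)); // prints 2000
        System.out.println(computeDelayMs(fiftyMB, 7000L, 10L)); // prints 0
    }
}
```

The second call shows the negative-result check: a copy that already took longer than its budget needs no extra sleep.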

bq. Thanks for the pointer to DirectoryScanner#scan. The synchronization is needed only when
making changes to the in-memory block map.  
Fixed, thanks for explaining this to me. I have removed the synchronization.

bq. We can also log the maximum error count value here for quick reference. Also a typo cound
-> count.
Fixed both issues.

bq. I think the -blockpools flag just restricts the balancing to a subset of blockpools.

> DiskBalancer : Add Data mover 
> ------------------------------
>                 Key: HDFS-9543
>                 URL: https://issues.apache.org/jira/browse/HDFS-9543
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>         Attachments: HDFS-9543-HDFS-1312.001.patch, HDFS-9543-HDFS-1312.002.patch, HDFS-9543-HDFS-1312.003.patch
> This patch adds the actual mover logic to the datanode.

This message was sent by Atlassian JIRA
