hadoop-hdfs-issues mailing list archives

From "Daryn Sharp (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11313) Segmented Block Reports
Date Fri, 13 Jan 2017 15:54:26 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15821924#comment-15821924 ]

Daryn Sharp commented on HDFS-11313:

Interesting idea.  We need to think through race conditions, because with the current naive design
(a snapshot in time) it is easy to reconcile state in the NN.  Not saying I like it, just that
we need to think hard about new races, esp. with IBRs.

It must include provisions for negative block ids, so it's not just the last segment that is
open ended.  This is not a contrived use case: we have many 2.x clusters with legacy negative
block ids, esp. archival clusters.
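
To make the open-ended point concrete, here is a minimal sketch, not anything from the patch or from HDFS itself.  The class BlockIdSegments, its Segment type, and the choice of NN-supplied split points are all hypothetical.  It only shows that if segments are meant to cover every possible block id, the first segment has to reach down to Long.MIN_VALUE just as the last one reaches up to Long.MAX_VALUE.  Bounds are inclusive to avoid overflow at either end.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names exist in HDFS.
final class BlockIdSegments {

    /** Inclusive range of block ids: lo <= id <= hi. */
    static final class Segment {
        final long lo;
        final long hi;

        Segment(long lo, long hi) {
            this.lo = lo;
            this.hi = hi;
        }

        boolean contains(long blockId) {
            return lo <= blockId && blockId <= hi;
        }

        @Override
        public String toString() {
            return "[" + lo + ", " + hi + "]";
        }
    }

    /**
     * Builds contiguous segments from strictly increasing split points.
     * The first segment starts at Long.MIN_VALUE and the last ends at
     * Long.MAX_VALUE, so negative ids are covered like any others.
     */
    static List<Segment> fromSplitPoints(long... splits) {
        List<Segment> out = new ArrayList<>();
        long lo = Long.MIN_VALUE;
        for (long s : splits) {
            if (s <= lo) {
                throw new IllegalArgumentException(
                    "split points must be strictly increasing and > Long.MIN_VALUE");
            }
            out.add(new Segment(lo, s - 1)); // s > Long.MIN_VALUE, so no overflow
            lo = s;
        }
        out.add(new Segment(lo, Long.MAX_VALUE));
        return out;
    }

    /** Returns the index of the segment that holds blockId (linear scan). */
    static int indexOf(List<Segment> segments, long blockId) {
        for (int i = 0; i < segments.size(); i++) {
            if (segments.get(i).contains(blockId)) {
                return i;
            }
        }
        throw new IllegalStateException("segments do not cover " + blockId);
    }

    public static void main(String[] args) {
        // Arbitrary split points chosen for the example.
        List<Segment> segments = fromSplitPoints(-1_000_000L, 0L, 1L << 30);
        for (Segment s : segments) {
            System.out.println(s);
        }
        System.out.println("segment for id -42: " + indexOf(segments, -42L));
    }
}
```

The sketch says nothing about how the split points would be chosen, which is the open question for sparse ranges.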

What would be the basic design?  Is it predicated on the NN sorting block ids?  If yes, I
have strong concerns I'll outline.  How are the segment ranges computed?  Fixed size?  How
will very sparse block ranges be handled, esp. in the case of negative block ids?

What I've long wanted to do is invert the block report processing.  The NN would send BRs to the
DN, and the DN would reconcile inconsistencies with IBRs.  Haven't thought through it beyond the
concept, but I digress.

> Segmented Block Reports
> -----------------------
>                 Key: HDFS-11313
>                 URL: https://issues.apache.org/jira/browse/HDFS-11313
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, namenode
>    Affects Versions: 2.6.2
>            Reporter: Konstantin Shvachko
> Block reports from a single DataNode can currently be split into multiple RPCs, each reporting
> a single DataNode storage (disk). The reports are still large since disks are getting bigger.
> Splitting blockReport RPCs into multiple smaller calls would improve NameNode performance
> and overall HDFS stability.
> This was discussed in multiple jiras. Here the approach is to let the NameNode divide the blockID
> space into segments and then ask DataNodes to report replicas in a particular range of IDs.
