hadoop-hdfs-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2936) File close()-ing hangs indefinitely if the number of live blocks does not match the minimum replication
Date Thu, 17 May 2012 18:30:10 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278067#comment-13278067 ]

Harsh J commented on HDFS-2936:


bq. Thanks for updating the description. Are you suggesting to use dfs.namenode.replication.min
for client-side check and use dfs.namenode.replication.min.for.write for server-side check?

Sort of. The former (or whatever replaces it) should only check a file's replication factor,
the short value that is applied or changed during create/setReplicationFactor alone, and not the
live block count. This is still a server-side check; client-side checks would be of no use to
an admin.

bq. BTW, "File close()-ing hangs indefinitely if the number of live blocks does not match
the minimum replication" is the original design of dfs.namenode.replication.min. I think we
should not change it.

True, that was the intention. A non-behavior-changing patch can also be made (wherein
the default of the for.write property will be whatever the original min property is). But let's at
least provide a way for admins to enforce minimum replication _factors_ on files, without
having to worry about pipelines and whatnot, if an admin so wishes.

Setting {{dfs.replication}} to final does not work, because there are create() API calls and
setrep() calls that bypass/disregard that config. Essentially, that's what led us down this
path: to use the minimum, but just at the meta level, not at the live-block level (as it is today).
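
To make the meta-level versus live-block-level distinction concrete, here is a minimal standalone sketch of the two checks. It does not use any Hadoop classes; the class name, field names and method names are hypothetical, and only the two property names in the comments come from this discussion.

```java
// Hypothetical model of the proposed split; not NameNode code.
final class ReplicationMinSketch {
    // Stand-in for dfs.namenode.replication.min (meta-level check).
    private final short minReplication;
    // Stand-in for dfs.namenode.replication.min.for.write (live-block check).
    private final short minForWrite;

    public ReplicationMinSketch(short minReplication, short minForWrite) {
        this.minReplication = minReplication;
        this.minForWrite = minForWrite;
    }

    // Meta-level: applied only to the replication factor requested
    // at create time or through setrep.
    public boolean isReplicationFactorAllowed(short requested) {
        return requested >= minReplication;
    }

    // Live-block level: applied when the last block is committed and
    // the file is completed (closed).
    public boolean canCompleteFile(int liveReplicasOfLastBlock) {
        return liveReplicasOfLastBlock >= minForWrite;
    }

    public static void main(String[] args) {
        // Admin enforces a factor of 3 on files, but lets a write
        // complete with a single live replica.
        ReplicationMinSketch policy =
            new ReplicationMinSketch((short) 3, (short) 1);
        System.out.println(policy.isReplicationFactorAllowed((short) 2)); // false
        System.out.println(policy.isReplicationFactorAllowed((short) 3)); // true
        System.out.println(policy.canCompleteFile(1));                    // true
    }
}
```

Passing the same value for both arguments models the non-behavior-changing default, where a pipeline that shrinks below the minimum still blocks the close.
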
> File close()-ing hangs indefinitely if the number of live blocks does not match the minimum replication
> -------------------------------------------------------------------------------------------------------
>                 Key: HDFS-2936
>                 URL: https://issues.apache.org/jira/browse/HDFS-2936
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>         Attachments: HDFS-2936.patch
> If an admin wishes to enforce replication today for all the users of their cluster, they
may set {{dfs.namenode.replication.min}}. This property prevents users from creating files
with a replication factor lower than the configured minimum.
> However, the minimum replication set by the above property is also checked at several
other points, especially during completeFile (close) operations. If a write's pipeline ends up
with fewer than the minimum number of nodes, the completeFile operation
does not successfully close the file, and the client hangs waiting for the NN to replicate
the last bad block in the background. This form of hard guarantee can, for example, bring
down HBase clusters during high xceiver load on DNs, or disk fill-ups on many of them, etc.
> I propose we should split the property in two parts:
> * dfs.namenode.replication.min
> ** Keeps the same name, but only checks the replication factor value at file creation time and
during adjustments made via setrep/etc.
> * dfs.namenode.replication.min.for.write
> ** New property that disconnects the rest of the checks from the above property, such
as the checks done during block commit, file complete/close, safemode checks for block availability,
> Alternatively, we may also choose to remove the client-side hang of completeFile/close
calls with a set number of retries. This would further require discussion about how a file-closure
handle ought to be handled.
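
The alternative of a set number of retries could look roughly like the sketch below. This is not the DFSClient code; the class, method and parameter names are hypothetical, and the single completeFile attempt is modeled as a BooleanSupplier that returns true once the NN reports the file as complete.

```java
import java.util.function.BooleanSupplier;

// Hypothetical model of a bounded close; not DFSClient code.
final class BoundedCompleteSketch {
    // Tries completeAttempt up to maxAttempts times, sleeping
    // sleepMillis between attempts. Returns true as soon as one
    // attempt succeeds, false when the attempts run out or the
    // thread is interrupted while waiting.
    public static boolean completeWithRetries(BooleanSupplier completeAttempt,
                                              int maxAttempts,
                                              long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (completeAttempt.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Preserve the interrupt for the caller and give up.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

What the caller does with a false return (fail the close, or hand the file over to lease recovery, for instance) is exactly the open question raised above; the sketch only bounds the wait.
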


