hadoop-hdfs-issues mailing list archives

From "Harsh J (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-2936) Provide a better way to specify a HDFS-wide minimum replication requirement
Date Wed, 16 May 2012 20:41:07 GMT

    [ https://issues.apache.org/jira/browse/HDFS-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13277099#comment-13277099 ]

Harsh J commented on HDFS-2936:
-------------------------------

Eli,

That's already available via the current dfs.namenode.replication.min. The problem I'm trying
to address is that close() will hang if the number of copies that property requires isn't live.

For instance, if I create a file with replication 3 but get only 2 DNs in the pipeline for my
block writes due to load or failure, then when I call close() it will hang indefinitely because
there are only two replicas. Hence I have broken the property into two: one that controls
creation only, as you describe, and another that controls actual writes, which may be relaxed
down to 1 to allow files to close with 1 or 2 replicas if the situation demands it (the NN
takes care of under-replication later anyway).
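
To make the split concrete, here is a minimal sketch using the stock
org.apache.hadoop.conf.Configuration API; the second key below is the name
proposed in this issue's description, not a shipped default:

    import org.apache.hadoop.conf.Configuration;

    public class MinReplicationSplit {
      public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Creation-time guarantee, left alone: a file may not be created
        // with fewer requested replicas than this.
        conf.setInt("dfs.replication.min", 2);

        // Proposed key (the name suggested in this issue, not a shipped
        // default): how many live replicas completeFile() must see before
        // close() may succeed. Left at 1 so files can close even when the
        // write pipeline shrank; the NN re-replicates them afterwards.
        conf.setInt("dfs.replication.min.for.block.completion", 1);

        System.out.println("min completion replication = "
            + conf.getInt("dfs.replication.min.for.block.completion", 1));
      }
    }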

Let me know if this clears up any confusion. Also check DFSClient.close() to see the hang loop
I'm talking about, which occurs when NN#completeFile returns false.
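
Roughly, that loop behaves like the self-contained sketch below (illustrative
names, not the verbatim Hadoop source; NameNodeStub stands in for the real
ClientProtocol):

    public class CompleteFileLoop {

      interface NameNodeStub {
        // The real NN's complete() returns false until it has recorded
        // the minimum replica count for the file's last block.
        boolean complete(String src, String clientName);
      }

      static void completeFile(NameNodeStub nn, String src, String client)
          throws InterruptedException {
        boolean fileComplete = false;
        while (!fileComplete) {
          fileComplete = nn.complete(src, client);
          if (!fileComplete) {
            // Back off and poll again; if the cluster can never reach the
            // minimum replica count, this loop spins forever -- the hang.
            Thread.sleep(400);
          }
        }
      }

      public static void main(String[] args) throws InterruptedException {
        // Stub that succeeds on the third poll; one that always returned
        // false would make completeFile() hang indefinitely, as described.
        final int[] polls = {0};
        completeFile((src, client) -> ++polls[0] >= 3, "/demo", "demo-client");
        System.out.println("completed after " + polls[0] + " polls");
      }
    }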
                
> Provide a better way to specify a HDFS-wide minimum replication requirement
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-2936
>                 URL: https://issues.apache.org/jira/browse/HDFS-2936
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.23.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>         Attachments: HDFS-2936.patch
>
>
> Currently, an admin who would like to enforce a minimum replication factor for all files
> on his HDFS cluster has no way to do so. He may arguably set dfs.replication.min, but that
> is a very hard guarantee: if the pipeline can't afford that number due to some
> reason/failure, close() does not succeed on the file being written, which leads to several
> issues.
> After discussing with Todd, we feel it would make sense to introduce a second config
> (which is ${dfs.replication.min} by default) which would act as the minimum specified
> replication for files. This is different from dfs.replication.min, which also ensures that
> many replicas are recorded before completeFile() returns... perhaps something like
> ${dfs.replication.min.user}. Alternatively, we can leave dfs.replication.min alone for
> hard guarantees and add ${dfs.replication.min.for.block.completion}, which could be left
> at 1 even if dfs.replication.min is >1, letting files complete normally, possibly at a low
> replication factor (which can then be monitored and accounted for later).
> I'm preferring the second option myself. Will post a patch with tests soon.


        
