hadoop-hdfs-issues mailing list archives

From "Dmytro Molkov (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HDFS-1108) ability to create a file whose newly allocated blocks are automatically persisted immediately
Date Mon, 28 Jun 2010 23:22:52 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12883347#action_12883347 ]

Dmytro Molkov commented on HDFS-1108:

1. Yes, the block information will essentially be persisted twice: once on each block allocation
and again on file close. Do you think that can be a problem for us? Since it is a configurable change
and will only happen on specifically configured clusters, I do not feel like this is an issue.
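To make the double-persistence point concrete, here is a minimal sketch of the behavior being discussed. All names here (EditLogSketch, persistBlocksOnAllocation, logBlocks) are illustrative stand-ins, not the actual HDFS-1108 patch API: when the configurable flag is on, every allocation is logged immediately, and close logs the final block list again, so the blocks of a file end up in the edit log more than once.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the NN transaction log; records one entry per persist call.
class EditLogSketch {
    final List<String> entries = new ArrayList<>();

    void logBlocks(String file, List<String> blocks) {
        entries.add("OP_ADD " + file + " blocks=" + blocks);
    }
}

// Stand-in for a file being written: path plus its allocated blocks.
class FileUnderConstruction {
    final String path;
    final List<String> blocks = new ArrayList<>();
    FileUnderConstruction(String path) { this.path = path; }
}

public class NameNodeSketch {
    final boolean persistBlocksOnAllocation; // hypothetical per-cluster config
    final EditLogSketch editLog = new EditLogSketch();

    NameNodeSketch(boolean persistBlocksOnAllocation) {
        this.persistBlocksOnAllocation = persistBlocksOnAllocation;
    }

    void allocateBlock(FileUnderConstruction f, String blockId) {
        f.blocks.add(blockId);
        // With the flag on, each allocation is persisted immediately...
        if (persistBlocksOnAllocation) {
            editLog.logBlocks(f.path, f.blocks);
        }
    }

    void close(FileUnderConstruction f) {
        // ...and close persists the full block list again, so block
        // information lands in the edit log a second time.
        editLog.logBlocks(f.path, f.blocks);
    }

    public static void main(String[] args) {
        NameNodeSketch nn = new NameNodeSketch(true);
        FileUnderConstruction f = new FileUnderConstruction("/user/a/log");
        nn.allocateBlock(f, "blk_1");
        nn.allocateBlock(f, "blk_2");
        nn.close(f);
        // Two allocation entries plus one close entry.
        System.out.println(nn.editLog.entries.size());
    }
}
```

The extra entries are the cost of the feature; whether that overhead matters is exactly the question raised in point 1.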
2. This part is tricky. I guess what can happen is: a new block is allocated, then the client
immediately dies without writing data, and the namenode crashes and needs a restart. When the
namenode is restarted it will have this last block as under construction, and when the NN tries
to release the lease on this file it will try to recover the block and will never succeed,
because the block is not present on any datanode.
However, it seems we have the same case today: even when the namenode is not restarted, the
existence of this block in memory and its absence on the datanodes leads to the same problem.
Another similar case is when the client calls hflush and then the namenode, the client, and
all datanodes that are receiving the new block crash.

Please correct me if I am wrong on this one.
All in all, it seems that a namenode crash may itself cause the client to die, so the
probability of this scenario might be higher than in my first example of what can happen today?
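The stuck-recovery scenario above can be sketched in a few lines. This is purely illustrative (tryRecoverLastBlock is a hypothetical helper, not HDFS code): block recovery needs at least one datanode replica to sync against, so a last block that was persisted but never written anywhere can never be recovered.

```java
import java.util.List;

public class LeaseRecoverySketch {

    // Returns true if recovery of the under-construction last block can
    // complete. Recovery synchronizes length and generation stamp across the
    // replicas; with zero replicas there is nothing to synchronize, so the
    // lease release is stuck retrying forever.
    static boolean tryRecoverLastBlock(List<String> replicaLocations) {
        return !replicaLocations.isEmpty();
    }

    public static void main(String[] args) {
        // Client died right after allocation, before writing any data:
        System.out.println(tryRecoverLastBlock(List.of()));
        // Normal case, at least one datanode reported a replica:
        System.out.println(tryRecoverLastBlock(List.of("dn1:50010")));
    }
}
```

The point of the comment stands either way: the same in-memory-only block with no replicas already produces this situation today, without the patch.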

3. I am not really sure what you mean by the primary flagging this to the standby, but in our
case the only channel of communication between the primary and the standby is in fact the edits
log, so this seemed like a reasonable way to go.
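A quick sketch of why the edits log suffices as that channel (entirely illustrative, not the actual standby implementation): whatever the primary logs, including immediate block allocations with this change, the standby replays in order, so its in-memory namespace tracks the primary's without any separate flagging RPC.

```java
import java.util.ArrayList;
import java.util.List;

public class StandbySketch {
    // Stand-in for the standby's in-memory namespace state.
    final List<String> namespaceState = new ArrayList<>();
    int applied = 0; // how many edit-log entries we have consumed so far

    // Tail the shared edit log: apply only the entries not yet consumed.
    void tailEdits(List<String> primaryEditLog) {
        for (int i = applied; i < primaryEditLog.size(); i++) {
            namespaceState.add(primaryEditLog.get(i));
        }
        applied = primaryEditLog.size();
    }

    public static void main(String[] args) {
        List<String> primaryEditLog = new ArrayList<>();
        StandbySketch standby = new StandbySketch();

        // Primary logs a block allocation immediately; standby tails it.
        primaryEditLog.add("OP_ADD /a blocks=[blk_1]");
        standby.tailEdits(primaryEditLog);

        // A later allocation is picked up on the next tail, keeping the
        // standby's view equal to the primary's log.
        primaryEditLog.add("OP_ADD /a blocks=[blk_1, blk_2]");
        standby.tailEdits(primaryEditLog);

        System.out.println(standby.namespaceState.equals(primaryEditLog));
    }
}
```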

> ability to create a file whose newly allocated blocks are automatically persisted immediately
> ---------------------------------------------------------------------------------------------
>                 Key: HDFS-1108
>                 URL: https://issues.apache.org/jira/browse/HDFS-1108
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-1108.patch
> The current HDFS design says that newly allocated blocks for a file are not persisted
> in the NN transaction log when the block is allocated. Instead, a hflush() or a close() on
> the file persists the blocks into the transaction log. It would be nice if we can immediately
> persist newly allocated blocks (as soon as they are allocated) for specific files.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
