hadoop-hdfs-issues mailing list archives

From "Milind Bhandarkar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1108) Log newly allocated blocks
Date Wed, 24 Aug 2011 06:02:29 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090017#comment-13090017 ]

Milind Bhandarkar commented on HDFS-1108:
-----------------------------------------

Sorry to come to this party late.

Stepping back a little: in order to recover from failures (whether in a single-NN fail-manual
restart mode, a fail-automatic restart mode, or a fail-backup restart mode), creation of a
block has to be a two-phase transaction. First, show the intent to create the block (and log
it); then write the data (and note that it is being written); finally, close the block (commit
it). So no one is contesting that both the intent to create a block and the commit of the
block need to be logged. (The fact that data is being written need not be logged separately,
because it can be merged with the intent to create the block without any availability
degradation.) The only issue that remains to be discussed is whether the code assumes that the
logs are written specifically to shared storage, or whether it generalizes and writes to a
stream that is guaranteed to be persisted by the stream broker.
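
To make the distinction concrete, here is a minimal sketch (in Python, for brevity; it is not
HDFS code and not what the patch does). All names in it (LogRecord, EditStream, InMemoryStream,
BlockLog, recover) are hypothetical. It assumes only what is described above: an intent record
and a commit record per block, written through an abstract stream whose append() is assumed to
return once the record is persisted, so the logging code never needs to know whether shared
storage or a stream broker sits behind it.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class LogRecord:
    op: str        # "ALLOCATE" (intent to create) or "COMMIT" (block closed)
    path: str
    block_id: int


class EditStream(ABC):
    """Hypothetical durable stream. The assumed contract is that append()
    returns only after the record has been persisted by whatever backs it."""

    @abstractmethod
    def append(self, record: LogRecord) -> None: ...

    @abstractmethod
    def replay(self) -> List[LogRecord]: ...


class InMemoryStream(EditStream):
    """Stand-in backend for the sketch only; it is NOT durable. A real backend
    could be shared storage or a stream broker behind the same interface."""

    def __init__(self) -> None:
        self._records: List[LogRecord] = []

    def append(self, record: LogRecord) -> None:
        self._records.append(record)

    def replay(self) -> List[LogRecord]:
        return list(self._records)


class BlockLog:
    """Logs the two phases of block creation against any EditStream."""

    def __init__(self, stream: EditStream) -> None:
        self._stream = stream

    def allocate(self, path: str, block_id: int) -> None:
        # Phase 1: record the intent before any data is written.
        self._stream.append(LogRecord("ALLOCATE", path, block_id))

    def commit(self, path: str, block_id: int) -> None:
        # Phase 2: record that the block has been closed.
        self._stream.append(LogRecord("COMMIT", path, block_id))


def recover(stream: EditStream) -> Tuple[Set[int], Set[int]]:
    """Replay the stream and return (committed, still_under_construction)."""
    allocated: Set[int] = set()
    committed: Set[int] = set()
    for rec in stream.replay():
        if rec.op == "ALLOCATE":
            allocated.add(rec.block_id)
        elif rec.op == "COMMIT":
            committed.add(rec.block_id)
    return committed, allocated - committed
```

In this sketch, a restarted or standby node would call recover() on the same stream: blocks
with an intent record but no commit record are the ones whose state has to be reconciled after
a failure. Swapping the backend means supplying another EditStream implementation, with no
change to BlockLog.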

I think the latter is where everyone in the community would like to be. All the patch
contributor needs to do is make sure that this eventual goal is not blocked by the current changes.

Todd, do you think one could enable persistent, guaranteed streams, without the shared-storage
assumption, with this patch?

> Log newly allocated blocks
> --------------------------
>
>                 Key: HDFS-1108
>                 URL: https://issues.apache.org/jira/browse/HDFS-1108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>         Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt, hdfs-1108.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not persisted
> in the NN transaction log when the block is allocated. Instead, a hflush() or a close() on
> the file persists the blocks into the transaction log. It would be nice if we can immediately
> persist newly allocated blocks (as soon as they are allocated) for specific files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
