hadoop-hdfs-issues mailing list archives

From "Eli Collins (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-1108) Log newly allocated blocks
Date Fri, 12 Aug 2011 19:54:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-1108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13084334#comment-13084334 ]

Eli Collins commented on HDFS-1108:
-----------------------------------

We have to log sync after every allocation, right? Or do you mean only when HA is enabled?
We could consider batching syncs, at the cost of response latency.
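
To make the batching trade-off concrete, here is a toy model (a sketch only; ToyEditLog and allocate are hypothetical names, not HDFS classes, and sync() merely stands in for a real fsync). Syncing every few allocations cuts the number of syncs, but allocations logged since the last sync are not yet durable, so either they are at risk or the caller has to wait for the next sync before the response goes out.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: none of these names are real HDFS classes. */
public class BatchedSyncSketch {

    /** Hypothetical in-memory edit log that counts how often it is synced. */
    static final class ToyEditLog {
        private final List<String> pending = new ArrayList<>();
        private final List<String> durable = new ArrayList<>();
        private int syncCount = 0;

        void append(String op) {
            pending.add(op);
        }

        /** Stands in for an fsync: everything pending becomes durable. */
        void sync() {
            if (pending.isEmpty()) {
                return;
            }
            durable.addAll(pending);
            pending.clear();
            syncCount++;
        }

        int syncCount() { return syncCount; }
        int durableCount() { return durable.size(); }
        int pendingCount() { return pending.size(); }
    }

    /** Logs n allocations, syncing once every batchSize appends. */
    static ToyEditLog allocate(int n, int batchSize) {
        ToyEditLog log = new ToyEditLog();
        for (int i = 0; i < n; i++) {
            log.append("ALLOCATE_BLOCK blk_" + i);
            if ((i + 1) % batchSize == 0) {
                log.sync();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        ToyEditLog perOp = allocate(10, 1);
        ToyEditLog batched = allocate(10, 4);
        System.out.println("sync per allocation: syncs=" + perOp.syncCount()
                + " unsynced=" + perOp.pendingCount());
        System.out.println("batches of 4: syncs=" + batched.syncCount()
                + " unsynced=" + batched.pendingCount());
    }
}
```

With 10 allocations, the per-allocation policy syncs 10 times and leaves nothing pending; batches of 4 sync only twice but leave 2 allocations waiting for the next sync.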

Piggybacking the initial allocation on the create makes a lot of sense (it also saves a round trip).
This might actually simplify DataStreamer, as it would always start with a block (either the
last block for append or the initial block for create). I think we should do the piggyback
in a separate change and use the approach in the current patch (intern config option). Seems
like we should rename it to something more intuitive, like dfs.log.block.allocs.
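
A rough illustration of the saved round trip, assuming a made-up ToyNameNode with a createWithFirstBlock call (neither is the real ClientProtocol API; the sketch only counts calls):

```java
/** Toy model only: ToyNameNode and its methods are hypothetical. */
public class PiggybackCreateSketch {

    /** Stand-in for the NameNode that counts client round trips. */
    static final class ToyNameNode {
        private int roundTrips = 0;
        private long nextBlockId = 1;

        /** Plain create: the client must ask for a block separately. */
        void create(String path) {
            roundTrips++;
        }

        /** Separate allocation call for the next block of the file. */
        long addBlock(String path) {
            roundTrips++;
            return nextBlockId++;
        }

        /** Assumed piggybacked variant: create plus first allocation in one call. */
        long createWithFirstBlock(String path) {
            roundTrips++;
            return nextBlockId++;
        }

        int roundTrips() { return roundTrips; }
    }

    public static void main(String[] args) {
        ToyNameNode separate = new ToyNameNode();
        separate.create("/f");
        separate.addBlock("/f");

        ToyNameNode piggybacked = new ToyNameNode();
        piggybacked.createWithFirstBlock("/f");

        System.out.println("separate calls: " + separate.roundTrips() + " round trips");
        System.out.println("piggybacked: " + piggybacked.roundTrips() + " round trip");
    }
}
```

In this model the writer already holds a block when the create returns, which is the property that might let DataStreamer treat create and append the same way.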

I think we need to log block abandonment as well; otherwise, if the standby becomes active
and doesn't see the block as abandoned, it will be confused as to why DNs are deleting
blocks that it currently thinks are part of the file.
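
The abandonment concern can be sketched with a toy standby that replays logged ops (all names here, such as StandbyView and OpType, are invented for illustration and do not match the real edit log opcodes). If only allocations are logged, the standby's view keeps a block that the active has already given up on:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: these types are hypothetical, not HDFS classes. */
public class AbandonReplaySketch {

    enum OpType { ALLOCATE, ABANDON }

    /** One logged operation against a file's block list. */
    static final class Op {
        final OpType type;
        final String file;
        final long blockId;

        Op(OpType type, String file, long blockId) {
            this.type = type;
            this.file = file;
            this.blockId = blockId;
        }
    }

    /** The standby's view of which blocks belong to which file. */
    static final class StandbyView {
        private final Map<String, List<Long>> blocksByFile = new HashMap<>();

        void apply(Op op) {
            List<Long> blocks =
                    blocksByFile.computeIfAbsent(op.file, f -> new ArrayList<>());
            if (op.type == OpType.ALLOCATE) {
                blocks.add(op.blockId);
            } else {
                // Remove by value, not by index.
                blocks.remove(Long.valueOf(op.blockId));
            }
        }

        List<Long> blocksOf(String file) {
            return blocksByFile.getOrDefault(file, new ArrayList<>());
        }
    }

    /** Replays the given ops in order into a fresh standby view. */
    static StandbyView replay(List<Op> edits) {
        StandbyView view = new StandbyView();
        for (Op op : edits) {
            view.apply(op);
        }
        return view;
    }

    public static void main(String[] args) {
        // The client allocates block 1, abandons it, then allocates block 2.
        List<Op> withAbandon = Arrays.asList(
                new Op(OpType.ALLOCATE, "/f", 1),
                new Op(OpType.ABANDON, "/f", 1),
                new Op(OpType.ALLOCATE, "/f", 2));
        // Same history, but the abandonment was never logged.
        List<Op> withoutAbandon = Arrays.asList(
                new Op(OpType.ALLOCATE, "/f", 1),
                new Op(OpType.ALLOCATE, "/f", 2));

        System.out.println("abandon logged:     " + replay(withAbandon).blocksOf("/f"));
        System.out.println("abandon not logged: " + replay(withoutAbandon).blocksOf("/f"));
    }
}
```

In the second replay the standby still lists block 1 as part of /f, even though the DNs would be deleting it.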

> Log newly allocated blocks
> --------------------------
>
>                 Key: HDFS-1108
>                 URL: https://issues.apache.org/jira/browse/HDFS-1108
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>         Attachments: HDFS-1108.patch, hdfs-1108-habranch.txt
>
>
> The current HDFS design says that newly allocated blocks for a file are not persisted
> in the NN transaction log when the block is allocated. Instead, a hflush() or a close() on
> the file persists the blocks into the transaction log. It would be nice if we can immediately
> persist newly allocated blocks (as soon as they are allocated) for specific files.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
