hadoop-hdfs-issues mailing list archives

From "Jing Zhao (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8999) Namenode need not wait for {{blockReceived}} for the last block before completing a file.
Date Tue, 29 Dec 2015 23:13:49 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15074407#comment-15074407 ]

Jing Zhao commented on HDFS-8999:
---------------------------------

bq. Here you are optimizing completeBlock(), not IBR, which I don't know how substantial it
is, if at all.

bq. I suggest that we change the code to allow closing a file even if the last block is in
COMMITTED state in this JIRA and then change DN to send IBRs in batches in another JIRA.

I think Nicholas's comment summarizes our final goal: to decrease the total number of
IBRs (incremental block reports). Currently, writing a block with 3 replicas can easily
generate more than 6 RPCs, which greatly limits the scalability of HDFS when handling
small files. We should explore whether we can batch IBRs and let the DN send them
periodically. The first step will be breaking the dependency between {{completeFile}}
and IBRs.
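
To make the batching idea concrete, here is a minimal Java sketch, not actual HDFS code. The class {{IbrBatcherSketch}}, its methods, and the status strings are hypothetical names chosen for illustration. It assumes the DN queues each block notification and a periodic timer calls {{flush()}}, so that many notifications share one RPC.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of DN-side IBR batching; names are illustrative, not HDFS APIs. */
final class IbrBatcherSketch {

    /** One pending notification for a block, e.g. "RECEIVING" or "RECEIVED". */
    static final class BlockEvent {
        final long blockId;
        final String status;

        BlockEvent(long blockId, String status) {
            this.blockId = blockId;
            this.status = status;
        }
    }

    private final List<BlockEvent> pending = new ArrayList<>();
    private int rpcsSent = 0;

    /** Queue the notification instead of sending it to the NN immediately. */
    void enqueue(long blockId, String status) {
        pending.add(new BlockEvent(blockId, status));
    }

    /** Assumed to run on a timer: send everything queued so far as a single RPC. */
    List<BlockEvent> flush() {
        if (pending.isEmpty()) {
            return Collections.emptyList(); // nothing queued, so no RPC is made
        }
        List<BlockEvent> batch = new ArrayList<>(pending);
        pending.clear();
        rpcsSent++; // stands in for one RPC to the NameNode
        return batch;
    }

    int rpcsSent() {
        return rpcsSent;
    }

    public static void main(String[] args) {
        IbrBatcherSketch batcher = new IbrBatcherSketch();
        // Three small blocks written through this DN, two notifications each.
        for (long id = 1; id <= 3; id++) {
            batcher.enqueue(id, "RECEIVING");
            batcher.enqueue(id, "RECEIVED");
        }
        List<BlockEvent> batch = batcher.flush();
        System.out.println(batch.size() + " notifications in " + batcher.rpcsSent() + " RPC(s)");
    }
}
```

Batching like this delays the {{blockReceived}} notification, which is why the dependency between {{completeFile}} and IBR has to be removed first.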

> Namenode need not wait for {{blockReceived}} for the last block before completing a file.
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-8999
>                 URL: https://issues.apache.org/jira/browse/HDFS-8999
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: namenode
>            Reporter: Jitendra Nath Pandey
>            Assignee: Tsz Wo Nicholas Sze
>         Attachments: h8999_20151228.patch
>
>
> This comes out of a discussion in HDFS-8763. Pasting [~jingzhao]'s comment from the jira:
> {quote}
> ...whether we need to let NameNode wait for all the block_received msgs to announce the
> replica is safe. Looking into the code, now we have
>    # NameNode knows the DataNodes involved when initially setting up the writing pipeline
>    # If any DataNode fails during the writing, client bumps the GS and finally reports
>      all the DataNodes included in the new pipeline to NameNode through the updatePipeline RPC.
>    # When the client received the ack for the last packet of the block (and before the
>      client tries to close the file on NameNode), the replica has been finalized in all the DataNodes.
> Then in this case, when NameNode receives the close request from the client, the NameNode
> already knows the latest replicas for the block. Currently the checkReplication call only
> counts in all the replicas that NN has already received the block_received msg, but based
> on the above #2 and #3, it may be safe to also count in all the replicas in the
> BlockUnderConstructionFeature#replicas?
> {quote}
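
The question raised in the quote above can be illustrated with a simplified Java sketch. This is an assumption-laden model, not the NameNode implementation: {{CompleteFileSketch}}, {{LastBlock}}, {{canCloseStrict}}, and {{canCloseRelaxed}} are hypothetical names, and the real replication check involves more state than is shown here. The strict check counts only replicas for which block_received has arrived; the relaxed check also counts the pipeline DataNodes the NameNode already knows about.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical model of the close-time replication check; not NameNode code. */
final class CompleteFileSketch {

    enum BlockState { UNDER_CONSTRUCTION, COMMITTED, COMPLETE }

    /** Simplified view of the last block of a file being closed. */
    static final class LastBlock {
        final BlockState state;
        /** Pipeline DataNodes known from pipeline setup or updatePipeline. */
        final Set<String> expectedLocations;
        /** DataNodes whose block_received message has already arrived. */
        final Set<String> reportedLocations;

        LastBlock(BlockState state, Collection<String> expected, Collection<String> reported) {
            this.state = state;
            this.expectedLocations = new HashSet<>(expected);
            this.reportedLocations = new HashSet<>(reported);
        }
    }

    /** Behavior described as current: only reported replicas are counted. */
    static boolean canCloseStrict(LastBlock b, int minReplication) {
        return b.state != BlockState.UNDER_CONSTRUCTION
                && b.reportedLocations.size() >= minReplication;
    }

    /** Proposed relaxation: also count the expected pipeline replicas. */
    static boolean canCloseRelaxed(LastBlock b, int minReplication) {
        if (b.state == BlockState.UNDER_CONSTRUCTION) {
            return false; // the client has not committed the block yet
        }
        Set<String> usable = new HashSet<>(b.reportedLocations);
        usable.addAll(b.expectedLocations);
        return usable.size() >= minReplication;
    }

    public static void main(String[] args) {
        // COMMITTED block, three pipeline DNs, no block_received seen yet.
        LastBlock block = new LastBlock(BlockState.COMMITTED,
                java.util.Arrays.asList("dn1", "dn2", "dn3"),
                java.util.Collections.<String>emptyList());
        System.out.println("strict: " + canCloseStrict(block, 1));
        System.out.println("relaxed: " + canCloseRelaxed(block, 1));
    }
}
```

In this model, a COMMITTED block with a known pipeline can be closed before any block_received arrives, which is the behavior the issue title asks for.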



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
