hadoop-hdfs-issues mailing list archives

From "Virajith Jalaparti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11639) [READ] Encode the BlockAlias in the client protocol
Date Mon, 15 May 2017 18:41:04 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011091#comment-16011091
] 

Virajith Jalaparti commented on HDFS-11639:
-------------------------------------------

bq. If this entails a protocol change, I think it makes the most sense to do it at this point
so all the protocol changes happen up front in one change if we need this to get in for 3.0.

Agreed.

bq. Does it make sense to have the BlockAlias in transferBlock? If we know the targetStorageTypes
and targetStorageIDs then we can know that nothing needs to be transferred. Or is this an
issue if we want to transfer from PROVIDED to DISK?

We would need a non-null {{BlockAlias}} in {{transferBlock}} whenever {{transferBlock}} is
called for a PROVIDED replica. This happens when (a) a data write pipeline fails midway
and a new datanode is added for the PROVIDED replica, and (b) a PROVIDED replica has to be
created from a Finalized (local) replica.
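
As a rough illustration of the rule above, here is a minimal sketch of the check. It assumes
that both cases (a) and (b) show up as a PROVIDED entry in the target storage types, which is
my reading and not something the patch confirms. {{SketchStorageType}} and
{{TransferBlockCheck}} are hypothetical names used only for this example; they are not part of
HDFS or of the attached patches.

```java
/** Hypothetical stand-in for the storage types relevant to this discussion. */
enum SketchStorageType { DISK, SSD, ARCHIVE, PROVIDED }

/** Hypothetical helper; not part of the HDFS code base or the attached patches. */
final class TransferBlockCheck {
  private TransferBlockCheck() {}

  /** Returns true when at least one target of the transfer is a PROVIDED replica. */
  static boolean needsBlockAlias(SketchStorageType[] targetStorageTypes) {
    for (SketchStorageType type : targetStorageTypes) {
      if (type == SketchStorageType.PROVIDED) {
        return true;
      }
    }
    return false;
  }

  /**
   * Rejects a transfer that targets a PROVIDED replica without an alias.
   * The alias is typed as Object because its real type is not the point here.
   */
  static void validate(SketchStorageType[] targetStorageTypes, Object blockAlias) {
    if (needsBlockAlias(targetStorageTypes) && blockAlias == null) {
      throw new IllegalArgumentException(
          "transferBlock to a PROVIDED target requires a non-null BlockAlias");
    }
  }
}
```

Under this assumption, a transfer whose targets are all DISK passes with a null alias, while
any transfer with a PROVIDED target is rejected unless an alias is supplied.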

bq. With the pending refactoring of the FsDatasetImpl which won't have replicas a priori,
I wonder if it makes sense for the Datanode to have a FileRegionProvider or BlockProvider
at all. They are given the appropriate block ID and block alias in the readBlock or writeBlock
message. Maybe I'm overlooking what's still being provided.

I was trying to reconcile the existing design, in which FsDatasetImpl knows about provided
blocks a priori, with the new design, in which FsDatasetImpl does not know about them in
advance and constructs them on the fly using the {{BlockAlias}} from {{readBlock}} or
{{writeBlock}}. Using {{BlockProvider#resolve()}} allows both designs to exist in parallel.
I was wondering whether we should still retain the former given the latter.
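
To make the "both designs in parallel" idea concrete, here is a minimal sketch. The actual
signature of {{BlockProvider#resolve()}} is not shown in this thread, so the two-argument
{{resolve}} below is an assumption, and {{SketchBlockAlias}} and {{SketchBlockProvider}} are
hypothetical names. The alias fields (URI, offset, length, nonce) follow the issue description.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical alias: where a block's bytes live in the external store. */
final class SketchBlockAlias {
  final URI uri;
  final long offset;
  final long length;
  final byte[] nonce;

  SketchBlockAlias(URI uri, long offset, long length, byte[] nonce) {
    this.uri = uri;
    this.offset = offset;
    this.length = length;
    this.nonce = nonce;
  }
}

/** Hypothetical provider that supports both designs side by side. */
final class SketchBlockProvider {
  // Existing design: aliases the datanode already knows about a priori.
  private final Map<Long, SketchBlockAlias> knownAPriori = new HashMap<>();

  void register(long blockId, SketchBlockAlias alias) {
    knownAPriori.put(blockId, alias);
  }

  /**
   * New design first: use the alias carried in the readBlock/writeBlock
   * message when present. Otherwise fall back to the a priori mapping.
   */
  Optional<SketchBlockAlias> resolve(long blockId, SketchBlockAlias aliasFromRequest) {
    if (aliasFromRequest != null) {
      return Optional.of(aliasFromRequest);
    }
    return Optional.ofNullable(knownAPriori.get(blockId));
  }
}
```

If the a priori mapping is dropped, {{resolve}} in this sketch reduces to returning the alias
from the request, which is one way to frame the question of whether the provider is still
needed.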



> [READ] Encode the BlockAlias in the client protocol
> ---------------------------------------------------
>
>                 Key: HDFS-11639
>                 URL: https://issues.apache.org/jira/browse/HDFS-11639
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs
>            Reporter: Ewan Higgs
>            Assignee: Ewan Higgs
>         Attachments: HDFS-11639-HDFS-9806.001.patch, HDFS-11639-HDFS-9806.002.patch
>
>
> As part of the {{PROVIDED}} storage type, we have a {{BlockAlias}} type which encodes
> information about where the data comes from, i.e., URI, offset, length, and nonce value. This
> data should be encoded in the protocol ({{LocatedBlockProto}} and the {{BlockTokenIdentifier}})
> when a block is available using the PROVIDED storage type.


