hadoop-hdfs-issues mailing list archives

From "Rakesh R (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (HDFS-12090) Handling writes from HDFS to Provided storages
Date Tue, 18 Jul 2017 15:41:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16091683#comment-16091683 ]

Rakesh R edited comment on HDFS-12090 at 7/18/17 3:40 PM:
----------------------------------------------------------

Thanks [~virajith] for the detailed explanation. Sorry for the late reply.

# Agreed. My only concern is that someone may invoke #satisfyStoragePolicy without defining the
mount point; we should have logic to gracefully handle that via fallback storages. How about
exposing a hybrid API, which would do the mounting as well as internally set the storage policy
to PROVIDED?
# bq. In these cases, I think we would have to change the write pipeline to not choose a PROVIDED
location but choose one of the fallbacks for it
OK, got it. Perhaps we can discuss this part further during the coding phase.
# If the {{-createMountOnly}} flag is specified with the mount command, the MountTask is not
created. In that case, the administrator has to back up the metadata, right?
# OK
# OK, makes sense.
# OK
# Yes, I'd prefer to expose a flag to change the PROVIDED policy back to the default storage policy.
# Great!
# Yes. Since Hadoop 3.0.0 has EC, many users might be using EC to save space (COLD storage
policy). I hope they would be interested in moving EC files to an external store with this feature.
# Adding one more case that comes to mind about the mount point. IIUC, the PROVIDED storage
policy differs from the existing storage policies in the way it binds to the respective volume.
Existing storage policies use the volume storage type from the configuration {{dfs.datanode.data.dir}},
whereas the PROVIDED policy depends on the dynamic mount point. Could you please share the details
of the DataNode's PROVIDED storage volume configuration? Also, how will the dynamic mount point
(remote path) info be passed to the provided volume?
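
To make item 10 concrete, here is a minimal sketch of how a DataNode PROVIDED volume might be
configured, assuming the {{[STORAGETYPE]}} prefix convention that {{dfs.datanode.data.dir}} already
uses for DISK/SSD/ARCHIVE. The {{[PROVIDED]}} tag, the {{dfs.namenode.provided.enabled}} property
name, and the remote URI are assumptions based on the HDFS-9806 line of work and may not match the
final patch:

{code:xml}
<!-- DataNode: one local DISK volume plus one volume tagged PROVIDED.
     The PROVIDED entry and remote path are illustrative placeholders. -->
<property>
  <name>dfs.datanode.data.dir</name>
  <value>[DISK]file:///data/dn1,[PROVIDED]hdfs://remote-store/backingDir</value>
</property>

<!-- NameNode: enable handling of PROVIDED storage (property name assumed). -->
<property>
  <name>dfs.namenode.provided.enabled</name>
  <value>true</value>
</property>
{code}

If the dynamic mount point is supplied at mount time rather than via static configuration, the
{{[PROVIDED]}} entry above would presumably carry only the storage type, with the remote path
passed down by the mount command itself.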




> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>         Attachments: HDFS-12090-design.001.pdf
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external
storage systems accessible through HDFS. However, HDFS-9806 is limited to data being read
through HDFS. This JIRA will deal with how data can be written to such {{PROVIDED}} storages
from HDFS.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

