hadoop-hdfs-issues mailing list archives

From "Virajith Jalaparti (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-12090) Handling writes from HDFS to Provided storages
Date Mon, 24 Jul 2017 22:06:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-12090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16099178#comment-16099178 ]

Virajith Jalaparti commented on HDFS-12090:
-------------------------------------------

Hi [~rakeshr], sorry about the delayed response!

bq.  it looks to me that user has to set the PROVIDED storage policy explicitly.
This is the case only if {{-createMountOnly}} is specified. If not, the policy is set
automatically on the mount point, and the data moves are initiated by the Namenode (using SPS).
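For concreteness, a rough sketch of the explicit path a user would follow when
{{-createMountOnly}} is specified (assuming the {{PROVIDED}} policy name from the design doc
and the SPS client API from HDFS-10285; the mount path below is made up):

{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

public class SetProvidedPolicyExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumes fs.defaultFS points at the Namenode that owns the mount.
    DistributedFileSystem dfs = (DistributedFileSystem) FileSystem.get(conf);

    // Hypothetical mount point created with -createMountOnly.
    Path mountPoint = new Path("/mounts/backup");

    // Explicitly set the PROVIDED storage policy on the mount point.
    dfs.setStoragePolicy(mountPoint, "PROVIDED");

    // Ask the Storage Policy Satisfier (HDFS-10285) to schedule the block
    // moves needed to satisfy the new policy.
    dfs.satisfyStoragePolicy(mountPoint);
  }
}
{code}

Without {{-createMountOnly}}, both steps happen on the Namenode side as part of the mount.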

bq. I thought of passing another optional argument -storagePolicy to the mount cmd and user
get the chance to pass the desired policies
That's a good idea. We didn't really think about different types of {{PROVIDED}} policies
(e.g., as you mentioned, {{DISK:2, PROVIDED:1}} or {{SSD:1, PROVIDED:1}}), but this makes
sense. We can add it in.
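To make the intended semantics concrete, a sketch of what such policies could look like using
the internal {{BlockStoragePolicy}} class (the ids and names below are made up, and real
policies would be registered through {{BlockStoragePolicySuite}}; {{StorageType.PROVIDED}} is
the type added by HDFS-9806):

{code:java}
import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.hdfs.protocol.BlockStoragePolicy;

public class ProvidedPolicyExamples {
  // "DISK:2, PROVIDED:1": the first two replicas stay on local DISK and the
  // remaining replica(s) go to the PROVIDED store.
  static final BlockStoragePolicy DISK_2_PROVIDED_1 = new BlockStoragePolicy(
      (byte) 20,                        // hypothetical, unused policy id
      "DISK_2_PROVIDED_1",              // hypothetical policy name
      new StorageType[] {StorageType.DISK, StorageType.DISK, StorageType.PROVIDED},
      new StorageType[] {StorageType.DISK},   // creation fallback
      new StorageType[] {StorageType.DISK});  // replication fallback

  // "SSD:1, PROVIDED:1": one replica on SSD, the rest on the PROVIDED store
  // (with replication 2 this gives exactly SSD:1, PROVIDED:1).
  static final BlockStoragePolicy SSD_1_PROVIDED_1 = new BlockStoragePolicy(
      (byte) 21,
      "SSD_1_PROVIDED_1",
      new StorageType[] {StorageType.SSD, StorageType.PROVIDED},
      new StorageType[] {StorageType.SSD},
      new StorageType[] {StorageType.SSD});
}
{code}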

bq. So, this requires user intervention to configure the volume details and reload data volume,
right?
Not necessarily. Once the mount is set up on the Namenode, it can instruct the Datanodes to
load the volume required for the mount. However, we would need to know which volume should
be mounted (this can be specified by a configuration parameter or as part of the mount command),
and which Datanodes should take part in this process.
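If we go with a configuration parameter, it could look something along these lines (every key
name here is hypothetical and only meant to show what information would need to be specified
per mount):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class MountVolumeConfigExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Which remote volume backs the mount point (hypothetical key):
    conf.set("dfs.provided.mount./backup.volume", "s3a://backup-bucket/data/");
    // Which Datanodes should load this volume and serve the mount (hypothetical key):
    conf.set("dfs.provided.mount./backup.datanodes", "dn1.example.com,dn2.example.com");
  }
}
{code}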

bq. Secondly, are you saying that user mount Vs volume is one-to-one mapping(I meant, for
each mount point admin need to define a unique volume)?. IMHO, this can be one-to-many mapping.
I have been thinking about this as a 1-1 mapping. So, each mount point will have a different
volume (on the Datanodes). This makes it easier to manage things like credentials for accessing
the remote store, since different mount points can belong to different remote storage accounts.
In a one-to-many mapping, these would have to be explicitly managed within the volume. Do
you have any particular use-case/scenario in mind where a one-to-many mapping might be
better/more performant?
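To illustrate why the 1-1 mapping keeps credential management simple, a toy sketch of the
bookkeeping it implies (all class and field names here are made up):

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;

class ProvidedMountRegistry {

  static class VolumeInfo {
    final String remoteBasePath;   // e.g. "s3a://backup-bucket/data/"
    final Configuration credConf;  // credentials scoped to this one remote account

    VolumeInfo(String remoteBasePath, Configuration credConf) {
      this.remoteBasePath = remoteBasePath;
      this.credConf = credConf;
    }
  }

  // One volume per mount point: credentials live with the volume, so mounts
  // backed by different storage accounts never share a credential context.
  // A one-to-many mapping would instead need per-mount credential handling
  // inside a single shared volume.
  private final Map<String, VolumeInfo> mountToVolume = new HashMap<>();

  void addMount(String mountPoint, VolumeInfo volume) {
    mountToVolume.put(mountPoint, volume);
  }
}
{code}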


> Handling writes from HDFS to Provided storages
> ----------------------------------------------
>
>                 Key: HDFS-12090
>                 URL: https://issues.apache.org/jira/browse/HDFS-12090
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Virajith Jalaparti
>         Attachments: HDFS-12090-design.001.pdf
>
>
> HDFS-9806 introduces the concept of {{PROVIDED}} storage, which makes data in external
> storage systems accessible through HDFS. However, HDFS-9806 is limited to data being read
> through HDFS. This JIRA will deal with how data can be written to such {{PROVIDED}} storages
> from HDFS.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

