hadoop-hdfs-issues mailing list archives

From "Anu Engineer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11118) Block Storage for HDFS
Date Fri, 11 Nov 2016 23:30:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15658515#comment-15658515 ]

Anu Engineer commented on HDFS-11118:

[~jpallas] Thanks for taking the time to comment. I really appreciate you voicing your concerns.
I will try to address each of these points in the following sections.
bq. it's cool stuff that is useful, but I just am not convinced that it belongs in the existing
HDFS service

I am not sure if you have seen all the discussions in HDFS-5477 and HDFS-7240. Ozone (HDFS-7240)
borrows from the ideas of HDFS-5477. Storage containers create a block manager that is separate
from the Namenode. The primary rationale for block management as a service is the scaling of HDFS.
This is the same problem that we were facing when building Ozone. Separating block management
from namespace management allows us to build different kinds of namespaces on top of block
management as a service (Storage Containers). So HDFS, Ozone and cBlock are different kinds of
namespaces on the same block service. In fact, due to the block management separation, you will
see that cBlock, Ozone and HDFS itself become much simpler. They now only have to deal with the
namespace.

So what we have is a unified block storage manager and a set of Namespace managers. For any
storage service to work, it must have both name services and block services. Hence the choice
to place this service alongside HDFS/Ozone.
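
As a rough illustration of the layering described above, here is a minimal sketch of one shared block service with two independent namespaces on top of it. This is not code from the Ozone branch or from HDFS; every class and method name below is hypothetical, and the in-memory dictionaries only stand in for Storage Containers.

```python
# Illustrative sketch only: all names here are hypothetical and are not
# HDFS, Ozone, or cBlock APIs.
import itertools


class BlockService:
    """Shared block layer: stores opaque blocks by ID and knows nothing about names."""

    def __init__(self):
        self._blocks = {}
        self._ids = itertools.count(1)

    def put(self, data: bytes) -> int:
        block_id = next(self._ids)
        self._blocks[block_id] = data
        return block_id

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]


class FileNamespace:
    """HDFS-like namespace: maps a file path to an ordered list of block IDs."""

    def __init__(self, blocks: BlockService):
        self._blocks = blocks
        self._files = {}

    def write(self, path: str, chunks) -> None:
        self._files[path] = [self._blocks.put(c) for c in chunks]

    def read(self, path: str) -> bytes:
        return b"".join(self._blocks.get(b) for b in self._files[path])


class VolumeNamespace:
    """cBlock-like namespace: maps (volume, block index) to a block ID."""

    def __init__(self, blocks: BlockService):
        self._blocks = blocks
        self._volumes = {}

    def write_block(self, volume: str, index: int, data: bytes) -> None:
        self._volumes.setdefault(volume, {})[index] = self._blocks.put(data)

    def read_block(self, volume: str, index: int) -> bytes:
        return self._blocks.get(self._volumes[volume][index])


# Both namespaces share one block service and only track their own naming.
shared = BlockService()
files = FileNamespace(shared)
volumes = VolumeNamespace(shared)
files.write("/logs/app.log", [b"hello ", b"world"])
volumes.write_block("vol1", 0, b"\x00" * 4)
```

In this sketch each namespace class holds only name-to-block-ID mappings, which is the sense in which the namespace layers become simpler once block management is factored out.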

Please watch HDFS-10419 if you would like to see how HDFS will evolve to use Storage Containers.

bq. Like Ozone, it increases the complexity of the datanode, and the datanode already has
a history of being, well, rather buggy.

Respectfully, I disagree. Storage Containers allow us to create a datanode that reduces
the memory requirements of the Namenode and is itself simpler. Storage containers
will give us a clean separation of namespace and block space. In fact, the core
thesis is that separating these components will allow us to scale better. If you are interested
in a deeper discussion of this, please let me know and we can take it up in one
of the Ozone JIRAs.

bq. I'm also frankly astonished that there's a feature branch with work being committed already
without any discussion on this proposal, but maybe I just don't have a good understanding
of Hadoop norms.

My apologies if you think I did not wait long enough for the community to comment. It is
not my intention to short-circuit that process at all. In fact, I am eager to hear from the
community. I posted this JIRA on Monday and we are posting these patches to the *Ozone* branch.
It was my intention to share with the community not only the proposal, but also some code that
gives a better understanding of what is being proposed. If you think we should open a new branch
(after due discussion), I am more than willing to do so. Please let me know if that is something
that would address your concern.

> Block Storage for HDFS
> ----------------------
>                 Key: HDFS-11118
>                 URL: https://issues.apache.org/jira/browse/HDFS-11118
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: hdfs
>            Reporter: Anu Engineer
>            Assignee: Anu Engineer
>         Attachments: cblock-proposal.pdf
> This JIRA proposes extending HDFS to provide replicated block storage capabilities using
Storage Containers. This would allow users to run unmodified programs that assume that
they are running on a POSIX file system.
> With this extension, HDFS can be used like a block store. For example, YARN jobs could
mount and use a volume at will. This is made possible by leveraging Storage Containers and
will share the storage layer with Ozone and HDFS in the future.
> Please see the attached design document for more details on this proposal.

