hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7240) Object store in HDFS
Date Fri, 17 Nov 2017 23:13:02 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16257730#comment-16257730 ]

Andrew Wang commented on HDFS-7240:
-----------------------------------

Some Hortonworkers and Clouderans met yesterday; here are my meeting notes. I wanted to get
them up before the broader meeting today. I already sent these around to the attendees, but
please comment if I got anything wrong.

Attendees: ATM, Andrew, Anu, Aaron Fabbri, Jitendra, Sanjay, other listeners on the phone

High-level questions raised:

* Wouldn't Ozone be better off as a separate project?
* Why should it be merged now?

Things we agree on:

* We're all on Team Ozone, and applaud any effort to address scaling HDFS.
* There are benefits to Ozone being a separate project. Can release faster, iterate more quickly
on feedback, and mature without having to worry about features like high-availability, security,
encryption, etc. that not all customers need.
    * No agreement on whether the benefits of separation outweigh the downsides.

Discussion:

* Anu: Don't want to have this separate since it confuses people about the long-term vision
of Ozone. It's intended as block management for HDFS.
    * Andrew: In its current state, Ozone cannot be plugged into the NN as the BM layer, so
it seems premature to merge. Can't benefit existing users, and they can't test it.
    * Response: The Ozone block layer is at a good integration point, and we want to move
onto the NameNode changes like splitting the FSN/BM lock.
    * Andrew: We can do the FSN/BM lock split without merging Ozone. Separate efforts. This
lock split is also a major effort by itself, and is a dangerous change. It's something that
should be baked in production.
* Sanjay: Ozone developers "willing to take the hit" of the slow Hadoop release cadence. Want
to make this part of HDFS since it's easier for users to test and consume without installing
a new cluster.
    * ATM: Can still share the same hardware, and run the Ozone daemons alongside.
* Sanjay: Want to keep Ozone block management inside the Datanode process to enable a fast-copy
between HDFS and Ozone. Not all data needs all the HDFS features like encryption, erasure
coding, etc, and this data could be stored in Ozone.
    * Andrew: This fast-copy hasn't been implemented or discussed yet. Unclear if it'll work
at all with existing HDFS block management. Won't work with encryption or erasure coding.
Not clear whether it requires being in the same DN process even.
* Sanjay/Anu: Ozone is also useful to test with just the key-value interface. It's a Hadoop-compatible
FileSystem, so apps that work on S3 will work on Ozone too.
    * Andrew: If it provides a new API and doesn't support the HDFS feature-set, doesn't this
support it being its own project?

Summary

* No consensus on the high-level questions raised
* Ozone could be its own project and integrated later, or remain on an HDFS branch
* Without the FSN/BM lock split, it can't serve as the block management layer for HDFS
* Without fast copy, there's no need for Ozone to be part of the DataNode process, and fast copy
might not require the same process anyway.

> Object store in HDFS
> --------------------
>
>                 Key: HDFS-7240
>                 URL: https://issues.apache.org/jira/browse/HDFS-7240
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>         Attachments: HDFS Scalability and Ozone.pdf, HDFS-7240.001.patch, HDFS-7240.002.patch,
HDFS-7240.003.patch, HDFS-7240.003.patch, HDFS-7240.004.patch, HDFS-7240.005.patch, HDFS-7240.006.patch,
MeetingMinutes.pdf, Ozone-architecture-v1.pdf, Ozonedesignupdate.pdf, ozone_user_v0.pdf
>
>
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a generic storage
layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the
storage layer i.e. datanodes.
> In this jira I will explore building an object store using the datanode storage, but
independent of namespace metadata.
> I will soon update with a detailed design document.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-help@hadoop.apache.org

