hadoop-hdfs-issues mailing list archives

From "Jian Fang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-7240) Object store in HDFS
Date Mon, 06 Jul 2015 17:04:06 GMT

    [ https://issues.apache.org/jira/browse/HDFS-7240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14615318#comment-14615318 ]

Jian Fang commented on HDFS-7240:

Thanks for all your explanations; however, I think you missed my points. Feasibility and performance
are two different concepts. In my own experience with S3 and the S3 native file system, the
most costly operations are listing keys and copying data from one bucket to another
to simulate a rename. The former takes a very long time for a bucket with
millions of objects, and the latter carries a double performance penalty: if your
objects total 1TB, you actually upload almost 2TB of data to S3. That is why fast key listing
and a fast native rename are two of the most desirable features for S3.
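
To make the two costs concrete, here is a minimal sketch in Python. It assumes a toy in-memory store with flat keys; ToyObjectStore and its methods are hypothetical names for illustration and are not the S3 or S3N API. The toy model treats a copy as a full rewrite of the payload and a listing as a scan over every key, which are the costs described above; how a real service implements either is outside the sketch.

```python
class ToyObjectStore:
    """Hypothetical in-memory flat key/value store; not the S3 or S3N API."""

    def __init__(self):
        self.objects = {}        # key -> bytes
        self.bytes_written = 0   # total payload bytes ever written
        self.keys_scanned = 0    # keys examined by list operations

    def put(self, key, data):
        self.objects[key] = data
        self.bytes_written += len(data)

    def list_prefix(self, prefix):
        # Assumption: no directory index, so every key in the bucket is examined.
        matches = []
        for key in sorted(self.objects):
            self.keys_scanned += 1
            if key.startswith(prefix):
                matches.append(key)
        return matches

    def rename_prefix(self, src, dst):
        # Simulated rename: copy each object to its new key, then delete the old one.
        for key in self.list_prefix(src):
            self.put(dst + key[len(src):], self.objects[key])
            del self.objects[key]


store = ToyObjectStore()
for i in range(1000):
    store.put(f"other/{i}", b"x")            # unrelated objects in the same bucket
for i in range(10):
    store.put(f"tmp/part-{i}", b"y" * 100)   # 1000 bytes of job output

before = store.bytes_written
store.rename_prefix("tmp/", "final/")
rename_bytes = store.bytes_written - before  # payload bytes written again by the rename
```

In this toy run the rename writes the 1000 bytes of output a second time and has to examine all 1010 keys in the bucket to find the 10 it needs. A native rename would only update metadata.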

Before you decide to follow the S3N API, I would suggest that you actually test the performance
of S3N and find out what is good and what is bad about it. Why follow the bad parts
at all?

It is still not very clear to me how you guarantee that your partitions are balanced. HBase
uses region auto-split to achieve that, which leads to my other concern: the code and logic will
grow rapidly as your object store matures. In my personal opinion, it is better
to build the object store on top of HDFS and keep HDFS simple.


> Object store in HDFS
> --------------------
>                 Key: HDFS-7240
>                 URL: https://issues.apache.org/jira/browse/HDFS-7240
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>         Attachments: Ozone-architecture-v1.pdf
> This jira proposes to add object store capabilities into HDFS. 
> As part of the federation work (HDFS-1052) we separated block storage as a generic storage
> layer. Using the Block Pool abstraction, new kinds of namespaces can be built on top of the
> storage layer, i.e. datanodes.
> In this jira I will explore building an object store using the datanode storage, but
> independent of namespace metadata.
> I will soon update with a detailed design document.

