hadoop-hdfs-issues mailing list archives

From "Konstantin Shvachko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-8286) Scaling out the namespace using KV store
Date Tue, 05 May 2015 23:19:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-8286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529524#comment-14529524
] 

Konstantin Shvachko commented on HDFS-8286:
-------------------------------------------

Hey guys, I read the design doc and am wondering: _what is the exact goal of this jira?_
From the design and the description it is not quite clear whether you propose to rebase the single
NameNode on LevelDB, by replacing, say, {{FSDirectory}} with the KV store, or target building
a distributed namespace service.
I am asking because I've always been interested in evolving HDFS towards distributing its
namespace in general, and in using KV stores for it in particular. [The Giraffa project|https://github.com/GiraffaFS/giraffa]
has been dedicated to this goal for a few years now, [as most of you are probably well
aware|http://www.slideshare.net/Hadoop_Summit/dynamic-namespace-partitioning-with-giraffa-file-system].

Notes on the design document:
# You probably want _support for a more generic notion of a {{Key}}_.
Your definition of {{key = <parentId, fileName>}} is well understood, and was probably
first introduced around 1995 in treeFS, the predecessor of reiserFS, the predecessor of Btrfs,
with the latter mentioned in your design. It keeps files of the same directory close to each
other (locality).
But larger storage systems may need more flexibility in defining locality, e.g.
two-level keys {{<ppid, pid, file>}} (which add the locality of adjacent
directories), three-level keys, or full-path keys as in Ceph.
Giraffa, for example, introduces a generic Key interface, which allows different implementations, including
the one you describe.
Your design of the KV implementation of snapshots seems to go along these lines.
# _What motivates the choice of LevelDB?_
It is a well-recognized KV storage library, but it is not a distributed KV store. So, what
is the plan here?
In Giraffa the KV store is designed to be pluggable, and we currently use an HBase implementation.
We also considered: levelDB, [mapDB|http://www.mapdb.org], [Redis|https://github.com/GiraffaFS/giraffa/wiki/Redis:-applicability-to-Giraffa],
GemFire aka [Apache incubator Geode|https://wiki.apache.org/incubator/GeodeProposal], [Apache
incubator Ignite|http://ignite.incubator.apache.org], [Prevayler|http://prevayler.org/], among
a few others.
# The HA support paragraph talks about a single active NN and a standby NN. It is not clear
_what is proposed for a distributed namespace, if anything?_
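
To make the key discussion in the first note concrete, here is a minimal sketch of what a generic key could look like. It is not Giraffa's actual interface and not anything from the design doc: the names {{Key}}, {{ParentIdKey}}, {{TwoLevelKey}} and {{scan}} are hypothetical, a {{TreeMap}} stands in for any KV store that keeps keys sorted, and inode ids are assumed to be non-negative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NamespaceKeySketch {

    /** Hypothetical generic key: an implementation only decides how it serializes. */
    public interface Key {
        String encode();
    }

    /** The <parentId, fileName> scheme: files of one directory sort next to each other. */
    public static final class ParentIdKey implements Key {
        private final long parentId;
        private final String fileName;

        public ParentIdKey(long parentId, String fileName) {
            this.parentId = parentId;
            this.fileName = fileName;
        }

        // Fixed-width hex keeps lexicographic order equal to numeric order
        // (assumption: ids are non-negative).
        public static String prefix(long parentId) {
            return String.format("%016x/", parentId);
        }

        @Override
        public String encode() {
            return prefix(parentId) + fileName;
        }
    }

    /** The <ppid, pid, file> scheme: sibling directories also sort next to each other. */
    public static final class TwoLevelKey implements Key {
        private final long grandParentId;
        private final long parentId;
        private final String fileName;

        public TwoLevelKey(long grandParentId, long parentId, String fileName) {
            this.grandParentId = grandParentId;
            this.parentId = parentId;
            this.fileName = fileName;
        }

        public static String prefix(long grandParentId) {
            return String.format("%016x/", grandParentId);
        }

        @Override
        public String encode() {
            return prefix(grandParentId) + String.format("%016x/", parentId) + fileName;
        }
    }

    /** Range scan: collect the values of all keys that start with the given prefix. */
    public static List<String> scan(TreeMap<String, String> store, String prefix) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : store.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // keys are sorted, so the first mismatch ends the range
            }
            out.add(e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // One-level keys: listing directory 1 is a single contiguous scan.
        TreeMap<String, String> byParent = new TreeMap<>();
        byParent.put(new ParentIdKey(2, "z.txt").encode(), "z.txt");
        byParent.put(new ParentIdKey(1, "b.txt").encode(), "b.txt");
        byParent.put(new ParentIdKey(1, "a.txt").encode(), "a.txt");
        System.out.println(scan(byParent, ParentIdKey.prefix(1))); // prints [a.txt, b.txt]

        // Two-level keys: files of directories 10 and 11 (both children of 1) are adjacent.
        TreeMap<String, String> byGrandParent = new TreeMap<>();
        byGrandParent.put(new TwoLevelKey(2, 20, "c.txt").encode(), "c.txt");
        byGrandParent.put(new TwoLevelKey(1, 11, "b.txt").encode(), "b.txt");
        byGrandParent.put(new TwoLevelKey(1, 10, "a.txt").encode(), "a.txt");
        System.out.println(scan(byGrandParent, TwoLevelKey.prefix(1))); // prints [a.txt, b.txt]
    }
}
```

The only point of the sketch is that the store sees opaque sorted keys, so the locality policy lives entirely in the {{Key}} implementation and can be swapped without touching the store.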

So, back to the starting question - what is the main goal for the issue? We may find some
forms of collaboration between the projects.

> Scaling out the namespace using KV store
> ----------------------------------------
>
>                 Key: HDFS-8286
>                 URL: https://issues.apache.org/jira/browse/HDFS-8286
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>         Attachments: hdfs-kv-design.pdf
>
>
> Currently the NN keeps the namespace in memory. To improve the scalability of the
namespace, users can scale up by using more RAM or scale out using Federation (i.e., statically
partitioning the namespace).
> We would like to remove the limitation on scaling the global namespace. Our vision is
that HDFS should adopt a scalable underlying architecture that allows the global namespace
to scale linearly.
> We propose to implement the HDFS namespace on top of a key-value (KV) store. Adopting
the KV store interfaces allows HDFS to leverage the capabilities of modern KV stores and to become
much easier to scale. Going forward, the architecture allows distributing the namespace across
multiple machines, or storing only the working set in memory (HDFS-5389), both of which
allow HDFS to manage billions of files using the commodity hardware available today.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
