Mailing-List: contact common-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: common-dev@hadoop.apache.org
Received-SPF: neutral (nike.apache.org: local policy)
MIME-Version: 1.0
In-Reply-To: <363728BA-B166-4FAC-A376-AC8FE0C5780F@cse.unl.edu>
References: <68432d881003010848m754a03c8vd279e7c9a90890c5@mail.gmail.com>
	 <4B8C2312.9010905@yahoo-inc.com>
	 <68432d881003011742y27916636y17d1d507ec5e70c7@mail.gmail.com>
	 <dfe484f01003011815r464fbb61re9bdb4c4b1a2954c@mail.gmail.com>
	 <363728BA-B166-4FAC-A376-AC8FE0C5780F@cse.unl.edu>
Date: Mon, 1 Mar 2010 18:49:57 -0800
Message-ID: <dfe484f01003011849t4b7fa59bqdb237f0b66556041@mail.gmail.com>
Subject: Re: Namespace partitioning using Locality Sensitive Hashing
From: Eli Collins <eli@cloudera.com>
To: common-dev@hadoop.apache.org
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Hey Brian,

Great points. Agree that federating a set of file systems via symlinks
doesn't solve the general problem of scaling a namespace.
I imagine GFS's "Name Spaces" were mostly useful for systems that grew
w/o much need for rebalancing, e.g. log storage.

Thanks,
Eli

On Mon, Mar 1, 2010 at 6:31 PM, Brian Bockelman <bbockelm@cse.unl.edu> wrote:
>
> Hey Eli,
>
> From past experience, static, manual namespace partitioning can really get you in trouble - you have to manually keep things balanced.
>
> The following things can go wrong:
>
> 1) One of your pesky users grows unexpectedly by a factor of 10.
> 2) Your entire system grew so much that there's not enough excess capacity to split and balance the cluster into new pieces - the extra bandwidth required would drive down production performance too much (or you need downtime to do it and can't afford the downtime).
> 3) Your production system began as a proof of concept, and your file naming scheme makes it hard to split in a sane manner because you never planned on splitting the proof of concept in the first place!
>
> Any one of these can be solved with enough effort, but it can require a huge amount of effort if you don't realize things soon enough! In fact, I seem to remember an ACM Queue article with the original Google authors who cited explosive application growth as one reason that manual balancing quickly fell out of favor.
>
> I wouldn't deny that symlinks are an incredible tool to fight namespace growth - but they're not a 100% solution.
>
> That said, I'm looking forward to symlinks to solve a few local problems!
>
> Brian
>
> On Mar 1, 2010, at 8:15 PM, Eli Collins wrote:
>
>> On Mon, Mar 1, 2010 at 5:42 PM, Ketan Dixit <ketan.dixit@gmail.com> wrote:
>>> Hello,
>>> Thank you Konstantin and Allen for your replies. The information
>>> provided really helped to improve my understanding.
>>> However, I still have a few questions.
>>> How are symlinks (soft links) used to solve the problem of partitioning?
>>> (Where do the symlinks point to? All the mapping is
>>> stored in memory, but symlinks point to file objects? This is a little
>>> confusing to me.)
>>> Can you please provide some insight into this?
>>
>> The idea is to use symlinks to present a single namespace to clients
>> that is backed by multiple file systems (HDFS or other supported
>> Hadoop file systems). E.g. a "root" HDFS file system could contain links
>> to other file systems: /dir1 could point to S3, /dir2 could point
>> to a local file system, /dir3 could point to another HDFS file system,
>> etc. Clients always contact the "root" HDFS file system but are
>> transparently redirected to other file systems by symlinks. This way a
>> single namespace is partitioned across multiple file systems, but the
>> client only needs to know about the root file system. This
>> partitioning is static (you have to establish the symlinks), though
>> you can grow on the fly by adding file systems and links that point to
>> them.
>>
>> Thanks,
>> Eli
>
>
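
For anyone following the thread, the symlink scheme Eli describes above
(a "root" file system whose top-level directories link out to other file
systems) can be sketched as a toy path resolver. This is only an
illustration of the idea, not Hadoop code: the link table, the URIs, and
the names resolve() and add_link() are all made up for the example, and
real symlink resolution involves the client and NameNode rather than a
dictionary lookup.

```python
from typing import Dict

# Hypothetical link table held by the "root" file system: each entry is a
# symlink from a directory in the root namespace to another file system.
# All URIs here are invented examples, not real clusters or buckets.
ROOT_FS = "hdfs://root-namenode:8020"
ROOT_LINKS: Dict[str, str] = {
    "/dir1": "s3://example-bucket/data",
    "/dir2": "file:///mnt/local",
    "/dir3": "hdfs://other-namenode:8020/",
}


def resolve(path: str, links: Dict[str, str] = ROOT_LINKS) -> str:
    """Return the URI a client would be redirected to for `path`.

    Picks the longest link that is a whole-component prefix of `path`;
    paths that fall under no link stay on the root file system.
    """
    best = ""
    for link in links:
        is_prefix = path == link or path.startswith(link + "/")
        if is_prefix and len(link) > len(best):
            best = link
    if not best:
        return ROOT_FS + path
    remainder = path[len(best):]
    return links[best].rstrip("/") + remainder


def add_link(links: Dict[str, str], link: str, target: str) -> None:
    """Grow the namespace on the fly by linking in another file system."""
    links[link] = target
```

The partitioning is static in exactly the sense discussed in the thread:
resolve() only follows whatever links an administrator has set up, so
nothing here rebalances data when one subtree outgrows its backing file
system. Growing means calling something like add_link() by hand.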