From: Eli Collins
To: common-dev@hadoop.apache.org
Date: Mon, 1 Mar 2010 18:49:57 -0800
Subject: Re: Namespace partitioning using Locality Sensitive Hashing

Hey Brian,

Great points. Agree that federating a set of file systems via symlinks doesn't solve the general problem of scaling a namespace. Imagine GFS' "Name Spaces" was mostly useful for systems that grew w/o much need for rebalancing, eg log storage.

Thanks,
Eli

On Mon, Mar 1, 2010 at 6:31 PM, Brian Bockelman wrote:
>
> Hey Eli,
>
> From past experience, static, manual namespace partitioning can really get you in trouble - you have to manually keep things balanced.
>
> The following things can go wrong:
>
> 1) One of your pesky users grows unexpectedly by a factor of 10.
> 2) Your entire system grew so much that there's not enough excess capacity to split and balance the cluster into new pieces - the extra bandwidth required would drive down production performance too much (or you need downtime to do it and can't afford the downtime).
> 3) Your production system began as a proof of concept, and your file name system makes it hard to split in a sane manner because you never planned on splitting the proof of concept in the first place!
>
> Any one of these can be solved with enough effort, but it can require a huge amount of effort if you don't realize things soon enough! In fact, I seem to remember an ACM Queue article with the original Google authors who cited explosive application growth as one reason that manual balancing quickly fell out of favor.
>
> I wouldn't deny that symlinks are an incredible tool to fight namespace growth - but it's not a 100% solution.
>
> That said, I'm looking forward to symlinks to solve a few local problems!
>
> Brian
>
> On Mar 1, 2010, at 8:15 PM, Eli Collins wrote:
>
>> On Mon, Mar 1, 2010 at 5:42 PM, Ketan Dixit wrote:
>>> Hello,
>>> Thank you Konstantin and Allen for your reply. The information
>>> provided really helped to improve my understanding.
>>> However I still have a few questions.
>>> How are symlinks/soft links used to solve the problem of partitioning?
>>> (Where do the symlinks point to? All the mapping is
>>> stored in memory but symlinks point to file objects? This is a little
>>> confusing to me)
>>> Can you please provide insight into this?
>>
>> The idea is to use symlinks to present a single namespace to clients
>> that is backed by multiple file systems (hdfs or other supported
>> hadoop file systems). Eg a "root" HDFS file system could contain links
>> to other file systems, eg /dir1 could point to S3, /dir2 could point
>> to a local file system, /dir3 could point to another HDFS file system,
>> etc. Clients always contact the "root" HDFS file system but are
>> transparently redirected to other file systems by symlinks. This way a
>> single namespace is partitioned across multiple file systems, but the
>> client only needs to know about the root file system. This
>> partitioning is static (you have to establish the symlinks), though
>> you can grow on the fly by adding file systems and links that point to
>> them.
>>
>> Thanks,
>> Eli
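
To make the redirection described in the quoted message concrete, here is a minimal sketch of how a client-side lookup against a "root" namespace might behave. It is a toy model, not Hadoop's API: the class RootNamespace, its methods add_link and resolve, and all host names and URIs are hypothetical and chosen only for illustration. It assumes links are matched on whole path components and that the longest matching link wins.

```python
from typing import Dict


class RootNamespace:
    """Toy model of a "root" file system whose entries may be symlinks
    to other file systems. Hypothetical; not Hadoop's implementation."""

    def __init__(self, root_uri: str) -> None:
        self.root_uri = root_uri.rstrip("/")
        # Maps a link path in the root namespace to a target URI.
        self.links: Dict[str, str] = {}

    def add_link(self, link_path: str, target_uri: str) -> None:
        """Statically establish a symlink; this is the manual partitioning step."""
        self.links[link_path.rstrip("/")] = target_uri.rstrip("/")

    def resolve(self, path: str) -> str:
        """Return the URI that actually serves `path`."""
        best = ""
        for link in self.links:
            # Match only on whole path components, so /dir1 does not
            # capture /dir10. Prefer the longest matching link.
            matches = path == link or path.startswith(link + "/")
            if matches and len(link) > len(best):
                best = link
        if not best:
            # No symlink on the way: the root file system serves the path.
            return self.root_uri + path
        # Redirect: swap the link prefix for the target file system.
        return self.links[best] + path[len(best):]


ns = RootNamespace("hdfs://root-nn:8020")
ns.add_link("/dir1", "s3://example-bucket")
ns.add_link("/dir2", "file:///mnt/local")
ns.add_link("/dir3", "hdfs://other-nn:8020/data")

for p in ("/dir1/logs/a.txt", "/dir2/x", "/dir3", "/tmp/y"):
    print(p, "->", ns.resolve(p))
```

The client only ever names paths in the root namespace; growing "on the fly" corresponds to another add_link call. Nothing in this sketch rebalances data between targets, which is the limitation raised earlier in the thread.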