hadoop-common-user mailing list archives

From Lance Boomerang <lr...@boomerang.com>
Subject Re: HDFS instead of NFS/NAS/DAS?
Date Fri, 21 Sep 2007 14:54:58 GMT
I have been pondering the same questions, so if you get any good
responses, please share them, either on or off the list.

Thanks and good luck!

Lance

Jonathan Hendler wrote:
> Hi All,
>
> I am a complete newbie to Hadoop, not having tested or installed yet,
> but reading up for about a month now in spare time, and following the
> list. I think it's really exciting to provide this kind of
> infrastructure as open source!
>
> I'll provide context for the subject of this email. Although I've
> seen a thread or two about storing many small files in Hadoop, I'm not
> sure they address the following.
>
> Goal:
>
>    1. Many small files (from 1 MB to 2 GB)
>    2. Automated "fail-safe" redundancy
>    3. Automated synchronization of the redundancy
>    4. Predictable read/write speed for these files (in part or whole)
>       as load and server count increase
>
> The middleware having access to the files could be used, among other
> things, to:
>
>    1. track "where the files are", and their states
>    2. sync differences 
>
> My thinking is that by splitting parts of these files, even if small,
> across a number of machines, CRUD will be faster than NFS, as well as
> "safer". Also, I'm thinking that using HDFS would be cheaper than DAS
> and more feature-rich than NAS [1]. Also, it wouldn't matter "where" the
> files were in HDFS, which would simplify the middleware. I also read
> that DHTs generally don't have intelligent load balancing, making
> HDFS-type schemes more consistent.
>
> Since Hadoop is primarily designed to move the computation to where the
> data is, does it make sense to use HDFS in this way?[2] 
>
> - Jonathan
>
> [1] - http://en.wikipedia.org/wiki/Network-attached_storage#Drawbacks
> [2] - (assuming the memory limit in the master isn't reached because of
> a large number of files/blocks)
>
>
>
>   
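Footnote [2] in the quoted message raises the master's memory limit. A rough way to reason about it is to count the file and block objects the master would have to track. The sketch below is only a back-of-the-envelope estimate under stated assumptions, not a measurement: the 64 MB block size and the figure of about 150 bytes of master heap per file or block object are assumed rule-of-thumb values, and the function names are hypothetical.

```python
import math

# Assumptions (not taken from the thread): a 64 MB block size and a
# rule-of-thumb cost of ~150 bytes of master (NameNode) heap per object.
BLOCK_SIZE = 64 * 1024**2
BYTES_PER_OBJECT = 150


def blocks_for_file(size_bytes, block_size=BLOCK_SIZE):
    """Number of blocks a file of this size occupies (at least one)."""
    return max(1, math.ceil(size_bytes / block_size))


def master_heap_estimate(file_sizes, bytes_per_object=BYTES_PER_OBJECT):
    """Rough heap estimate: one object per file plus one per block."""
    objects = sum(1 + blocks_for_file(size) for size in file_sizes)
    return objects * bytes_per_object


if __name__ == "__main__":
    one_mb = 1024**2
    two_gb = 2 * 1024**3
    # A 1 MB file still costs a file object and a block object.
    print(blocks_for_file(one_mb), "block for a 1 MB file")
    print(blocks_for_file(two_gb), "blocks for a 2 GB file")
    # One million files at the small end of the stated range.
    estimate = master_heap_estimate([one_mb] * 1_000_000)
    print(f"{estimate / 1e6:.0f} MB of heap for one million 1 MB files")
```

Under these assumptions the estimate depends on the number of files and blocks rather than on the total bytes stored, so files near the 1 MB end of the range cost the master about as much per file as much larger ones do.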


-- 
"CSI Cardiff, I'd like to see that. They'd be measuring the velocity of a kebab!"


