hadoop-hdfs-user mailing list archives

From Gerrit Jansen van Vuuren <gerrit...@googlemail.com>
Subject Re: HDFS without Hadoop: Why?
Date Wed, 26 Jan 2011 15:26:41 GMT
The smallest unit of allocation in HDFS is not the block size. The block size is
an upper limit: if you store smaller files they will not take up extra space.
HDFS is not meant for fast random access; it is built specifically for large
files and sequential access.
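
A minimal sketch of checking this through the FileSystem API (the path and the
printed numbers are made up for illustration; exact defaults vary by version):

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FileStatus;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class BlockSizeCheck {
    public static void main(String[] args) throws Exception {
      FileSystem fs = FileSystem.get(new Configuration());
      // Hypothetical small file already stored in HDFS.
      FileStatus st = fs.getFileStatus(new Path("/tmp/small.txt"));
      // The block size is per-file metadata and only an upper bound per block...
      System.out.println("block size:  " + st.getBlockSize()); // e.g. 67108864
      // ...while the space the file actually occupies is just its length
      // (times the replication factor).
      System.out.println("file length: " + st.getLen());       // e.g. 1234
    }
  }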


On Wed, Jan 26, 2011 at 9:59 AM, Gerrit Jansen van Vuuren <
gerritjvv@googlemail.com> wrote:

> Hi,
>
> For true data durability RAID is not enough.
> The conditions I operate on are the following:
>
> (1) Data loss is not acceptable under any terms
> (2) Data unavailability is not acceptable under any terms for any period of
> time.
> (3) Data loss for certain data sets becomes a legal issue and is again not
> acceptable, and might lead to the loss of my employment.
> (4) Having 2 nodes fail in a month, on average, is to be expected at the
> volumes we operate, i.e. 100 to 400 nodes per cluster.
> (5) Having a data centre outage once a year is to be expected. (We've
> already had one this year)
>
> A word on node failure: nodes do not just fail because of disks; any
> component can fail, e.g. RAM, network card, SCSI controller, CPU, etc.
>
> Now data loss or unavailability can happen under the following conditions:
> (1) Multiple or single disk failures
> (2) Node failure (a whole U goes down)
> (3) Rack failure
> (4) Data Centre failure
>
> RAID covers (1), but I do not know of any RAID setup that will cover the
> rest.
> HDFS with 3-way replication covers (1), (2) and (3), but not (4).
> HDFS 3-way replication combined with replication across data centres (via
> distcp) covers (1)-(4).
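>
> As a rough sketch of what that looks like in practice (the hostnames and
> paths below are made up; dfs.replication is the standard property, and the
> cross data centre copy is normally a scheduled distcp job):
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.fs.FileSystem;
>   import org.apache.hadoop.fs.Path;
>
>   public class ReplicationSketch {
>     public static void main(String[] args) throws Exception {
>       Configuration conf = new Configuration();
>       // Default replication for files created by this client
>       // (normally set once in hdfs-site.xml).
>       conf.setInt("dfs.replication", 3);
>       FileSystem fs = FileSystem.get(conf);
>
>       // Bump an existing critical file to 3 replicas; covers (1)-(3).
>       fs.setReplication(new Path("/data/critical/part-00000"), (short) 3);
>
>       // (4) needs a copy in another data centre, e.g. a scheduled:
>       //   hadoop distcp hdfs://nn-dc1:8020/data/critical hdfs://nn-dc2:8020/data/critical
>     }
>   }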
>
> The question to ask the business is: how valuable is the data in question to
> them? If they go RAID and only cover (1), they should be asked whether it is
> acceptable to have data unavailable, with the possibility of permanent data
> loss, at any point in time, for any amount of data, for any length of time.
> If they come back and say yes, we accept that a node failure may lose data
> or make it unavailable for some period of time, then by all means go for
> RAID. If the answer is no, you need replication. Even DBAs understand this,
> and that's why for databases we back up, replicate and load/fail-over
> balance; why should we not do the same for critical business data on file
> storage?
>
>
> We run all of our nodes without RAID (JBOD), because having 3 replicas across
> the cluster means you don't need extra redundancy on the same disk or node.
>
> Yes, it's true that any distributed file system will make data available to
> any number of nodes, but that was not my point earlier. Having replicas of
> the data on multiple nodes means it can be worked on in parallel from
> multiple physical nodes without having to read/copy it from a single node.
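>
> The replica placement is also visible to clients; a small sketch (the path is
> hypothetical) that asks the NameNode which hosts hold each block, which is
> what schedulers use to run work close to the data:
>
>   import java.util.Arrays;
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.fs.BlockLocation;
>   import org.apache.hadoop.fs.FileStatus;
>   import org.apache.hadoop.fs.FileSystem;
>   import org.apache.hadoop.fs.Path;
>
>   public class ReplicaHosts {
>     public static void main(String[] args) throws Exception {
>       FileSystem fs = FileSystem.get(new Configuration());
>       FileStatus st = fs.getFileStatus(new Path("/data/big.log"));
>       // One BlockLocation per block; each lists the datanodes holding a replica.
>       for (BlockLocation b : fs.getFileBlockLocations(st, 0, st.getLen())) {
>         System.out.println(b.getOffset() + " -> " + Arrays.toString(b.getHosts()));
>       }
>     }
>   }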
>
> Cheers,
>  Gerrit
>
>
> On Wed, Jan 26, 2011 at 5:54 AM, Dhruba Borthakur <dhruba@gmail.com> wrote:
>
>> Hi Nathan,
>>
>> we are using HDFS-RAID for our 30 PB cluster. Most datasets have a
>> replication factor of 2.2 and a few datasets have a replication factor of
>> 1.4.  Some details here:
>>
>> http://wiki.apache.org/hadoop/HDFS-RAID
>>
>> http://hadoopblog.blogspot.com/2009/08/hdfs-and-erasure-codes-hdfs-raid.html
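>>
>> (For anyone wondering how a replication factor below 3 is possible: the
>> parity blocks stand in for extra replicas. As a rough example of how the
>> arithmetic can work out - the stripe length and parity counts here are
>> assumptions, see the links above for the real details - with XOR parity over
>> stripes of 10 blocks and both data and parity kept at 2 replicas, you store
>> 2*10 + 2*1 = 22 block copies per 10 logical blocks, i.e. an effective factor
>> of 2.2; with 4 Reed-Solomon parity blocks per 10-block stripe and a single
>> replica of everything, you get (10 + 4)/10 = 1.4.)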
>>
>> thanks,
>> dhruba
>>
>>
>> On Tue, Jan 25, 2011 at 7:58 PM, <stu24mail@yahoo.com> wrote:
>>
>>> My point was it's not RAID (or whatever) versus HDFS. HDFS is a distributed
>>> file system that solves different problems.
>>>
>>>
>>> HDFS is a file system. It's like asking: NTFS or RAID?
>>>
>>> >but can be generally dealt with using hardware and software failover
>>> techniques.
>>>
>>> Like hdfs.
>>>
>>> Best,
>>>  -stu
>>> -----Original Message-----
>>> From: Nathan Rutman <nrutman@gmail.com>
>>> Date: Tue, 25 Jan 2011 17:31:25
>>> To: <hdfs-user@hadoop.apache.org>
>>> Reply-To: hdfs-user@hadoop.apache.org
>>> Subject: Re: HDFS without Hadoop: Why?
>>>
>>>
>>> On Jan 25, 2011, at 5:08 PM, stu24mail@yahoo.com wrote:
>>>
>>> > I don't think, as a recovery strategy, RAID scales to large amounts of
>>> data. Even as some kind of attached storage device (e.g. Vtrack), you're
>>> only talking about a few terabytes of data, and it doesn't tolerate node
>>> failure.
>>>
>>> When talking about large amounts of data, 3x redundancy absolutely
>>> doesn't scale.  Nobody is going to pay for 3 petabytes worth of disk if they
>>> only need 1 PB worth of data.  This is where dedicated high-end raid systems
>>> come in (this is in fact what my company, Xyratex, builds).  Redundant
>>> controllers, battery backup, etc.  The incremental cost for an additional
>>> drive in such systems is negligible.
>>>
>>> >
>>> > A key part of hdfs is the distributed part.
>>>
>>> Granted, single-point-of-failure arguments are valid when concentrating
>>> all the storage together, but can be generally dealt with using hardware and
>>> software failover techniques.
>>>
>>> The scale argument in my mind is exactly reversed -- HDFS works fine for
>>> smaller installations that can't afford RAID hardware overhead and access
>>> redundancy, and where buying 30 drives instead of 10 is an acceptable cost
>>> for the simplicity of HDFS setup.
>>>
>>> >
>>> > Best,
>>> > -stu
>>> > -----Original Message-----
>>> > From: Nathan Rutman <nrutman@gmail.com>
>>> > Date: Tue, 25 Jan 2011 16:32:07
>>> > To: <hdfs-user@hadoop.apache.org>
>>> > Reply-To: hdfs-user@hadoop.apache.org
>>> > Subject: Re: HDFS without Hadoop: Why?
>>> >
>>> >
>>> > On Jan 25, 2011, at 3:56 PM, Gerrit Jansen van Vuuren wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> Why would 3x data seem wasteful?
>>> >> This is exactly what you want.  I would never store any serious
>>> business data without some form of replication.
>>> >
>>> > I agree that you want data backup, but 3x replication is the least
>>> > efficient / most expensive (space-wise) way to do it.  This is what RAID
>>> > was invented for: RAID 6 gives you fault tolerance against the loss of any
>>> > two drives, for only 20% disk space overhead.  (Sorry, I see I forgot to
>>> > note this in my original email, but that's what I had in mind.) RAID is
>>> > also not necessarily expensive in dollar terms; Linux MD RAID is free and
>>> > effective.
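>>> >
>>> > (The 20% figure presumably assumes a group of roughly 10 data drives plus
>>> > 2 parity drives per RAID 6 set, so the parity overhead is 2/10 = 20% of
>>> > usable capacity, versus 200% overhead for 3x replication.)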
>>> >
>>> >> What happens if you store a single file on a single server without
>>> >> replicas and that server goes down, or just the disk that the file is on
>>> >> goes? HDFS, like any decent distributed file system, uses replication to
>>> >> prevent data loss. As a side effect, having replicas of the same piece of
>>> >> data on separate servers means that more than one task can work on that
>>> >> data in parallel.
>>> >
>>> > Indeed, replicated data does mean Hadoop could work on the same block
>>> on separate nodes.  But outside of Hadoop compute jobs, I don't think this
>>> is useful in general.  And in any case, a distributed filesystem would let
>>> you work on the same block of data from however many nodes you wanted.
>>> >
>>> >
>>>
>>>
>>
>>
>> --
>> Connect to me at http://www.facebook.com/dhruba
>>
>
>
