hadoop-common-dev mailing list archives

From "Paul Sutter" <sut...@gmail.com>
Subject Re: Hadoop Distributed File System requirements on Wiki
Date Thu, 06 Jul 2006 19:02:24 GMT

I've reviewed the list of proposed changes and they look excellent.

A few suggestions:

*Constant-size file blocks (#16), -1*

I vote to keep variable-size blocks, especially since you are adding
atomic append capability (#25). Variable-length blocks create the
possibility of blocks that contain only whole records. This:
- improves recoverability for large important files with one or more
irrevocably lost blocks, and
- makes it very clean for mappers to process local data blocks
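
To illustrate the whole-record property, here is a minimal sketch (hypothetical code, not an actual HDFS API): a writer closes each block at a record boundary, so no record ever straddles two blocks, and an irrevocably lost block costs only complete records.

```python
# Hypothetical sketch: pack records into variable-length blocks so that
# no record straddles a block boundary. The target size is a soft cap,
# not a hard constant.

def pack_records(records, target_block_size=64 * 1024 * 1024):
    """Yield blocks (lists of records); each block holds only whole records."""
    block, block_size = [], 0
    for rec in records:
        # Close the current block if this record would push it past the
        # target size -- the block therefore ends on a record boundary.
        if block and block_size + len(rec) > target_block_size:
            yield block
            block, block_size = [], 0
        block.append(rec)
        block_size += len(rec)
    if block:
        yield block
```

A mapper handed one such block sees only complete records, with no need to peek into a neighboring block for the tail of a split record.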

*Recoverability and Availability Goals*

You might want to consider adding recoverability and availability goals.
Recoverability goals might specify how much data is lost in case of a
namenode failure (today it's about a year's worth, but it could be a day,
an hour, a minute, a second, or zero, at varying costs). If we have a
statistically inclined person on the project, we could estimate the
acceptable block-loss probabilities at scale. Availability goals are
probably less stringent than for most storage systems (dare I say that a
few hours' downtime is probably OK). Adding these goals to the document
could be valuable for consensus and prioritization.
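
As a back-of-the-envelope sketch of what such a block-loss estimate might look like (the failure probability and fleet size below are made-up numbers, not measurements, and independent node failures are a simplifying assumption):

```python
# Hypothetical back-of-the-envelope block-loss estimate. A block is lost
# only if every replica fails within one re-replication window; assuming
# independent node failures, that probability is p ** r.

def expected_lost_blocks(num_blocks, replication, p_node_fail):
    """Expected number of blocks lost per re-replication window."""
    p_block_loss = p_node_fail ** replication
    return num_blocks * p_block_loss

# e.g. 10M blocks, 3 replicas, 1% chance a node dies before its blocks
# are re-replicated: roughly 10 blocks lost per window.
```

Even a crude model like this makes the trade-off concrete: each extra replica multiplies the loss probability by p, at a linear cost in disk.
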

*Backup Scheme*

We might want to start discussion of a backup scheme for HDFS, especially
given all the courageous rewriting and feature-addition likely to occur.
We've looked carefully at this, and we think that we can get back into
production by restoring only a subset of the data into our system, but we're
likely to need an effective backup tool to do this.

*Rebalancing (#22,#21)*

I would suggest that keeping disk usage balanced is more than a performance
feature; it's important for the success of jobs with large map outputs or
large sorts. Our most common reducer failure is running out of disk space
during the sort, and this is caused by imbalanced block allocation.
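
A minimal sketch of usage-aware placement (hypothetical code, not the actual datanode-selection logic): among candidate nodes, prefer the one with the lowest disk utilization, so allocation itself keeps disks balanced rather than relying on a later rebalancing pass.

```python
# Hypothetical sketch of usage-aware block placement: pick the candidate
# datanode with the lowest disk utilization instead of choosing at random.

def pick_datanode(candidates):
    """candidates: list of (name, used_bytes, capacity_bytes) tuples.
    Returns the name of the least-utilized candidate."""
    def utilization(node):
        _, used, capacity = node
        return used / capacity
    return min(candidates, key=utilization)[0]
```

A real policy would also weigh rack locality and pipeline cost, but even this greedy rule avoids piling new blocks onto already-full disks.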



On 6/30/06, Konstantin Shvachko <shv@yahoo-inc.com> wrote:
> I've created a Wiki page that summarizes DFS requirements and proposed
> changes.
> This is a summary of discussions held in this mailing list and
> additional internal discussions.
> The page is here:
> http://wiki.apache.org/lucene-hadoop/DFS_requirements
> I see there is an ongoing related discussion in HADOOP-337.
> We prioritized our goals as
> (1) Reliability (which includes Recoverability and Availability)
> (2) Scalability
> (3) Functionality
> (4) Performance
> (5) other
> But then gave higher priority to some features like the append
> functionality.
> Happy holidays to everybody.
> --Konstantin Shvachko
