hadoop-common-user mailing list archives

From "Jeff Hammerbacher" <jeff.hammerbac...@gmail.com>
Subject Re: Use HDFS as a long term storage solution?
Date Thu, 06 Sep 2007 00:56:53 GMT
We have plans for Hadoop very similar to those C G describes below, but we've
found the stability of HDFS to be quite troublesome.  We've corrupted HDFS
three different ways in a few weeks: 1) running jstack on the NameNode; 2)
loading lots of small files into HDFS, causing it to hang on a Map/Reduce
job and subsequently show corruption on restart; 3) upgrading to a newer
version of Hadoop.  Thus we are very uncertain about treating HDFS as a
reliable long-term data store.

That being said, we're excited about the opportunities created by Hadoop so
we're going to put some time into making it more reliable and creating a
utility to archive data out of HDFS for backup purposes.

On 9/5/07, C G <parallelguy@yahoo.com> wrote:
>
> Our intention is to use HDFS as the core of a large "data repository".  We
> store "raw" data within HDFS on a more-or-less permanent basis, and
> map/reduce it to produce load files for our data warehouse.  We have other
> plans as well, all centered around storing data on a very long-term basis in
> HDFS.  So you're in good company...
>
>   Our plan is for a 64 TB HDFS repository with a replication factor of 3,
> giving ~21 TB of usable data space.
>
>   C G
>
>
> Dongsheng Wang <phidecn@yahoo.com> wrote:
>
> We are looking at using HDFS as a long-term storage solution. We want to
> use it to store lots of files. The files can be big or small: images,
> videos, etc. We write each file only once and may read it many times.
> HDFS sounds like a perfect fit.
>
> The concern is that, since it was engineered to support MapReduce, there
> may be a fundamental assumption that the data stored in HDFS is
> transient in nature. Obviously, for our scalable storage solution, zero data
> loss or corruption is a strict requirement.
>
> Is anybody using HDFS as a long term storage solution? Interested in any
> info. Thanks
>
> - ds
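
The ~21T figure C G mentions follows from dividing the raw repository size by the replication factor. A minimal Python sketch of that arithmetic, assuming the "T" means terabytes and ignoring filesystem overhead and any non-HDFS use of the disks (the function name is hypothetical and not part of Hadoop):

```python
def usable_capacity_tb(raw_tb: float, replication: int) -> float:
    """Approximate logical capacity when every block is stored `replication` times."""
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_tb / replication


# 64 TB of raw disk with 3 copies of every block
print(round(usable_capacity_tb(64, 3), 1))  # prints 21.3
```

This is only a rough planning estimate; a real cluster would also need headroom for temporary Map/Reduce output and for re-replication after a node failure.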
