hadoop-common-user mailing list archives

From C G <parallel...@yahoo.com>
Subject Re: Use HDFS as a long term storage solution?
Date Thu, 06 Sep 2007 00:38:58 GMT
Our intention is to use HDFS as the core of a large "data repository".  We store "raw" data
within HDFS on a more-or-less permanent basis, and map/reduce it to produce load files for
our data warehouse.  We have other plans as well, all centered around storing data on a very
long-term basis in HDFS.  So you're in good company...
  Our plan is for a 64T raw HDFS repository which, with a replication factor of 3, gives a ~21T data space.
  C G
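
As a rough check of the numbers above, here is a minimal sketch of the capacity arithmetic. It assumes every block is stored exactly `replication` times and ignores any other overhead (space reserved for non-HDFS use, temporary map/reduce output, and so on); the function name is hypothetical and not part of any Hadoop API.

```python
def usable_capacity_tb(raw_tb: float, replication: int) -> float:
    """Estimate usable data space given raw cluster capacity.

    Assumption: each block is stored `replication` times and nothing
    else consumes disk space, so usable space is raw / replication.
    """
    if replication < 1:
        raise ValueError("replication factor must be at least 1")
    return raw_tb / replication


# Figures from the message: 64T raw, replication factor of 3.
usable = usable_capacity_tb(64.0, 3)
print(f"{usable:.1f}T usable")  # prints "21.3T usable"
```

In practice the usable figure will be somewhat lower than this estimate, since the sketch leaves out everything except replication.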

Dongsheng Wang <phidecn@yahoo.com> wrote:
We are looking at using HDFS as a long term storage solution. We want to use it to store
lots of files. The files could be big or small; they are images, videos, etc. We only write
the files once, and may read them many times. It sounds like HDFS is a perfect fit.

The concern is that since it's been engineered to support MapReduce, there may be fundamental
assumptions that the data being stored by HDFS is transient in nature. Obviously, for our scalable
storage solution, zero data loss or corruption is a hard requirement.

Is anybody using HDFS as a long term storage solution? Interested in any info. Thanks

- ds
