hadoop-common-user mailing list archives

From C G <parallel...@yahoo.com>
Subject Re: Use HDFS as a long term storage solution?
Date Thu, 06 Sep 2007 20:04:25 GMT
Do you have to use HDFS with map/reduce?  I don't fully understand how tightly map/reduce
is bound to HDFS.
In our application it might make more sense to accrue data using MogileFS and place
post-processed (i.e., larger) data into HDFS for additional processing.
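(For what it's worth, the coupling is loose: map/reduce jobs are written against Hadoop's FileSystem interface rather than HDFS itself, so pointing the default filesystem elsewhere swaps HDFS out entirely. A hadoop-site.xml along the lines below is a sketch — property names are from the 0.1x-era configuration — that runs jobs over the local filesystem with no HDFS daemons at all.)

```xml
<!-- hadoop-site.xml: illustrative sketch, not from this thread.
     fs.default.name selects the FileSystem implementation jobs run
     against; "file:///" means the local filesystem instead of HDFS.
     mapred.job.tracker=local runs map/reduce in-process. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>file:///</value>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>local</value>
  </property>
</configuration>
```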

Ted Dunning <tdunning@veoh.com> wrote:
Hadoop may not be what you want for storing lots and lots of files.

If you need to store >10^7 files, or if you are storing lots of small (<40MB)
files, then you may prefer a solution like MogileFS. It is engineered for a
very different purpose than Hadoop, but may be more appropriate for what you
want. It is also already intended for web-scale, reliable applications, so
there is a bit more you can do for redundancy.
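(To put a rough number on why file count matters: the namenode keeps the entire namespace in RAM, and a commonly cited rule of thumb — an assumption here, not a figure from this thread — is on the order of 150 bytes of heap per namespace object, i.e. per file entry and per block. A back-of-envelope sketch:)

```python
# Back-of-envelope namenode heap estimate for many small files.
# Assumption (not from this thread): roughly 150 bytes of namenode
# heap per namespace object (file entry or block) -- a rule of thumb,
# not an exact figure.

BYTES_PER_OBJECT = 150  # rough rule of thumb

def namenode_heap_bytes(num_files, blocks_per_file=1):
    """Estimate namenode heap for num_files, each with blocks_per_file blocks."""
    objects = num_files * (1 + blocks_per_file)  # one file entry + its blocks
    return objects * BYTES_PER_OBJECT

# 10^7 small files (one block each) -> ~3 * 10^9 bytes, about 2.8 GiB
print(namenode_heap_bytes(10**7) / 2**30)
```

So at 10^7 single-block files the namespace alone wants a few GB of namenode heap, which is why small-file workloads push people toward systems like MogileFS.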

On the other hand, HDFS might be just what you need.

On 9/5/07 1:03 PM, "Dongsheng Wang" wrote:

> We are looking at using HDFS as a long-term storage solution. We want to use
> it to store lots of files. The files can be big or small; they are images,
> videos, etc. We only write the files once, and may read them many times.
> Sounds like a perfect fit for HDFS.
> The concern is that since it's been engineered to support MapReduce, there may
> be fundamental assumptions that the data stored by HDFS is transient in
> nature. Obviously, for our scalable storage solution, zero data loss or
> corruption is a hard requirement.
> Is anybody using HDFS as a long term storage solution? Interested in any info.
> Thanks
> - ds
