hadoop-common-user mailing list archives

From C G <parallel...@yahoo.com>
Subject Re: Use HDFS as a long term storage solution?
Date Fri, 07 Sep 2007 00:54:00 GMT
Actually my question is "Can you use Map/Reduce without HDFS?"

Ted Dunning <tdunning@veoh.com> wrote:  

You can definitely use HDFS without map/reduce. It should be pretty easy to
use it from a variety of languages as well, although it is unlikely that
there are language bindings available off the shelf.

It is very frustrating to have two kinds of data store, but if you are doing
two kinds of things, it might just be justified. If you look at the
different priorities of the two teams, you can make a pretty simple inference
as to what the strengths of each platform will be. With Mogile, the goal is
serving content sites with high reliability from a variety of languages
without implementing anything more complicated than needed. They ignore
issues like how to place tasks to process data efficiently. With HDFS, the
goal is to facilitate large-scale computation. The result is much less
focused on the operational issues that will arise if you are trying to keep
data alive for years at a time.

On 9/6/07 1:04 PM, "C G" wrote:

> Do you have to use HDFS with map/reduce? I don't fully understand how closely
> bound map/reduce is to HDFS.
> In our application it might make more sense to accrue data using MogileFS
> and place post-processed data (i.e. larger data) into HDFS for additional
> processing.
> Comments?
> Ted Dunning wrote:
> Hadoop may not be what you want for storing lots and lots of files.
> If you need to store >10^7 files or if you are storing lots of small (<40MB)
> files, then you may prefer a solution like MogileFS. It is engineered for a
> very different purpose than Hadoop, but may be more appropriate for what you
> want. It is also already intended for web-scale reliable applications so
> there is a bit more that you can do for redundancy.
> On the other hand, HDFS might be just what you need.
> On 9/5/07 1:03 PM, "Dongsheng Wang" wrote:
>> We are looking at using HDFS as a long term storage solution. We want to use
>> it to store lots of files. The files could be big or small; they are images,
>> videos, etc. We only write the files once, and may read them many times.
>> Sounds like it is perfect to use HDFS.
>> The concern is that since it's been engineered to support MapReduce there may
>> be fundamental assumptions that the data being stored by HDFS is transient in
>> nature. Obviously for our scalable storage solution zero data loss or
>> corruption is a heavy requirement.
>> Is anybody using HDFS as a long term storage solution? Interested in any
>> info.
>> Thanks
>> - ds
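
To put rough numbers on the small-file guidance quoted above (more than 10^7 files, or lots of files under 40MB), here is a minimal Python sketch. It rests on two assumptions that do not come from this thread: the commonly cited rule of thumb that the HDFS NameNode keeps about 150 bytes of heap per namespace object (one object per file plus one per block), and a 64 MB block size. The function name and parameters are hypothetical, and the result is an order-of-magnitude estimate, not a sizing guide.

```python
# Assumption: ~150 bytes of NameNode heap per namespace object.
# This is a widely quoted rule of thumb, not a measured figure.
BYTES_PER_OBJECT = 150

# Assumption: 64 MB HDFS block size.
DEFAULT_BLOCK_BYTES = 64 * 1024 * 1024


def namenode_heap_bytes(num_files, avg_file_bytes, block_bytes=DEFAULT_BLOCK_BYTES):
    """Estimate NameNode heap needed to track num_files files of a given average size."""
    # Ceiling division; even an empty file is counted as occupying one block here.
    blocks_per_file = max(1, -(-avg_file_bytes // block_bytes))
    # One object for the file itself plus one per block.
    objects = num_files * (1 + blocks_per_file)
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    estimate = namenode_heap_bytes(10**7, 40 * 1024 * 1024)
    print(f"10^7 files of 40 MB: about {estimate / 1e9:.1f} GB of NameNode heap")
```

Under these assumptions the metadata cost depends on the number of files and blocks rather than on the bytes stored, which is one way to read the advice that very large numbers of small files may fit a store like MogileFS better.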
