hadoop-common-user mailing list archives

From Aaron Kimball <aa...@cloudera.com>
Subject Re: Using HDFS to serve www requests
Date Thu, 26 Mar 2009 16:44:25 GMT
In general, Hadoop is unsuitable for the application you're suggesting.
Systems like FUSE HDFS do exist, though they're not widely used. I don't
know of anyone trying to connect Hadoop with Apache httpd.

When you say that you have huge images, how big is "huge"? HDFS might be
useful if these images are 1 GB or larger. But in general, "huge" on Hadoop
means tens of GBs up to TBs.  If you have a large number of moderately-sized
files, you'll find that HDFS serves your needs very poorly.

It sounds like glusterfs is designed more for your needs.

- Aaron

On Thu, Mar 26, 2009 at 4:06 PM, phil cryer <phil@cryer.us> wrote:

> This is somewhat of a noob question I know, but after learning about
> Hadoop, testing it in a small cluster and running MapReduce jobs on
> it, I'm still not sure if Hadoop is the right distributed file system
> to serve web requests.  In other words, can, or is it right to, serve
> images and data from HDFS using something like FUSE to mount a
> filesystem where Apache could serve images from it?  We have huge
> images, thus the need for a distributed file system, and they go in,
> get stored with lots of metadata, and are redundant with Hadoop/HDFS -
> but is it the right way to serve web content?
> I looked at glusterfs before; they had an Apache and Lighttpd module
> which made it simple. Does HDFS have something like this, do people
> just use a FUSE option as I described, or is this not a good use of
> Hadoop?
> Thanks
> P
