hadoop-common-user mailing list archives

From Brian Bockelman <bbock...@cse.unl.edu>
Subject Re: Using HDFS to serve www requests
Date Thu, 26 Mar 2009 16:51:44 GMT

On Mar 26, 2009, at 5:44 PM, Aaron Kimball wrote:

> In general, Hadoop is unsuitable for the application you're suggesting.
> Systems like FUSE HDFS do exist, though they're not widely used.

We use FUSE on a 270TB cluster to serve up physics data, because the
client (2.5M lines of C++) doesn't understand how to connect to HDFS.
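
If you can modify the application stack, another option besides FUSE is
libhdfs, the C API that fuse-dfs itself is built on.  A very rough sketch
follows; the file path and compile line are placeholder assumptions, not
something from our setup:

  /* Read a file out of HDFS with libhdfs and stream it to stdout.
   * Needs libhdfs and a JVM on the link line, roughly:
   *   gcc read_hdfs.c -lhdfs -ljvm
   */
  #include "hdfs.h"
  #include <fcntl.h>
  #include <stdio.h>

  int main(void) {
      /* "default" picks up fs.default.name from the Hadoop config;
       * an explicit namenode host and port would work too. */
      hdfsFS fs = hdfsConnect("default", 0);
      if (!fs) { fprintf(stderr, "hdfsConnect failed\n"); return 1; }

      /* The path is a made-up example. */
      hdfsFile in = hdfsOpenFile(fs, "/images/example.tif", O_RDONLY, 0, 0, 0);
      if (!in) { fprintf(stderr, "hdfsOpenFile failed\n"); return 1; }

      char buf[65536];
      tSize n;
      while ((n = hdfsRead(fs, in, buf, sizeof(buf))) > 0)
          fwrite(buf, 1, n, stdout);   /* stream the bytes out */

      hdfsCloseFile(fs, in);
      hdfsDisconnect(fs);
      return 0;
  }

The same calls could sit behind whatever answers the HTTP requests, which
avoids the FUSE layer entirely; FUSE was simply the path of least
resistance for us since the client couldn't be changed.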


> I don't know of anyone trying to connect Hadoop with Apache httpd.
> When you say that you have huge images, how big is "huge"? It might be
> useful if these images are 1 GB or larger. But in general, "huge" on
> Hadoop means 10s of GBs up to TBs. If you have a large number of
> moderately-sized files, you'll find that HDFS responds very poorly for
> your needs.
> It sounds like glusterfs is designed more for your needs.
> - Aaron
> On Thu, Mar 26, 2009 at 4:06 PM, phil cryer <phil@cryer.us> wrote:
>> This is somewhat of a noob question I know, but after learning about
>> Hadoop, testing it in a small cluster and running MapReduce jobs on
>> it, I'm still not sure if Hadoop is the right distributed file system
>> to serve web requests. In other words, can we (or is it right to) serve
>> images and data from HDFS, using something like FUSE to mount a
>> filesystem that Apache could serve images from? We have huge images,
>> thus the need for a distributed file system; they go in, get stored
>> with lots of metadata, and are redundant with Hadoop/HDFS - but is it
>> the right way to serve web content?
>> I looked at glusterfs before; they had an Apache and Lighttpd module
>> which made it simple. Does HDFS have something like this, do people
>> just use a FUSE option as I described, or is this not a good use of
>> Hadoop?
>> Thanks
>> P
