From: Brian Bockelman
To: core-user@hadoop.apache.org
Cc: phil@cryer.us
Subject: Re: Using HDFS to serve www requests
Date: Thu, 26 Mar 2009 17:51:44 +0100

On Mar 26, 2009, at 5:44 PM, Aaron Kimball wrote:

> In general, Hadoop is unsuitable for the application you're suggesting.
> Systems like Fuse HDFS do exist, though they're not widely used.

We use FUSE on a 270TB cluster to serve up physics data because the
client (2.5M lines of C++) doesn't understand how to connect to HDFS
directly.

Brian

> I don't know of anyone trying to connect Hadoop with Apache httpd.
>
> When you say that you have huge images, how big is "huge?" It might be
> useful if these images are 1 GB or larger. But in general, "huge" on
> Hadoop means 10s of GBs up to TBs. If you have a large number of
> moderately-sized files, you'll find that HDFS responds very poorly for
> your needs.
>
> It sounds like glusterfs is designed more for your needs.
>
> - Aaron
>
> On Thu, Mar 26, 2009 at 4:06 PM, phil cryer wrote:
>
>> This is somewhat of a noob question I know, but after learning about
>> Hadoop, testing it in a small cluster and running Map Reduce jobs on
>> it, I'm still not sure if Hadoop is the right distributed file system
>> to serve web requests. In other words, can, or is it right to, serve
>> Images and data from HDFS using something like FUSE to mount a
>> filesystem where Apache could serve images from it? We have huge
>> images, thus the need for a distributed file system, and they go in,
>> get stored with lots of metadata, and are redundant with Hadoop/HDFS -
>> but is it the right way to serve web content?
>>
>> I looked at glusterfs before, they had an Apache and Lighttpd module
>> which made it simple, does HDFS have something like this, do people
>> just use a FUSE option as I described, or is this not a good use of
>> Hadoop?
>>
>> Thanks
>>
>> P