Mailing-List: contact hadoop-dev-help@lucene.apache.org; run by ezmlm
Reply-To: hadoop-dev@lucene.apache.org
Message-ID: <30398578.1163369500205.JavaMail.jira@brutus>
Date: Sun, 12 Nov 2006 14:11:40 -0800 (PST)
From: "Tom White (JIRA)"
To: hadoop-dev@lucene.apache.org
Subject: [jira] Commented: (HADOOP-574) want FileSystem implementation for Amazon S3
In-Reply-To: <21744554.1159980319607.JavaMail.root@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ http://issues.apache.org/jira/browse/HADOOP-574?page=comments#action_12449188 ]

Tom White commented on HADOOP-574:
----------------------------------

Thanks Doug. Collaboration sounds good: I'll contact Jim directly.

Regarding HADOOP-571, I agree it makes sense to tackle it in conjunction with this issue. I'll have a look at it after we get the basics of the S3 filesystem working.

As far as the design goes, I agree that (like DFS) the S3 filesystem should divide files into blocks and buffer them to disk before writing them to S3. I'm not sure about putting the block number at the end of the filename (using a delimiter), since that makes renames very inefficient: S3 has no rename operation, so every block object would have to be copied to a new key. Instead I have opted for a level of indirection whereby the S3 object stored at the filename is a metadata file that lists the block IDs holding the data. A rename is then simply a re-PUT of the metadata. What do you think?

The other aspect I haven't put much thought into yet is locking. Keeping the number of HTTP requests to a minimum will be an interesting challenge.

> want FileSystem implementation for Amazon S3
> --------------------------------------------
>
>                 Key: HADOOP-574
>                 URL: http://issues.apache.org/jira/browse/HADOOP-574
>             Project: Hadoop
>          Issue Type: New Feature
>          Components: fs
>            Reporter: Doug Cutting
>
> An S3-based Hadoop FileSystem would make a great addition to Hadoop.
> It would facilitate use of Hadoop on Amazon's EC2 computing grid, as discussed here:
> http://www.mail-archive.com/hadoop-user@lucene.apache.org/msg00318.html
> This is related to HADOOP-571, which would make Hadoop's FileSystem considerably easier to extend.

--
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
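
[Editor's illustration] Below is a minimal, self-contained Java sketch of the metadata-indirection scheme Tom describes above: each file key maps to a small "inode" object listing the IDs of the blocks that hold the data, so a rename re-PUTs only that metadata object and never copies the blocks. This is not the actual Hadoop patch; the class and method names (S3FileSystemSketch, ObjectStore, InMemoryStore, create, rename) are hypothetical, and an in-memory map stands in for the real S3 HTTP API.

    import java.nio.charset.StandardCharsets;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class S3FileSystemSketch {

      /** Stand-in for an S3 bucket: key -> object bytes. Purely illustrative. */
      interface ObjectStore {
        void put(String key, byte[] value);
        byte[] get(String key);
        void delete(String key);
      }

      /** In-memory store used only to make the sketch runnable without S3. */
      static class InMemoryStore implements ObjectStore {
        private final Map<String, byte[]> objects = new HashMap<>();
        public void put(String key, byte[] value) { objects.put(key, value); }
        public byte[] get(String key) { return objects.get(key); }
        public void delete(String key) { objects.remove(key); }
      }

      private final ObjectStore store;
      private final int blockSize;

      S3FileSystemSketch(ObjectStore store, int blockSize) {
        this.store = store;
        this.blockSize = blockSize;
      }

      /** Write a file: PUT each block under its own key, then PUT the metadata ("inode"). */
      void create(String path, byte[] data) {
        List<String> blockIds = new ArrayList<>();
        for (int off = 0; off < data.length; off += blockSize) {
          int len = Math.min(blockSize, data.length - off);
          byte[] block = new byte[len];
          System.arraycopy(data, off, block, 0, len);
          String blockId = "block_" + path.hashCode() + "_" + blockIds.size(); // hypothetical ID scheme
          store.put(blockId, block);
          blockIds.add(blockId);
        }
        // The object stored at the file's own key is only the list of block IDs.
        store.put(path, String.join("\n", blockIds).getBytes(StandardCharsets.UTF_8));
      }

      /** Rename: re-PUT the metadata under the new key, delete the old key; blocks are untouched. */
      void rename(String src, String dst) {
        byte[] inode = store.get(src);
        if (inode == null) throw new IllegalArgumentException("No such file: " + src);
        store.put(dst, inode);
        store.delete(src);
      }

      public static void main(String[] args) {
        S3FileSystemSketch fs = new S3FileSystemSketch(new InMemoryStore(), 4);
        fs.create("/user/tom/data.txt", "hello s3 blocks".getBytes(StandardCharsets.UTF_8));
        // One PUT plus one DELETE, regardless of file size or block count.
        fs.rename("/user/tom/data.txt", "/user/tom/renamed.txt");
        System.out.println(new String(fs.store.get("/user/tom/renamed.txt"), StandardCharsets.UTF_8));
      }
    }

With the suffix-delimiter layout mentioned above, the same rename would instead require one copy and one delete per block object, which is why the indirection keeps renames cheap at the cost of one extra GET when opening a file.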