hadoop-common-dev mailing list archives

From "stack@archive.org (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-882) S3FileSystem should retry if there is a communication problem with S3
Date Thu, 08 Feb 2007 02:25:05 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack@archive.org updated HADOOP-882:
-------------------------------------

    Attachment: jets3t-upgrade.patch
                jets3t-0.5.0.jar

Here's a patch that makes the minor changes necessary so the S3 implementation can use the
new 0.5.0 jets3t 'retrying' lib.  It also exposes fs.s3.block.size in hadoop-default.xml with
a note about how to set the jets3t RepeatableInputStream buffer size by adding a jets3t.properties
to ${HADOOP_HOME}/conf.  Setting this latter buffer to the same size as the S3 block size avoids
failures of the kind 'Input stream is not repeatable as 1048576 bytes have been written, exceeding
the available buffer size of 131072'.
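For example, matching the buffer to the default 1MB block size might look like this (a sketch; it assumes the jets3t property controlling the RepeatableInputStream buffer is named s3service.stream-retry-buffer-size, so check the jets3t.properties documentation for your version):

```properties
# ${HADOOP_HOME}/conf/jets3t.properties
# Size the retry buffer to match fs.s3.block.size (1048576 bytes, i.e. 1MB,
# as set in hadoop-site.xml). Assumed property name; verify against the
# jets3t 0.5.0 configuration docs.
s3service.stream-retry-buffer-size=1048576
```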

The downside to this patch's approach is that if you want to match block and buffer sizes, you
need to set the same value in two places: once in hadoop-site.xml and again in jets3t.properties.
 This seemed to me better than the alternative, a tighter coupling that bubbles the main jets3t
properties up into the hadoop-*.xml filesystem section as fs.s3.jets3t.XXX properties, with the
init of the S3 filesystem setting the values into org.jets3t.service.Jets3tProperties.

I didn't change the default S3 block size from 1MB.  Setting it to 64MB seems too far afield
from the default jets3t RepeatableInputStream size of only 100k.

I've included the 0.5.0 jets3t lib as part of the upload (there doesn't seem to be a way to
include binaries using svn diff).  Its license is Apache 2.0.

Tom White, thanks for pointing me at the unit test.  Also, I'd go along with closing this
issue with the update of the jets3t lib and opening another issue to track the S3 filesystem
implementing a general, 'traffic-level' hadoop retry mechanism.
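A general retry mechanism of the kind proposed (retry a fixed number of times on an IOException before failing, per the issue description) might be sketched roughly as follows. All names here are hypothetical, not from the patch:

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical sketch of a 'traffic-level' retry wrapper: re-attempt an
// operation a fixed number of times on IOException before giving up.
public class RetryExample {

    public static <T> T withRetries(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (IOException e) {
                last = e; // transient communication failure; try again
            }
        }
        throw last; // all attempts exhausted
    }

    public static void main(String[] args) throws Exception {
        // Simulate an operation that fails twice, then succeeds.
        final int[] calls = {0};
        String result = withRetries(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IOException("transient S3 communication error");
            }
            return "ok";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts");
        // prints "ok after 3 attempts"
    }
}
```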

> S3FileSystem should retry if there is a communication problem with S3
> ---------------------------------------------------------------------
>
>                 Key: HADOOP-882
>                 URL: https://issues.apache.org/jira/browse/HADOOP-882
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.10.1
>            Reporter: Tom White
>         Assigned To: Tom White
>         Attachments: jets3t-0.5.0.jar, jets3t-upgrade.patch
>
>
> File system operations currently fail if there is a communication problem (IOException)
> with S3. All operations that communicate with S3 should retry a fixed number of times before
> failing.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

