hadoop-common-dev mailing list archives

From "Tom White (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-997) Implement S3 retry mechanism for failed block transfers
Date Tue, 13 Feb 2007 21:28:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472873 ]

Tom White commented on HADOOP-997:

> + Is it true that Jets3tFileSystemStore#initialize runs after the retry proxy has
been set up, so initialization will be retried if an exception occurs? Is this intentional? Perhaps
initialization should be outside the purview of retry.

This was not intentional. I think initialization should not be retried
currently (in the future it might be worth thinking through the
initialization cases that need retrying).
It looks like this might be the cause of the exception you found.
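A minimal sketch of the ordering fix (names here are illustrative, not the actual Hadoop `FileSystemStore` API): call `initialize()` on the raw store before handing it to the retry wrapper, so initialization failures fail fast instead of being retried.

```java
import java.io.IOException;

public class InitBeforeRetry {
    // Hypothetical store interface standing in for the S3 store.
    public interface Store {
        void initialize() throws IOException;
    }

    // Sketch: initialize the unwrapped store, then wrap the
    // already-initialized instance in the retry proxy. Calls made
    // through the returned object are retried; initialize() is not.
    public static Store createStore() throws IOException {
        Store raw = () -> { /* connect to S3 here */ };
        raw.initialize();           // not retried: errors surface at once
        return wrapWithRetry(raw);  // subsequent calls go through retry
    }

    // Placeholder for the dynamic-proxy retry wrapper.
    public static Store wrapWithRetry(Store s) {
        return s;
    }

    public static void main(String[] args) throws IOException {
        Store s = createStore();
        System.out.println(s != null ? "initialized" : "failed");
    }
}
```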

> + Minor comment: In version 2 of the retry mechanism, you'd be able to narrow retries
to happen only for certain exception types: e.g. for IOExceptions only rather than for Exception
as it's currently written.

S3 should only retry for IOExceptions - I'll fix this.
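The narrowing could look something like this (a hedged sketch, not the actual Hadoop retry API): catch only `IOException` in the retry loop so any other exception type propagates immediately.

```java
import java.io.IOException;

public class RetryOnIOException {
    public interface Call<T> { T run() throws Exception; }

    // Retry only transient IOExceptions, up to maxAttempts; any other
    // exception type is not caught here and propagates at once.
    public static <T> T withRetry(Call<T> call, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                return call.run();
            } catch (IOException e) {
                if (attempt >= maxAttempts) throw e;  // retries exhausted
                // otherwise: transient I/O failure, loop and retry
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2};  // fail twice with IOException, then succeed
        String result = withRetry(() -> {
            if (failures[0]-- > 0) throw new IOException("transient");
            return "ok";
        }, 5);
        System.out.println(result);  // prints "ok" after two retries
    }
}
```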

> + In S3OutputStream#newBackupFile, you replace File.createTempFile with your own tmp
file name fabricator. Any reason why you didn't stick with createTempFile, just using the overload
with this signature instead: http://java.sun.com/j2se/1.5.0/docs/api/java/io/File.html#createTempFile(java.lang.String,%20java.lang.String,%20java.io.File)?
(You do this in a few places in this patch.)

I did this for consistency with HDFS and also because of a recent
issue with creating files in /tmp. I agree that createTempFile with
the signature you suggest is the way to go. I'll change it.
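For reference, the three-argument overload takes the target directory explicitly, so the backup file can land in a configured local buffer directory rather than the system-wide /tmp (the prefix/suffix below are illustrative):

```java
import java.io.File;
import java.io.IOException;

public class TempFileDemo {
    public static void main(String[] args) throws IOException {
        // The third argument pins the temp file to a chosen directory
        // instead of the java.io.tmpdir default (typically /tmp).
        File dir = new File(System.getProperty("java.io.tmpdir"));
        File backup = File.createTempFile("s3-block-", ".tmp", dir);
        backup.deleteOnExit();
        System.out.println(backup.exists());  // prints "true"
    }
}
```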

> + In S3FileSystem.java#createDefaultStore, should the retry params be exposed as configurables?
(I'm fine if the rejoinder is that this makes for too much configuration).

Yes, this crossed my mind and I think we should have it. It also
occurred to me that with annotation-driven retries (HADOOP-601) you
wouldn't be able to expose the parameters as configurables (as far as
I know).
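Exposing the retry parameters might look like the following sketch (the key names are hypothetical, not established Hadoop configuration keys, and a plain Properties object stands in for Hadoop's Configuration):

```java
import java.util.Properties;

public class RetryConfig {
    public static void main(String[] args) {
        // Hypothetical configuration keys for the S3 retry parameters.
        Properties conf = new Properties();
        conf.setProperty("fs.s3.maxRetries", "4");
        conf.setProperty("fs.s3.sleepTimeSeconds", "10");

        // Read with defaults, so omitting the keys keeps current behaviour.
        int maxRetries =
            Integer.parseInt(conf.getProperty("fs.s3.maxRetries", "4"));
        long sleepMillis =
            1000L * Integer.parseInt(conf.getProperty("fs.s3.sleepTimeSeconds", "10"));
        System.out.println(maxRetries + " " + sleepMillis);  // prints "4 10000"
    }
}
```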

> + On initialize, should Jets3tFileSystemStore clean up its tmp directory? (If we crash,
deleteOnExit won't get a chance to run.)

Possibly, although the files should be deleted after transfer anyway.
The deleteOnExit was a backup, but you are right that it doesn't
always run.

> + If an IOException occurs in retrieveBlock in Jets3tFileSystemStore, you do a closeQuietly
on the out stream. It's followed by a finally that does closeQuietly on both the in and out
streams. I don't follow what's going on here. Why not just leave all closing to the finally
block? (Suggest adding a comment on why the out stream gets special-case handling in the IOException block.)

On failure I want to delete the file, but to do this I need to close the
stream first. I agree a better comment is needed.
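The pattern in question can be sketched like this (simplified, with illustrative names, not the actual Jets3tFileSystemStore code): close the stream inside the catch so the backing file can actually be deleted, then let finally close quietly again, which is a harmless no-op.

```java
import java.io.Closeable;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

public class CloseThenDelete {
    // On an IOException, close the output stream before deleting the
    // file (an open stream can block deletion, notably on Windows),
    // then rely on finally for the normal close path.
    public static boolean writeThenFail(File block) throws IOException {
        OutputStream out = null;
        try {
            out = new FileOutputStream(block);
            out.write(new byte[]{1, 2, 3});
            throw new IOException("simulated transfer failure");
        } catch (IOException e) {
            closeQuietly(out);  // must close BEFORE delete
            block.delete();
        } finally {
            closeQuietly(out);  // harmless if already closed
        }
        return block.exists();
    }

    public static void closeQuietly(Closeable c) {
        if (c == null) return;
        try { c.close(); } catch (IOException ignored) {}
    }

    public static void main(String[] args) throws IOException {
        File block = File.createTempFile("block-", ".tmp");
        System.out.println(writeThenFail(block));  // prints "false": deleted
    }
}
```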

Thanks for the detailed feedback. I'll look at making improvements
over the coming days.

> Implement S3 retry mechanism for failed block transfers
> -------------------------------------------------------
>                 Key: HADOOP-997
>                 URL: https://issues.apache.org/jira/browse/HADOOP-997
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: fs
>    Affects Versions: 0.11.0
>            Reporter: Tom White
>         Assigned To: Tom White
>         Attachments: HADOOP-997.patch
> HADOOP-882 improves S3FileSystem so that when certain communications problems with S3
occur the operation is retried. However, the retry mechanism cannot handle a block transfer
failure, since blocks may be very large and we don't want to buffer them in memory. This improvement
is to write a wrapper (using java.lang.reflect.Proxy if possible - see discussion in HADOOP-882)
that can retry block transfers.
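A wrapper in that spirit could use java.lang.reflect.Proxy roughly as follows (a hedged sketch under assumed names, not the patch's actual code), retrying only calls that fail with an IOException:

```java
import java.io.IOException;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

public class RetryProxyDemo {
    // Illustrative stand-in for the S3 store interface.
    public interface Store {
        String retrieveBlock(long id) throws IOException;
    }

    // Wrap target so every interface call is retried on IOException,
    // up to maxAttempts; other exceptions propagate immediately.
    @SuppressWarnings("unchecked")
    public static <T> T retryProxy(Class<T> iface, T target, int maxAttempts) {
        return (T) Proxy.newProxyInstance(
            iface.getClassLoader(), new Class<?>[]{iface},
            (proxy, method, args) -> {
                for (int attempt = 1; ; attempt++) {
                    try {
                        return method.invoke(target, args);
                    } catch (InvocationTargetException e) {
                        Throwable cause = e.getCause();
                        // Retry only transient IOExceptions.
                        if (!(cause instanceof IOException) || attempt >= maxAttempts) {
                            throw cause;
                        }
                    }
                }
            });
    }

    public static void main(String[] args) throws IOException {
        int[] failures = {2};  // fail twice, then succeed
        Store flaky = id -> {
            if (failures[0]-- > 0) throw new IOException("transient");
            return "block-" + id;
        };
        Store reliable = retryProxy(Store.class, flaky, 5);
        System.out.println(reliable.retrieveBlock(42));  // prints "block-42"
    }
}
```

Because the proxy only intercepts method calls, large blocks never need to be buffered by the wrapper itself; each retried call re-reads its input from the local backup file.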

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
