hadoop-common-issues mailing list archives

From "Jordan Mendelson (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HADOOP-10400) Incorporate new S3A FileSystem implementation
Date Tue, 11 Mar 2014 02:05:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-10400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jordan Mendelson updated HADOOP-10400:
--------------------------------------

    Description: 
The s3native filesystem has a number of limitations (some of which were recently fixed by
HADOOP-9454). This patch adds an s3a filesystem which uses the aws-sdk instead of the jets3t
library. There are a number of improvements over s3native including:

- Parallel copy (rename) support (dramatically speeds up commits on large files)
- AWS S3 explorer compatible empty-directory markers ("xyz/" instead of "xyz_$folder$"), which reduces clutter
- Ignores _$folder$ files created by s3native and other S3 browsing utilities
- Supports multiple output buffer dirs to even out IO when uploading files
- Supports IAM role-based authentication
- Allows setting a default canned ACL for uploads (public, private, etc.)
- Better error recovery handling
- Should handle input seeks without having to download the whole file (heavily used when reading splits)
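
As a rough sketch of how such a filesystem might be wired up in core-site.xml (the property names below are assumptions for illustration; the 0.0.2 patch renames keys, so check the bundled documentation for the actual names):

```xml
<!-- Hypothetical core-site.xml fragment; property names are illustrative,
     not taken verbatim from the patch. -->
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
<property>
  <name>fs.s3a.access.key</name>
  <!-- omit the key properties entirely when using IAM role-based auth -->
  <value>YOUR_ACCESS_KEY</value>
</property>
<property>
  <name>fs.s3a.secret.key</name>
  <value>YOUR_SECRET_KEY</value>
</property>
```

Jobs would then address paths as s3a://bucket/path.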

This code is a copy of https://github.com/Aloisius/hadoop-s3a with patches to various pom
files to get it to build against trunk. I've been using 0.0.1 in production with CDH 4 for
several months and CDH 5 for a few days. The version here is 0.0.2, which renames some
configuration keys to bring the key name style more in line with the rest of Hadoop 2.x.

*Caveats*:

Hadoop uses a standard output committer which uploads files as filename.COPYING before renaming
them. This causes unnecessary performance overhead with S3, because S3 has no rename
operation and S3 already verifies uploads against an MD5 checksum that the driver sets on the
upload request. While this FileSystem should be significantly faster than the built-in s3native
driver because of parallel copy support, you may want to consider setting a null output
committer on your jobs to further improve performance.
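
One hedged way to do this with the old mapred API is to point the job at a no-op committer. The class name below is hypothetical; you would supply your own subclass of org.apache.hadoop.mapred.OutputCommitter whose methods do nothing:

```xml
<!-- Hypothetical job configuration; com.example.NullOutputCommitter is a
     user-supplied no-op subclass of org.apache.hadoop.mapred.OutputCommitter. -->
<property>
  <name>mapred.output.committer.class</name>
  <value>com.example.NullOutputCommitter</value>
</property>
```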

Because S3 requires that the file length be known before a file is uploaded, all output is
first buffered to a temporary file, similar to the s3native driver.
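
The buffer location (and, per the feature list above, multiple comma-separated buffer directories to even out IO) would presumably be configurable along these lines; the property name here is an assumption, not confirmed by the patch:

```xml
<!-- Assumed property name for the upload buffer directories. -->
<property>
  <name>fs.s3a.buffer.dir</name>
  <value>/mnt/disk1/s3a,/mnt/disk2/s3a</value>
</property>
```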

Due to the lack of a native rename() in S3, renaming extremely large files or directories may
take a while. Unfortunately, there is no way to notify Hadoop that progress is still being
made during a rename operation, so your job may time out unless you increase the task timeout.
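
For example, the MapReduce task timeout (in milliseconds) can be raised; mapreduce.task.timeout is the Hadoop 2.x property name, with mapred.task.timeout as the older equivalent:

```xml
<!-- Raise the task timeout to 30 minutes so long S3 copies are not killed. -->
<property>
  <name>mapreduce.task.timeout</name>
  <value>1800000</value>
</property>
```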

This driver fully ignores _$folder$ files. This was necessary so that it can interoperate
with buckets where the s3native driver has been used, but it means that empty directories
created by s3native will not be recognized.

Statistics for the filesystem may be calculated differently from the s3native filesystem.
When uploading a file, we do not count writing the temporary file on the local filesystem
towards the local filesystem's written-bytes count. When renaming files, we do not count the
S3-to-S3 copy as read or write operations. Unlike the s3native driver, we only count bytes
written when the upload starts (as opposed to counting the write calls to the temporary local
file). The driver also counts read and write operations, but these are reported mostly to
keep large S3 operations from timing out.

This is currently implemented as a FileSystem and not an AbstractFileSystem.



> Incorporate new S3A FileSystem implementation
> ---------------------------------------------
>
>                 Key: HADOOP-10400
>                 URL: https://issues.apache.org/jira/browse/HADOOP-10400
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs
>            Reporter: Jordan Mendelson
>



--
This message was sent by Atlassian JIRA
(v6.2#6252)
