hadoop-common-issues mailing list archives

From "Chen He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12471) Support Swift file (> 5GB) continuious uploading where there is a failure
Date Fri, 09 Oct 2015 18:02:05 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14950877#comment-14950877 ]

Chen He commented on HADOOP-12471:

I am not sure how Swift handles the chunk operation at the beginning. However, the DLO flag is
added only once all chunks have been uploaded successfully. If there is a failure, the DLO flag is
never created, and the chunks already uploaded are left behind.

I assume Swift does not know in advance how many chunks there will be when a user uploads a large
file. If that is the case, can we add another header flag, set at the beginning, that identifies
whether the large file upload has succeeded? For example:
At the beginning, this flag is false (or any value that can be changed later). Once all
chunks have been uploaded successfully, we change it to true. If there is a failure in the middle,
the flag remains false. Any request to a file whose header is false will be
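
A rough sketch of the flag lifecycle described above, in Python. This is only an illustration: FakeStore is an in-memory stand-in, not the Swift API, and the header name X-Object-Meta-Upload-Complete is hypothetical. It does not decide how requests against an incomplete file should be treated.

```python
# Hypothetical header name; not defined by Swift or the Hadoop Swift client.
COMPLETE_HEADER = "X-Object-Meta-Upload-Complete"


class FakeStore:
    """In-memory stand-in for an object store: name -> (headers, data)."""

    def __init__(self):
        self.objects = {}

    def put(self, name, data=b"", headers=None):
        self.objects[name] = (dict(headers or {}), data)

    def set_header(self, name, key, value):
        self.objects[name][0][key] = value

    def headers(self, name):
        return self.objects[name][0]


def upload_large_file(store, name, chunks, fail_at=None):
    """Upload chunks as name/NNNNNN and flip the flag only at the very end.

    fail_at simulates a failure just before the chunk with that 1-based index.
    """
    # Create the top-level object first, with the flag set to "false".
    store.put(name, headers={COMPLETE_HEADER: "false"})
    for index, chunk in enumerate(chunks, start=1):
        if fail_at == index:
            raise IOError(f"simulated failure at chunk {index}")
        store.put(f"{name}/{index:06d}", data=chunk)
    # Every chunk is in place, so mark the upload as complete.
    store.set_header(name, COMPLETE_HEADER, "true")


if __name__ == "__main__":
    store = FakeStore()
    try:
        upload_large_file(store, "foo", [b"a", b"b", b"c"], fail_at=3)
    except IOError:
        pass
    # The flag was never flipped, and two chunks are left over.
    print(store.headers("foo")[COMPLETE_HEADER], sorted(store.objects))
```

With a flag like this, the leftover chunks of a failed upload can be told apart from a finished file by checking a single header.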

> Support Swift file (> 5GB) continuious uploading where there is a failure
> -------------------------------------------------------------------------
>                 Key: HADOOP-12471
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12471
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs/swift
>    Affects Versions: 2.7.1
>            Reporter: Chen He
> The current Swift FileSystem supports files larger than 5GB.
> A file is split into chunks of up to 4.6GB each (configurable). For example, if there is a 46GB file "foo" in Swift,
> the structure will look like:
> foo/000001
> foo/000002
> foo/000003
> ...
> foo/000010
> Users will not see those 00000x chunk files unless they ask for them explicitly. That is, if a user runs:
> \> hadoop fs -ls swift://container.serviceProvidor/foo
> It only shows:
> dwr-r--r--    46GB    foo
> However, in my test, if there is a failure while uploading the foo file, the previously uploaded chunks are left in the object store. It would be good to support resuming the upload from the chunks left over by the earlier attempt.
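
As a rough illustration of the resume idea in the description, the sketch below works out which chunks would still need uploading, given the names of the chunks already in the container. The helper names chunk_names and missing_chunks are hypothetical, sizes are given in decimal megabytes for simplicity, and it assumes every leftover chunk is complete, which a real implementation would have to verify.

```python
def chunk_names(name, total_mb, chunk_mb):
    """Return the expected chunk object names, e.g. foo/000001 .. foo/000010."""
    count = (total_mb + chunk_mb - 1) // chunk_mb  # integer ceiling division
    return [f"{name}/{i:06d}" for i in range(1, count + 1)]


def missing_chunks(expected, existing):
    """Return the expected chunk names that are not yet in the object store."""
    present = set(existing)
    return [n for n in expected if n not in present]


if __name__ == "__main__":
    # 46GB file with 4.6GB chunks, as in the example above.
    expected = chunk_names("foo", 46_000, 4_600)
    # Assume the first three chunks survived a failed upload.
    leftover = expected[:3]
    print(missing_chunks(expected, leftover))
```

Only the names returned by missing_chunks would need to be uploaded again before the file is finalized.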

This message was sent by Atlassian JIRA
