hadoop-common-dev mailing list archives

From "Ian Nowland (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-5836) Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail
Date Wed, 27 May 2009 19:51:45 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Nowland updated HADOOP-5836:
--------------------------------

    Attachment: HADOOP-5836-1.patch

* Yes, I have run Jets3tNativeS3FileSystemContractTest. Multiple times, in fact, including
for the newest patch :).
* I have reworked logging, making everything debug, except the following:
+      LOG.info("Opening key '" + key + "' for reading at position '" + pos + "'");
+      LOG.info("OutputStream for key '" + key + "' writing to tempfile '" + this.backupFile + "'");
+      LOG.info("OutputStream for key '" + key + "' closed. Now beginning upload");
+      LOG.info("OutputStream for key '" + key + "' upload complete");
+    LOG.info("Opening '" + f + "' for reading");

The basic idea is that I want to always capture in a task's syslog which S3 files it is reading
from, as this is very useful when a subset of tasks fail. I also wanted to capture, very
specifically, the time spent actually uploading the file to S3.
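
To illustrate the upload-timing point, here is a minimal sketch and not the code in the patch. The class UploadTiming, the method timedUpload and the use of java.util.logging are assumptions made for the example; only the two log message texts are taken from the lines above. It logs immediately before and after the upload step so that the gap between the two entries is the upload time.

```java
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

// Hypothetical sketch, not the HADOOP-5836 patch: brackets an upload step
// with two INFO messages so the time spent uploading is visible in the log.
class UploadTiming {
    private static final Logger LOG = Logger.getLogger(UploadTiming.class.getName());

    // Runs the upload step and returns the elapsed wall-clock time in milliseconds.
    static long timedUpload(String key, Runnable upload) {
        LOG.info("OutputStream for key '" + key + "' closed. Now beginning upload");
        long start = System.nanoTime();
        upload.run();
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        LOG.info("OutputStream for key '" + key + "' upload complete after " + elapsedMs + " ms");
        return elapsedMs;
    }

    public static void main(String[] args) {
        // The empty lambda stands in for the real upload to S3.
        long ms = timedUpload("mydir/part-00000", () -> { });
        System.out.println("elapsed ms: " + ms);
    }
}
```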

* Good catch - must have happened as part of the diff I did ignoring whitespace. I have now
gone through with a fine-tooth comb and fixed all indentation issues I could see.

* Done


> Bug in S3N handling of directory markers using an object with a trailing "/" causes jobs to fail
> ------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-5836
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5836
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: fs/s3
>    Affects Versions: 0.18.3
>            Reporter: Ian Nowland
>            Assignee: Ian Nowland
>             Fix For: 0.21.0
>
>         Attachments: HADOOP-5836-0.patch, HADOOP-5836-1.patch
>
>
> Some tools which upload to S3 use an object terminated with a "/" as a directory marker,
> for instance "s3n://mybucket/mydir/". If asked to iterate that "directory" via listStatus(),
> the current code returns an empty file "", which the InputFormatter happily assigns
> to a split, and which later causes a task to fail, and probably the job to fail.
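
The failure mode described above can be sketched in isolation. This is a hypothetical illustration rather than the attached patch: DirectoryMarkerListing and childNames are made-up names, and the listing is modelled as a plain list of key strings instead of real S3 or Hadoop calls. It shows that the marker object "mydir/" has an empty name relative to its own prefix, and that skipping it keeps the empty file "" out of the results.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the HADOOP-5836 patch: shows why a marker object
// named "mydir/" yields an empty child name and how a listing can skip it.
class DirectoryMarkerListing {

    // Returns the names, relative to dirPrefix, of all keys under dirPrefix.
    // dirPrefix is expected to end with "/".
    static List<String> childNames(List<String> keys, String dirPrefix) {
        List<String> names = new ArrayList<>();
        for (String key : keys) {
            if (!key.startsWith(dirPrefix)) {
                continue; // key lives outside this "directory"
            }
            String relative = key.substring(dirPrefix.length());
            if (relative.isEmpty()) {
                continue; // the marker object itself; would otherwise be listed as file ""
            }
            names.add(relative);
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("mydir/", "mydir/part-00000", "mydir/part-00001", "other/file");
        System.out.println(childNames(keys, "mydir/"));
    }
}
```

Without the isEmpty() check, the first entry returned for "mydir/" would be the empty string, which corresponds to the empty file that ends up assigned to a split.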

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

