hadoop-common-issues mailing list archives

From "Chris Nauroth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-12837) FileStatus.getModificationTime not working on S3
Date Wed, 24 Feb 2016 17:52:18 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163409#comment-15163409 ]

Chris Nauroth commented on HADOOP-12837:

Hello [~jagdishk].

Is this referring to a directory or a file?  If it's a directory, then s3n always returns
0 for mtime.  This is also true of s3a.  I'm not aware of any current plans to change
this behavior.

Directory mtime is harder to implement with the expected atomicity semantics against a blob
store than against a traditional file system or HDFS.  If a new file or sub-directory
gets created under a directory, then users expect the corresponding update
to mtime at the parent folder to be atomic with respect to the file/directory creation operation.
On HDFS, we can take a central lock at the NameNode to do all of the metadata manipulations
as a transaction.  For a blob store, this is multiple HTTP operations on different blob keys,
and those multiple operations do not execute as an atomic transaction.
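
To make the contrast above concrete, here is a minimal toy sketch in plain Java. It is not Hadoop or S3 code: the class names (DirectoryMtimeSketch, LockedNamespace, BlobStore), the key names, and the betweenRequests hook are all hypothetical and exist only to illustrate the point. The locked namespace updates the child entry and the parent mtime under one lock, while the blob store model issues two independent writes, so a reader that arrives between them can see the new child alongside a stale parent mtime.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: none of these classes exist in Hadoop or in any S3 client.
class DirectoryMtimeSketch {

    /** Mimics a NameNode-style namespace: one lock guards both metadata updates. */
    static class LockedNamespace {
        private final Map<String, Long> mtimes = new HashMap<>();

        synchronized void createFile(String dir, String name, long now) {
            mtimes.put(dir + "/" + name, now); // add the child entry
            mtimes.put(dir, now);              // bump the parent mtime under the same lock
        }

        // Readers take the same lock, so they see both updates or neither.
        synchronized long[] snapshot(String dir, String name) {
            return new long[] {
                mtimes.getOrDefault(dir + "/" + name, -1L),
                mtimes.getOrDefault(dir, 0L)
            };
        }
    }

    /** Mimics a blob store: each key is written by its own independent request. */
    static class BlobStore {
        private final Map<String, Long> objects = new ConcurrentHashMap<>();

        void put(String key, long mtime) { objects.put(key, mtime); }
        long mtimeOf(String key) { return objects.getOrDefault(key, 0L); }
        boolean exists(String key) { return objects.containsKey(key); }

        // betweenRequests stands in for any other client reading between the two requests.
        void createFile(String dir, String name, long now, Runnable betweenRequests) {
            put(dir + "/" + name, now); // request 1: write the child object
            betweenRequests.run();      // another client can observe the store here
            put(dir, now);              // request 2: update the directory marker
        }
    }

    public static void main(String[] args) {
        BlobStore store = new BlobStore();
        store.put("bucket/dir", 100L);
        store.createFile("bucket/dir", "part-0000", 200L, () ->
            System.out.println("child exists=" + store.exists("bucket/dir/part-0000")
                + ", parent mtime=" + store.mtimeOf("bucket/dir")));
        System.out.println("final parent mtime=" + store.mtimeOf("bucket/dir"));
    }
}
```

Running main prints "child exists=true, parent mtime=100" followed by "final parent mtime=200": the observer in the middle sees the new child while the directory still carries its old mtime, which is the window that a single lock closes in the LockedNamespace model.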

The Azure file system does provide mtime on directories, but it does not provide atomicity
of the mtime updates.  (I just mention this to demonstrate that the behavior is not always
consistent across different file system implementations.)

> FileStatus.getModificationTime not working on S3
> ------------------------------------------------
>                 Key: HADOOP-12837
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12837
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs/s3
>            Reporter: Jagdish Kewat
> Hi Team,
> We have observed an issue with the FileStatus.getModificationTime() API on the S3 filesystem.
> The method always returns 0.
> I searched for this but couldn't find a solution that fits my use case.
> S3FileStatus seems to be an option, but I would be using this API on both HDFS
> and S3, so I can't use it.
> I tried to run the job on:
> * Release label:emr-4.2.0
> * Hadoop distribution:Amazon 2.6.0
> * Hadoop Common jar: hadoop-common-2.6.0.jar
> Please advise if any patch or fix is available for this.
> Thanks,
> Jagdish

This message was sent by Atlassian JIRA