hadoop-common-issues mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-11584) s3a file block size set to 0 in getFileStatus
Date Tue, 17 Feb 2015 14:16:13 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-11584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324214#comment-14324214
] 

Steve Loughran commented on HADOOP-11584:
-----------------------------------------

Tested patch -003 & all hadoop-aws tests against S3 EU multiple times; passed.

One test failed *once*; I assume a transient network glitch, as it never recurred.
{code}
Running org.apache.hadoop.fs.s3a.TestS3AFileSystemContract
Tests run: 43, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 356.841 sec <<< FAILURE! - in org.apache.hadoop.fs.s3a.TestS3AFileSystemContract
testRenameRootDirForbidden(org.apache.hadoop.fs.s3a.TestS3AFileSystemContract)  Time elapsed: 2.492 sec  <<< FAILURE!
junit.framework.AssertionFailedError: Source exists expected:<true> but was:<false>
	at junit.framework.Assert.fail(Assert.java:57)
	at junit.framework.Assert.failNotEquals(Assert.java:329)
	at junit.framework.Assert.assertEquals(Assert.java:78)
	at junit.framework.Assert.assertEquals(Assert.java:174)
	at junit.framework.TestCase.assertEquals(TestCase.java:333)
	at org.apache.hadoop.fs.FileSystemContractBaseTest.rename(FileSystemContractBaseTest.java:490)
	at org.apache.hadoop.fs.FileSystemContractBaseTest.testRenameRootDirForbidden(FileSystemContractBaseTest.java:598)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at junit.framework.TestCase.runTest(TestCase.java:176)
	at junit.framework.TestCase.runBare(TestCase.java:141)
	at junit.framework.TestResult$1.protect(TestResult.java:122)
	at junit.framework.TestResult.runProtected(TestResult.java:142)
	at junit.framework.TestResult.run(TestResult.java:125)
	at junit.framework.TestCase.run(TestCase.java:129)
	at junit.framework.TestSuite.runTest(TestSuite.java:255)
	at junit.framework.TestSuite.run(TestSuite.java:250)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:84)
	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:264)
	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:153)
	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:124)
	at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:200)
	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:153)
	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
{code}

That assert comes from a {{FileSystem.exists(path)}} call, which delegates to
{{getFileStatus(path)}}, so it is exercising a codepath that has now changed. But the
{{exists()}} call should only fail if {{getFileStatus(path)}} raises a 404, and the
failure never recurred. Hence the assumption: *transient*.
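
For reference, the base {{FileSystem.exists()}} implementation is essentially the following (a from-memory sketch of the Hadoop 2.x code, not a verbatim copy). S3A surfaces an S3 404 from {{getFileStatus()}} as a {{FileNotFoundException}}, which {{exists()}} maps to {{false}}, exactly the value the failed assertion saw:

{code}
// org.apache.hadoop.fs.FileSystem -- sketch of the 2.x behaviour
public boolean exists(Path f) throws IOException {
  try {
    // S3AFileSystem.getFileStatus() translates an S3 404 into a
    // FileNotFoundException rather than returning null.
    return getFileStatus(f) != null;
  } catch (FileNotFoundException e) {
    // A missing (or transiently invisible) path reads as "does not
    // exist", so the assert saw false instead of an exception.
    return false;
  }
}
{code}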

> s3a file block size set to 0 in getFileStatus
> ---------------------------------------------
>
>                 Key: HADOOP-11584
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11584
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 2.6.0
>            Reporter: Dan Hecht
>            Assignee: Brahma Reddy Battula
>            Priority: Blocker
>         Attachments: HADOOP-10584-003.patch, HADOOP-111584.patch, HADOOP-11584-002.patch
>
>
> The consequence is that mapreduce probably is not splitting s3a files in the expected
> way. This is similar to HADOOP-5861 (which was for s3n, though s3n was passing 5G rather
> than 0 for block size).
> FileInputFormat.getSplits() relies on the FileStatus block size being set:
> {code}
>         if (isSplitable(job, path)) {
>           long blockSize = file.getBlockSize();
>           long splitSize = computeSplitSize(blockSize, minSize, maxSize);
> {code}
> However, S3AFileSystem does not set the FileStatus block size field. From S3AFileStatus.java:
> {code}
>   // Files
>   public S3AFileStatus(long length, long modification_time, Path path) {
>     super(length, false, 1, 0, modification_time, path);
>     isEmptyDirectory = false;
>   }
> {code}
> I think it should use S3AFileSystem.getDefaultBlockSize() for each file's block size
> (where it's currently passing 0).
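
To spell out the consequence quantitatively: in Hadoop 2.x, {{FileInputFormat.computeSplitSize()}} clamps the block size between the min and max split sizes, so a block size of 0 collapses the split size to {{minSize}} (1 byte unless {{mapreduce.input.fileinputformat.split.minsize}} is raised). A sketch of the arithmetic, assuming the stock implementation:

{code}
// org.apache.hadoop.mapreduce.lib.input.FileInputFormat (2.x):
protected long computeSplitSize(long blockSize, long minSize, long maxSize) {
  return Math.max(minSize, Math.min(maxSize, blockSize));
}

// With the s3a bug: blockSize == 0, default minSize == 1,
// default maxSize == Long.MAX_VALUE:
//   splitSize = max(1, min(Long.MAX_VALUE, 0)) = 1
// so getSplits() would degenerate towards one split per byte.
{code}

And a minimal sketch of the suggested fix, threading the default block size through to the file-variant constructor instead of hard-coding 0. The extra parameter and the call-site wiring below are illustrative, not the actual patch:

{code}
// S3AFileStatus, files variant (illustrative sketch only):
public S3AFileStatus(long length, long modification_time, Path path,
    long blockSize) {
  super(length, false, 1, blockSize, modification_time, path);
  isEmptyDirectory = false;
}

// The call site in S3AFileSystem.getFileStatus() would then pass
// getDefaultBlockSize(path) as the final argument.
{code}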



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
