hadoop-user mailing list archives

From "Ravuri, Venkata Puneet" <vrav...@ea.com>
Subject Seek behavior difference between NativeS3FsInputStream and DFSInputStream
Date Fri, 31 Oct 2014 10:23:41 GMT

I noticed a difference in behavior between seeking in a file stored in S3 using NativeS3FileSystem$NativeS3FsInputStream
and seeking in a file stored in HDFS using DFSInputStream.

If we seek to the end of the file with NativeS3FsInputStream, the seek fails with the exception
"java.io.EOFException: Attempted to seek or read past the end of the file".
This happens because a getObject request is issued on the S3 object with the range start set to
the length of the file.

DFSInputStream handles the end-of-file case safely.
Shouldn't NativeS3FsInputStream have a similar check?
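
To illustrate the kind of check I mean, here is a minimal, self-contained sketch. It is not the Hadoop code: RangedStore and GuardedStream are hypothetical names, and the store simply mimics a ranged read that rejects a start offset equal to the object length. The idea is that seek() only validates and records the position, and read() returns -1 at end of file instead of issuing a ranged request.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.Arrays;

public class SeekGuardSketch {

    /** Hypothetical stand-in for an object store that serves ranged reads. */
    static class RangedStore {
        private final byte[] data;

        RangedStore(byte[] data) {
            this.data = data;
        }

        long length() {
            return data.length;
        }

        /** Rejects a range starting at or beyond the object length, like the failure described above. */
        byte[] fetchFrom(long start) throws EOFException {
            if (start < 0 || start >= data.length) {
                throw new EOFException("Attempted to seek or read past the end of the file");
            }
            return Arrays.copyOfRange(data, (int) start, data.length);
        }
    }

    /** Hypothetical stream that defers the ranged request until a byte is actually needed. */
    static class GuardedStream {
        private final RangedStore store;
        private long pos;
        private byte[] buffer;    // null until a ranged fetch is required
        private int bufferOffset;

        GuardedStream(RangedStore store) {
            this.store = store;
        }

        void seek(long target) throws IOException {
            if (target < 0) {
                throw new EOFException("Cannot seek to a negative offset");
            }
            if (target > store.length()) {
                throw new EOFException("Cannot seek past the end of the file");
            }
            pos = target;   // target == length is allowed: no request is issued here
            buffer = null;  // discard buffered data; refetch lazily on the next read
        }

        int read() throws IOException {
            if (pos >= store.length()) {
                return -1;  // at end of file: report EOF without a ranged request
            }
            if (buffer == null) {
                buffer = store.fetchFrom(pos);
                bufferOffset = 0;
            }
            pos++;
            return buffer[bufferOffset++] & 0xFF;
        }

        long getPos() {
            return pos;
        }
    }

    public static void main(String[] args) throws IOException {
        GuardedStream in = new GuardedStream(new RangedStore(new byte[] {10, 20, 30}));
        in.seek(3);                     // seek to the end of the 3-byte "file"
        System.out.println(in.read());  // prints -1
        in.seek(1);
        System.out.println(in.read());  // prints 20
    }
}
```

With a guard like this, seeking to the file length succeeds and the next read reports end of file, while seeking beyond the length still raises EOFException.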

This issue causes errors when Hive tries to read files from S3.
Please advise.

Thanks and Regards,
