hive-user mailing list archives

From Shunsuke Mikami <shun0...@gmail.com>
Subject An error occurred while using RCFile on S3
Date Tue, 22 Mar 2011 14:27:57 GMT
Hi all,

I am testing RCFile on S3.
Queries that do not specify columns, such as "select * from table", run
successfully.
However, queries that specify columns, such as "select id from table", fail.

The job progresses to near the end of a map task, but the task cannot finish,
as the log messages below show.

2011-03-22 17:12:04,325 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for
reading at position '50000365'
2011-03-22 17:12:04,362 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for
reading at position '50458664'
2011-03-22 17:12:04,444 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for
reading at position '50603753'
2011-03-22 17:12:04,509 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for
reading at position '50651845'
2011-03-22 17:12:04,536 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for
reading at position '50735249'
2011-03-22 17:12:04,570 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for
reading at position '50956751'
2011-03-22 17:12:04,600 INFO
org.apache.hadoop.fs.s3native.NativeS3FileSystem: Opening key
'user/hive/warehouse/rcfile_logs/dt=20110312/controller=recipe/000001_0' for
reading at position '51025754'
2011-03-22 17:12:04,633 INFO org.apache.hadoop.hive.ql.exec.MapOperator: 9
finished. closing...
...
2011-03-22 17:12:05,167 WARN org.apache.hadoop.mapred.Child: Error running child
org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>51025754</ActualObjectSize><RequestId>***</RequestId><HostId>***</HostId><RangeRequested>bytes=51025754-</RangeRequested></Error>
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:229)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:220)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:133)
    at sun.reflect.GeneratedMethodAccessor14.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at org.apache.hadoop.fs.s3native.$Proxy1.retrieve(Unknown Source)
    at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:150)
    at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:76)
    at org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:56)
    at java.io.DataInputStream.skipBytes(DataInputStream.java:203)
    at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.readFields(RCFile.java:443)
    at org.apache.hadoop.hive.ql.io.RCFile$Reader.currentValueBuffer(RCFile.java:1304)
    at org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1425)
    at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:88)
    at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:39)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:66)
    at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:32)
    at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:67)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:208)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:193)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:390)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:324)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:240)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
    at org.apache.hadoop.mapred.Child.main(Child.java:234)
Caused by: org.jets3t.service.S3ServiceException: S3 GET failed for '/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0' XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><Message>The requested range is not satisfiable</Message><ActualObjectSize>51025754</ActualObjectSize><RequestId>4E5BD7E6D94DBA1B</RequestId><HostId>l+oM6yDUt+MbQgDB4pzcGckUQ1E7pbaUGy26yuTqNE4Gn+FdiJIA6u4VvsQl2+aR</HostId><RangeRequested>bytes=51025754-</RangeRequested></Error>
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRequest(RestS3Service.java:424)
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.performRestGet(RestS3Service.java:686)
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1558)
    at org.jets3t.service.impl.rest.httpclient.RestS3Service.getObjectImpl(RestS3Service.java:1501)
    at org.jets3t.service.S3Service.getObject(S3Service.java:1876)
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:129)
    ... 28 more
2011-03-22 17:12:05,170 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task


The client appears to request an invalid range: the requested range starts at
byte 51025754, which is exactly the object size, as this error shows.

S3 GET failed for
'/user%2Fhive%2Fwarehouse%2Frcfile_logs%2Fdt%3D20110312%2Fcontroller%3Drecipe%2F000001_0'
XML Error Message: <?xml version="1.0" encoding="UTF-8"?><Error>
<Code>InvalidRange</Code>
<Message>The requested range is not satisfiable</Message>
<ActualObjectSize>51025754</ActualObjectSize>
<RequestId>***</RequestId>
<HostId>***</HostId>
<RangeRequested>bytes=51025754-</RangeRequested></Error>
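
As a rough illustration of why this range is rejected, here is a minimal
sketch in Python. It is not the Hadoop or jets3t code; the function names
range_is_satisfiable and range_header_for_seek are hypothetical. It assumes
the usual HTTP rule that an open-ended range "bytes=N-" is satisfiable only
when N is smaller than the object size, so a seek to the very end of the
object cannot be turned into a GET request.

```python
from typing import Optional

OBJECT_SIZE = 51025754  # ActualObjectSize from the error message above


def range_is_satisfiable(first_byte: int, object_size: int) -> bool:
    """Return True if an open-ended request 'bytes=first_byte-' can be served.

    Valid byte offsets run from 0 to object_size - 1, so the first byte
    position must be strictly less than the object size.
    """
    return 0 <= first_byte < object_size


def range_header_for_seek(pos: int, object_size: int) -> Optional[str]:
    """Hypothetical guard: build a Range value for a seek to `pos`.

    Returns None when `pos` is exactly at end-of-file, since there is
    nothing left to read and no request should be issued.
    """
    if pos < 0 or pos > object_size:
        raise ValueError(f"seek position {pos} outside 0..{object_size}")
    if pos == object_size:
        return None  # at EOF: skip the GET instead of asking for an empty range
    return f"bytes={pos}-"


if __name__ == "__main__":
    # Second-to-last position in the log: still inside the object.
    print(range_is_satisfiable(50956751, OBJECT_SIZE))
    # Last position in the log: equal to the object size, so not satisfiable.
    print(range_is_satisfiable(51025754, OBJECT_SIZE))
    print(range_header_for_seek(51025754, OBJECT_SIZE))
```

Under these assumptions, the last "Opening key ... at position '51025754'"
line corresponds to a seek to end-of-file, which is harmless on a local or
HDFS stream but becomes an unsatisfiable range once it is sent to S3.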

This error does not occur on HDFS, so I suspect this is a bug.
Or has anyone been able to run queries using RCFile on S3?

Thanks,
-- 
Shunsuke Mikami
shun0102@gmail.com
