hive-user mailing list archives

From Biswajit Nayak <biswa...@altiscale.com>
Subject Re: Hive Cli ORC table read error with limit option
Date Fri, 04 Mar 2016 11:03:55 GMT
Anyone have any idea about this? Really stuck on it.

On Tue, Mar 1, 2016 at 4:09 PM, Biswajit Nayak <biswajit@altiscale.com>
wrote:

> Hi,
>
> It works for MR engine, while in TEZ it fails.
>
> hive> set hive.execution.engine=tez;
> hive> set hive.fetch.task.conversion=none;
> hive> select h from testdb.table_orc where year = 2016 and month = 1 and day > 29 limit 10;
>
> Query ID = 26f9a510-c10c-475c-9988-081998b66b0c
> Total jobs = 1
> Launching Job 1 out of 1
>
> Status: Running (Executing on YARN cluster with App id application_1456379707708_1135)
>
> --------------------------------------------------------------------------------
>         VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
> --------------------------------------------------------------------------------
> Map 1                 FAILED     -1          0        0       -1       0       0
> --------------------------------------------------------------------------------
> VERTICES: 00/01  [>>--------------------------] 0%    ELAPSED TIME: 0.37 s
> --------------------------------------------------------------------------------
> Status: Failed
> Vertex failed, vertexName=Map 1, vertexId=vertex_1456379707708_1135_1_00, diagnostics=[Vertex vertex_1456379707708_1135_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: table_orc initializer failed, vertex=vertex_1456379707708_1135_1_00 [Map 1], java.lang.RuntimeException: serious problem
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
>     at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:131)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 0
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)
>     ... 15 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0
>     at java.util.Collections$EmptyList.get(Collections.java:3212)
>     at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
>     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)
>     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)
>     ... 4 more
> ]
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1456379707708_1135_1_00, diagnostics=[Vertex vertex_1456379707708_1135_1_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: table_orc initializer failed, vertex=vertex_1456379707708_1135_1_00 [Map 1], java.lang.RuntimeException: serious problem
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1021)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getSplits(OrcInputFormat.java:1048)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.addSplitsForGroup(HiveInputFormat.java:306)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getSplits(HiveInputFormat.java:408)
>     at org.apache.hadoop.hive.ql.exec.tez.HiveSplitGenerator.initialize(HiveSplitGenerator.java:131)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:245)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable$1.run(RootInputInitializerManager.java:239)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:239)
>     at org.apache.tez.dag.app.dag.RootInputInitializerManager$InputInitializerCallable.call(RootInputInitializerManager.java:226)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:744)
> Caused by: java.util.concurrent.ExecutionException: java.lang.IndexOutOfBoundsException: Index: 0
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:188)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.generateSplitsInfo(OrcInputFormat.java:1016)
>     ... 15 more
> Caused by: java.lang.IndexOutOfBoundsException: Index: 0
>     at java.util.Collections$EmptyList.get(Collections.java:3212)
>     at org.apache.hadoop.hive.ql.io.orc.OrcProto$Type.getSubtypes(OrcProto.java:12240)
>     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getColumnIndicesFromNames(ReaderImpl.java:651)
>     at org.apache.hadoop.hive.ql.io.orc.ReaderImpl.getRawDataSizeOfColumns(ReaderImpl.java:634)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.populateAndCacheStripeDetails(OrcInputFormat.java:927)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:836)
>     at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$SplitGenerator.call(OrcInputFormat.java:702)
>     ... 4 more
> ]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0
>
> hive>
>
>
> On Tue, Mar 1, 2016 at 1:09 PM, Biswajit Nayak <biswajit@altiscale.com>
> wrote:
>
>> Gopal,
>>
>> Any plan to provide the fix in the Hive 1.x releases, or to backport it?
>>
>> Regards
>> Biswa
>>
>> On Tue, Mar 1, 2016 at 11:44 AM, Biswajit Nayak <biswajit@altiscale.com>
>> wrote:
>>
>>> Thanks Gopal for the details. Happy to know it has been identified and
>>> fixed.
>>>
>>> Biswa
>>>
>>>
>>> On Tue, Mar 1, 2016 at 11:37 AM, Gopal Vijayaraghavan <gopalv@apache.org
>>> > wrote:
>>>
>>>>
>>>> > Yes it is kerberos cluster.
>>>> ...
>>>> > After disabling the optimization in hive cli, it works with limit
>>>> >option.
>>>>
>>>> Alright, then it is fixed in -
>>>> https://issues.apache.org/jira/browse/HIVE-13120
>>>>
>>>>
>>>> Cheers,
>>>> Gopal
>>>>
>>>
>>
>
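For anyone who lands on this thread with the same stack trace: per the exchange above, the query succeeds once the fetch-task optimization is disabled and the MR engine is used, while the Tez split-generation path stays broken until the cluster runs a Hive build containing HIVE-13120. A sketch of that session-level workaround (the table and column names are the original poster's, shown only for illustration):

```sql
-- Session-level workaround discussed in this thread (a sketch, not an
-- official fix): avoid the Tez ORC split-generation path until a Hive
-- build with HIVE-13120 is deployed.
set hive.fetch.task.conversion=none;  -- run a real job instead of a direct fetch
set hive.execution.engine=mr;         -- MR works here; Tez hits the ORC split bug
select h from testdb.table_orc
 where year = 2016 and month = 1 and day > 29
 limit 10;
```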
