hive-user mailing list archives

From Bryan Jeffrey <bryan.jeff...@gmail.com>
Subject Re: Hive - Issue Converting Text to Orc
Date Tue, 24 Dec 2013 16:39:28 GMT
Hello.

I posted this a few weeks ago but did not get a response that solved the
issue, and I have made no headway in the meantime.  I was hoping that if I
re-summarized the issue, someone would have some advice regarding this
problem.
Running the following version of Hadoop: hadoop-2.2.0
Running the following version of Hive: hive-0.12.0

I have a simple test system set up with two datanode/node manager hosts and
one namenode/resource manager host.  Hive is running on the namenode and
uses a MySQL database for its metastore.
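
(For reference, the metastore connection in hive-site.xml is configured
along these lines; the host and database names here are placeholders rather
than my actual values:

  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://namenode-host:3306/hive_metastore</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property> )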

I have created a small table 'from_text' as follows:

[server:10001] hive> describe from_text;
foo                     int                     None
bar                     int                     None
boo                     string                  None


[server:10001] hive> select * from from_text;
1       2       Hello
2       3       World

I go to insert the data into my Orc table, 'orc_test':

[server:10001] hive> describe orc_test;
foo                     int                     from deserializer
bar                     int                     from deserializer
boo                     string                  from deserializer
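
(For reference, the two tables were created with statements along these
lines; the ROW FORMAT clause for 'from_text' is my best recollection and
may differ:

[server:10001] hive> create table from_text (foo int, bar int, boo string)
                     row format delimited fields terminated by '\t'
                     stored as textfile;
[server:10001] hive> create table orc_test (foo int, bar int, boo string)
                     stored as orc; )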


The job runs, but fails to complete with the errors below.  This seems to be
exactly the example covered here:

http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/

The full error output follows.  I have tried several things to solve the
issue, including re-installing Hive 0.12.0 from the binary distribution.

Help?

[server:10001] hive> insert into table orc_test select * from from_text;
[Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask


Diagnostic Messages for this Task:
Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
        at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
        at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
        at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
        at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
        at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
        at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
        at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
        at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
        at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
        at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
        ... 8 more
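
For what it is worth, this exception appears to be the one protobuf raises
when generated message classes and the protobuf-java runtime come from
different versions (for example, OrcProto classes generated by protoc 2.5.x
running against a 2.4.x jar).  To rule out a stray jar shadowing the right
one, the protobuf jars on the Hive and Hadoop classpaths can be listed; a
rough sketch, assuming a standard install layout:

  find $HIVE_HOME/lib $HADOOP_HOME/share/hadoop -name 'protobuf-java-*.jar'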


On Tue, Dec 17, 2013 at 11:56 AM, Bryan Jeffrey <bryan.jeffrey@gmail.com> wrote:

> Prasanth,
>
> I downloaded the binary Hive version from the URL you specified.  I
> untarred the Hive tar, copied in configuration files, and started Hive.  I
> continue to see the same error:
>
> [server:10001] hive> describe orc_test;
> foo                     int                     from deserializer
> bar                     int                     from deserializer
> boo                     string                  from deserializer
>
>
> [server:10001] hive> describe from_text;
> foo                     int                     None
> bar                     int                     None
> boo                     string                  None
>
> [server:10001] hive> select * from from_text;
> 1       2       Hello
> 2       3       World
>
> [server:10001] hive> insert into table orc_test select * from from_text;
> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>
> From the Hive Log:
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:396)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>         ... 8 more
>
>
>
>
>
> On Tue, Dec 17, 2013 at 2:31 AM, Prasanth Jayachandran <
> pjayachandran@hortonworks.com> wrote:
>
>> Bryan
>>
>> In either cases (source download or binary download) you do not need to
>> compile orc protobuf component. The java source from .proto files should be
>> already available when you download hive 0.12 release. I would recommend
>> re-downloading hive 0.12 binary release from
>> http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/ and running
>> hive directly. After extracting hive-0.12.0-bin.tar.gz
>> (http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/hive-0.12.0-bin.tar.gz),
>> set HIVE_HOME to the extracted directory and run hive. Let me know if you
>> face any issues.
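>>
>> (Roughly, assuming the mirror layout above is still current, that amounts
>> to:
>>
>>   wget http://mirror.symnds.com/software/Apache/hive/hive-0.12.0/hive-0.12.0-bin.tar.gz
>>   tar -xzf hive-0.12.0-bin.tar.gz
>>   export HIVE_HOME=$PWD/hive-0.12.0-bin
>>   $HIVE_HOME/bin/hive )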
>>
>> Thanks
>> Prasanth Jayachandran
>>
>> On Dec 16, 2013, at 5:19 PM, Bryan Jeffrey <bryan.jeffrey@gmail.com>
>> wrote:
>>
>> Prasanth,
>>
>> I simply compiled the protobuf library, and then compiled the orc
>> protobuf component.  I did not recompile either Hive or custom UDFs/etc.
>>
>> Is a protobuf recompile the solution for this issue, or a dead end?  Has
>> this been seen before?  I looked for more feedback, but most of the Orc
>> issues were associated with Hive 0.11.0.
>>
>> I will try recompiling the 2.4 protobuf version shortly!
>>
>> Bryan
>>
>>
>> On Mon, Dec 16, 2013 at 8:02 PM, Prasanth Jayachandran <
>> pjayachandran@hortonworks.com> wrote:
>>
>>> Also what are you doing with steps 2 through 5? Compiling hive or your
>>> custom code?
>>>
>>> Thanks
>>> Prasanth Jayachandran
>>>
>>> On Dec 16, 2013, at 4:55 PM, Bryan Jeffrey <bryan.jeffrey@gmail.com>
>>> wrote:
>>>
>>> Prasanth,
>>>
>>> I am running Hive 0.12.0 downloaded from the Apache Hive site.  I did
>>> not compile it.  I downloaded protobuf 2.5.0 earlier today from the Google
>>> Code site.  I compiled it via the following steps:
>>> (1) ./configure && make (to compile the C code)
>>> (2) protoc --java_out=src/main/java -I../src ../src/google/protobuf/descriptor.proto
>>> ../src/google/protobuf/orc.proto
>>> (3) Compiled the org/apache/... directory via javac
>>> (4) Created jar via jar -cf protobuf-java-2.4.1.jar org
>>> (5) Copied my protobuf-java-2.4.1.jar over the one in hive-0.12.0/lib
>>> (6) Restarted hive
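>>>
>>> (As a sanity check on steps (4) and (5), listing the jar's contents shows
>>> whether the com.google.protobuf runtime classes made it in at all, e.g.:
>>>
>>>   jar -tf protobuf-java-2.4.1.jar | grep com/google/protobuf )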
>>>
>>> Same results before/after protobuf modification.
>>>
>>> Bryan
>>>
>>>
>>> On Mon, Dec 16, 2013 at 7:34 PM, Prasanth Jayachandran <
>>> pjayachandran@hortonworks.com> wrote:
>>>
>>>> What version of protobuf are you using? Are you compiling hive from
>>>> source?
>>>>
>>>> Thanks
>>>> Prasanth Jayachandran
>>>>
>>>> On Dec 16, 2013, at 4:30 PM, Bryan Jeffrey <bryan.jeffrey@gmail.com>
>>>> wrote:
>>>>
>>>> Hello.
>>>>
>>>> Running the following version of Hadoop: hadoop-2.2.0
>>>> Running the following version of Hive: hive-0.12.0
>>>>
>>>> I have a simple test system setup with (2) datanodes/node manager and
>>>> (1) namenode/resource manager.  Hive is running on the namenode, and
>>>> contacting a MySQL database for metastore.
>>>>
>>>> I have created a small table 'from_text' as follows:
>>>>
>>>> [server:10001] hive> describe from_text;
>>>> foo                     int                     None
>>>> bar                     int                     None
>>>> boo                     string                  None
>>>>
>>>>
>>>> [server:10001] hive> select * from from_text;
>>>> 1       2       Hello
>>>> 2       3       World
>>>>
>>>> I go to insert the data into my Orc table, 'orc_test':
>>>>
>>>> [server:10001] hive> describe orc_test;
>>>> foo                     int                     from deserializer
>>>> bar                     int                     from deserializer
>>>> boo                     string                  from deserializer
>>>>
>>>>
>>>> The job runs, but fails to complete with the errors below.  This seems
>>>> to be exactly the example covered here:
>>>>
>>>>
>>>> http://hortonworks.com/blog/orcfile-in-hdp-2-better-compression-better-performance/
>>>>
>>>> I took a few minutes to recompile the protobuf library, since several
>>>> other reports mentioned that Hive 0.12 did not have the protobuf library
>>>> updated.  That did not remedy the problem.  Any ideas?
>>>>
>>>>
>>>> [server:10001] hive> insert into table orc_test select * from from_text;
>>>> [Hive Error]: Query returned non-zero code: 2, cause: FAILED: Execution
>>>> Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>
>>>>
>>>> Diagnostic Messages for this Task:
>>>> Error: java.lang.RuntimeException: Hive Runtime Error while closing operators
>>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:240)
>>>>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
>>>>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
>>>>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
>>>>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:162)
>>>>         at java.security.AccessController.doPrivileged(Native Method)
>>>>         at javax.security.auth.Subject.doAs(Subject.java:396)
>>>>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491)
>>>>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:157)
>>>> Caused by: java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.
>>>>         at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180)
>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$ColumnStatistics.getSerializedSize(OrcProto.java:3046)
>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndexEntry.getSerializedSize(OrcProto.java:4129)
>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSizeNoTag(CodedOutputStream.java:749)
>>>>         at com.google.protobuf.CodedOutputStream.computeMessageSize(CodedOutputStream.java:530)
>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcProto$RowIndex.getSerializedSize(OrcProto.java:4641)
>>>>         at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$TreeWriter.writeStripe(WriterImpl.java:548)
>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl$StructTreeWriter.writeStripe(WriterImpl.java:1328)
>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.flushStripe(WriterImpl.java:1699)
>>>>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.close(WriterImpl.java:1868)
>>>>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.close(OrcOutputFormat.java:95)
>>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator$FSPaths.closeWriters(FileSinkOperator.java:181)
>>>>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.closeOp(FileSinkOperator.java:866)
>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:596)
>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:613)
>>>>         at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.close(ExecMapper.java:207)
>>>>         ... 8 more
>>>>
>>>
>>
>
