hive-user mailing list archives

From Zhang Xiaoyu <zhangxiaoyu...@gmail.com>
Subject generating ORC file as output of a mapreduce job
Date Wed, 20 Nov 2013 02:11:22 GMT
Hi,
I am writing an MR job to generate data for Hive.

The code generates output in Text format without problems:

job.setOutputKeyClass(NullWritable.class);
job.setOutputValueClass(Text.class);

But when I change the value class from Text.class to OrcOutputFormat.class,
it throws an exception:


2013-11-20 00:50:50,613 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child :
java.lang.VerifyError: class org.apache.hadoop.ipc.protobuf.RpcHeaderProtos$RpcRequestHeaderProto
overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(ClassLoader.java:791)
	at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
	at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
	at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:423)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:356)
	at org.apache.hadoop.util.ProtoUtil.makeRpcRequestHeader(ProtoUtil.java:165)
	at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:362)
	at org.apache.hadoop.ipc.Client.getConnection(Client.java:1389)
	at org.apache.hadoop.ipc.Client.call(Client.java:1318)
	at org.apache.hadoop.ipc.Client.call(Client.java:1300)
	at org.apache.hadoop.ipc.WritableRpcEngine$Invoker.invoke(WritableRpcEngine.java:231)
	at sun.proxy.$Proxy6.getTask(Unknown Source)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:133)
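
A VerifyError of this kind ("overrides final method" on a protobuf-generated class) is often reported when two incompatible protobuf versions are on the same classpath. Whether that is the cause here is an assumption; the trace alone does not confirm it. A minimal sketch for checking which jar a given class is loaded from, using only standard JDK calls (the class name `WhichJar` and the helper `locationOf` are hypothetical names chosen for illustration):

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar or directory a class was loaded from, or a
    // placeholder when the JVM does not expose a code source
    // (for example, classes loaded by the bootstrap loader).
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    public static void main(String[] args) {
        // Class to look up; defaults to the protobuf class named in the trace.
        String name = args.length > 0
                ? args[0]
                : "com.google.protobuf.UnknownFieldSet";
        try {
            // Load without initializing, so static initializers do not run.
            Class<?> cls = Class.forName(name, false, WhichJar.class.getClassLoader());
            System.out.println(name + " -> " + locationOf(cls));
        } catch (ClassNotFoundException e) {
            System.out.println(name + " is not on this classpath");
        }
    }
}
```

Running it with the same classpath as the job (again, an assumption about how the job is launched) would show which jar supplies the protobuf classes, which can help spot a second copy bundled inside another jar.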





My objective is to generate ORC files as the output of an MR job, so that I can
load the data into Hive directly. If another approach serves the same
objective, that would be fine too. Is there any HCatalog utility I can use
to do it?


Thanks a lot,

Johnny
