pig-user mailing list archives

From felix gao <gre1...@gmail.com>
Subject Re: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to create DataStorage
Date Mon, 13 Dec 2010 17:37:17 GMT
Thanks for the clarification, Dmitriy.  I will try that out when I get the
chance.

Felix

On Fri, Dec 10, 2010 at 8:31 PM, Dmitriy Ryaboy <dvryaboy@gmail.com> wrote:

> The CDH3 distribution has security patched in; my understanding is that
> this changes the protocol, and both your server and client libraries must
> be compatible.
> You don't need the Cloudera version of pig, I think, but you do need their
> version of the Hadoop jars on both sides -- so you can't take the fat
> pig.jar, but must use the pig-nohadoop.jar version, and put the Cloudera
> Hadoop jars on your classpath.
>
> -D
>
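A minimal sketch of that launch setup, assuming hypothetical jar names and paths (the thread never shows the exact no-hadoop jar name or CDH install layout):

```shell
# Sketch of Dmitriy's suggestion (jar and path names are assumptions):
# use Pig's jar without bundled Hadoop, and put the CDH client jars
# plus the cluster conf directory on the classpath instead.
HADOOP_HOME=${HADOOP_HOME:-/usr/lib/hadoop-0.20}
PIG_JAR=${PIG_JAR:-pig-0.7.0-withouthadoop.jar}   # hypothetical no-hadoop build
CP="$PIG_JAR:$HADOOP_HOME/hadoop-core.jar:$HADOOP_HOME/conf"
for j in "$HADOOP_HOME"/lib/*.jar; do
  CP="$CP:$j"                     # CDH dependent libraries
done
echo "$CP"                        # inspect the classpath before launching
# java -cp "$CP" org.apache.pig.Main -x mapreduce
```

The point is that the Hadoop classes on the client classpath come from the CDH install, not from Pig's bundled hadoop20.jar, so the RPC protocol matches the server.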
> On Fri, Dec 10, 2010 at 11:46 AM, felix gao <gre1600@gmail.com> wrote:
>
> > I just fixed the problem.
> > I am using CDH3b2.  Apparently Cloudera has its own Pig distribution.
> > There are some major patches going into their version of Pig 0.7:
> > 0011-PIG-1452-to-remove-hadoop20.jar-from-lib-and-use-had.patch
> > 0012-CLOUDERA-BUILD.-Build-pig-against-CDH3b3-snapshot.patch
> >
> > Now I am really confused about which version to use from now on.
> >
> > Thanks for the help.
> >
> > Felix
> >
> >
> > On Fri, Dec 10, 2010 at 11:30 AM, Daniel Dai <jianyong@yahoo-inc.com>
> > wrote:
> >
> > > hadoop20.jar is more than hadoop-core.jar; it includes all the hadoop
> > > classes and their dependent libraries. Where did you get hadoop? Is
> > > that from CDH? Which version is it?
> > >
> > >
> > > Daniel
> > >
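Daniel's point is that the bundled hadoop20.jar is a repackaged fat jar, so swapping in only hadoop-core-*.jar is not equivalent. A dry-run sketch of what a rebuild would involve (paths and jar names are assumptions, not a verified recipe):

```shell
# Dry-run sketch (paths and jar names are assumptions): print the steps
# to repackage the CDH core jar plus its dependent libraries into a
# single hadoop20.jar for Pig's lib directory.
HADOOP_HOME=${HADOOP_HOME:-/usr/lib/hadoop-0.20}
WORK=/tmp/hadoop20-rebuild
echo "mkdir -p $WORK && cd $WORK"
echo "jar xf $HADOOP_HOME/hadoop-core-0.20.2+737.jar"
for j in "$HADOOP_HOME"/lib/*.jar; do
  echo "jar xf $j"                 # unpack each dependent library too
done
echo "jar cf hadoop20.jar ."
echo 'cp hadoop20.jar $PIG_HOME/lib/hadoop20.jar'
```

The hadoop20-pig-howto link Daniel mentions later in the thread describes this same repackaging idea in detail.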
> > > felix gao wrote:
> > >
> > >> Daniel,
> > >>
> > >> Here is what I did: the jar is already built by Cloudera, so I did
> > >> mv hadoop-core-0.20.2+737.jar hadoop20.jar in Pig's lib dir.
> > >>
> > >> then I did
> > >> java -Dfs.default.name=hdfs://localhost:8020 -Dmapred.job.tracker=localhost:8021 -jar pig-0.7.0-core.jar
> > >> 10/12/10 14:21:42 INFO pig.Main: Logging error messages to: /home/felix/pig-0.7.0/pig_1292008902688.log
> > >> 2010-12-10 14:21:43,014 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
> > >> 2010-12-10 14:21:43,275 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
> > >>
> > >> seems that still doesn't fix my problem.
> > >>
> > >> Felix
> > >>
> > >>
> > >> On Fri, Dec 10, 2010 at 11:10 AM, Daniel Dai <jianyong@yahoo-inc.com>
> > >> wrote:
> > >>
> > >>
> > >>
> > >>> I haven't used the Cloudera distribution before. Pig bundles the
> > >>> Apache hadoop 0.20.2 client library. If Cloudera made some changes to
> > >>> hadoop, that could be an issue.
> > >>>
> > >>> One thing you can try is to build hadoop20.jar yourself
> > >>> (http://behemoth.strlen.net/~alex/hadoop20-pig-howto.txt) and put it
> > >>> in lib, replacing the original hadoop20.jar.
> > >>>
> > >>> Daniel
> > >>>
> > >>>
> > >>> felix gao wrote:
> > >>>
> > >>>
> > >>>
> > >>>> Daniel,
> > >>>>
> > >>>> No, I am using 0.20.2 from Cloudera.
> > >>>> here is all the jar under pig's lib
> > >>>> $ ls ~/pig-0.7.0/lib
> > >>>> automaton.jar  hadoop-LICENSE.txt  hadoop-lzo.jar  hadoop18.jar
> > >>>>  hadoop20.jar  hbase-0.20.0-test.jar  hbase-0.20.0.jar  jdiff
> > >>>>  zookeeper-hbase-1329.jar
> > >>>>
> > >>>> $ ls $HADOOP_HOME
> > >>>> CHANGES.txt  LICENSE.txt  NOTICE.txt  README.txt  bin  build.xml
> > >>>> cloudera  conf  contrib  example-confs  hadoop-0.20.2+737-ant.jar
> > >>>> hadoop-0.20.2+737-core.jar  hadoop-0.20.2+737-examples.jar
> > >>>> hadoop-0.20.2+737-test.jar  hadoop-0.20.2+737-tools.jar
> > >>>> hadoop-ant-0.20.2+737.jar  hadoop-ant.jar
> > >>>> hadoop-core-0.20.2+737.jar  hadoop-core.jar
> > >>>> hadoop-examples-0.20.2+737.jar  hadoop-examples.jar
> > >>>> hadoop-test-0.20.2+737.jar  hadoop-test.jar
> > >>>> hadoop-tools-0.20.2+737.jar  hadoop-tools.jar  ivy  ivy.xml  lib
> > >>>> logs  pids  webapps
> > >>>>
> > >>>>
> > >>>> $ ls $HADOOP_HOME/lib
> > >>>> aspectjrt-1.6.5.jar  aspectjtools-1.6.5.jar  commons-cli-1.2.jar
> > >>>> commons-codec-1.4.jar  commons-daemon-1.0.1.jar  commons-el-1.0.jar
> > >>>> commons-httpclient-3.0.1.jar  commons-logging-1.0.4.jar
> > >>>> commons-logging-api-1.0.4.jar  commons-net-1.4.1.jar  core-3.1.1.jar
> > >>>> hadoop-fairscheduler-0.20.2+737.jar  hadoop-lzo-0.4.6.jar
> > >>>> hsqldb-1.8.0.10.LICENSE.txt  hsqldb-1.8.0.10.jar
> > >>>> jackson-core-asl-1.5.2.jar  jackson-mapper-asl-1.5.2.jar
> > >>>> jasper-compiler-5.5.12.jar  jasper-runtime-5.5.12.jar  jdiff
> > >>>> jets3t-0.6.1.jar  jetty-6.1.14.jar  jetty-util-6.1.14.jar  jsp-2.1
> > >>>> junit-4.5.jar  kfs-0.2.2.jar  kfs-0.2.LICENSE.txt  log4j-1.2.15.jar
> > >>>> mockito-all-1.8.2.jar  mysql-connector-java-5.0.8-bin.jar  native
> > >>>> oro-2.0.8.jar  servlet-api-2.5-6.1.14.jar  slf4j-api-1.4.3.jar
> > >>>> slf4j-log4j12-1.4.3.jar  xmlenc-0.52.jar
> > >>>>
> > >>>> Please tell me how to get this working with Pig.
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Felix
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>> On Fri, Dec 10, 2010 at 12:20 AM, Daniel Dai <daijyc@gmail.com>
> > wrote:
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>> Looks like the hadoop client jar does not match the version on the
> > >>>>> server side. Are you using hadoop 0.20.2 from Apache?
> > >>>>>
> > >>>>> Daniel
> > >>>>>
> > >>>>> -----Original Message----- From: felix gao
> > >>>>> Sent: Thursday, December 09, 2010 5:48 PM
> > >>>>> To: pig-user@hadoop.apache.org
> > >>>>> Subject: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to create DataStorage
> > >>>>>
> > >>>>>
> > >>>>> I kept seeing a "Failed to create DataStorage" error when trying to
> > >>>>> run pig:
> > >>>>>
> > >>>>> $ java -cp pig-0.7.0-core.jar:$HADOOP_CONF_DIR org.apache.pig.Main -x mapreduce
> > >>>>> 10/12/09 20:35:31 INFO pig.Main: Logging error messages to: /home/testpig/pig-0.7.0/pig_1291944931735.log
> > >>>>> 2010-12-09 20:35:31,997 [main] INFO  org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
> > >>>>> 2010-12-09 20:35:32,333 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
> > >>>>>
> > >>>>> $ cat pig_1291944931735.log
> > >>>>> Error before Pig is launched
> > >>>>> ----------------------------
> > >>>>> ERROR 2999: Unexpected internal error. Failed to create DataStorage
> > >>>>>
> > >>>>> java.lang.RuntimeException: Failed to create DataStorage
> > >>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
> > >>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
> > >>>>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:216)
> > >>>>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:126)
> > >>>>> at org.apache.pig.impl.PigContext.connect(PigContext.java:184)
> > >>>>> at org.apache.pig.PigServer.<init>(PigServer.java:184)
> > >>>>> at org.apache.pig.PigServer.<init>(PigServer.java:173)
> > >>>>> at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
> > >>>>> at org.apache.pig.Main.main(Main.java:354)
> > >>>>> Caused by: java.io.IOException: Call to localhost/127.0.0.1:8020 failed on local exception: java.io.EOFException
> > >>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
> > >>>>> at org.apache.hadoop.ipc.Client.call(Client.java:743)
> > >>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> > >>>>> at $Proxy0.getProtocolVersion(Unknown Source)
> > >>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
> > >>>>> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
> > >>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
> > >>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
> > >>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
> > >>>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
> > >>>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
> > >>>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
> > >>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
> > >>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
> > >>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
> > >>>>> ... 8 more
> > >>>>> Caused by: java.io.EOFException
> > >>>>> at java.io.DataInputStream.readInt(DataInputStream.java:375)
> > >>>>> at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
> > >>>>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
> > >>>>> If I run the java -cp pig-0.7.0-core.jar org.apache.pig.Main -x
> > >>>>> mapreduce command (without $HADOOP_CONF_DIR), I can at least see the
> > >>>>> grunt shell.
> > >>>>>
> > >>>>> However, when using hadoop commands
> > >>>>> $ hadoop fs -ls
> > >>>>> Found 1 items
> > >>>>> -rw-r--r--   1 testpig supergroup     454557 2010-12-09 19:31
> > >>>>> /user/testpig/access_log.2010-08-30-23-01.lzo
> > >>>>>
> > >>>>> everything seems to be fine connecting to hdfs.
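This asymmetry fits a client-jar mismatch: the hadoop CLI builds its classpath from $HADOOP_HOME (the CDH jars), while `java -cp pig-0.7.0-core.jar` loads Pig's bundled Apache 0.20.2 Hadoop classes. A quick hypothetical check (jar locations are assumptions) is to see which jar supplies the Hadoop RPC client class on each path:

```shell
# Sketch: show whether each candidate jar contains the Hadoop RPC
# client class (jar paths are assumptions for illustration only).
for jar in pig-0.7.0-core.jar "$HADOOP_HOME"/hadoop-core.jar; do
  if [ -f "$jar" ]; then
    echo "$jar:"
    # list the class entry if the jar ships it
    unzip -l "$jar" | grep 'org/apache/hadoop/ipc/Client.class' || true
  else
    echo "$jar: not present here"
  fi
done
```

If both jars ship `org.apache.hadoop.ipc.Client` but from different builds, whichever one is first on the classpath determines the wire protocol the client speaks.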
> > >>>>>
> > >>>>> My environment has the following settings:
> > >>>>> PIG_HOME=/home/testpig/pig-0.7.0
> > >>>>> HADOOP_HOME=/usr/lib/hadoop-0.20    (cloudera distribution)
> > >>>>> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20/conf
> > >>>>> JAVA_HOME=/usr/java/default
> > >>>>>
> > >>>>> pig-env.sh has the following settings:
> > >>>>> export PIG_OPTS="$PIG_OPTS
> > >>>>> -Djava.library.path=$HADOOP_HOME/lib/native/Linux-amd64-64"
> > >>>>> export PIG_CLASSPATH=$PIG_CLASSPATH:/home/testpig/hadoop-lzo.jar:/home/testpig/elephant-bird.jar:/home/testpig/elephant-bird/lib/*
> > >>>>> export PIG_HADOOP_VERSION=20
> > >>>>>
> > >>>>>
> > >>>>> What is going on there?
> > >>>>>
> > >>>>> Thanks a lot.
> > >>>>>
> > >>>>> Felix
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>
> > >
> >
>
