pig-user mailing list archives

From felix gao <gre1...@gmail.com>
Subject Re: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to create DataStorage
Date Fri, 10 Dec 2010 19:46:11 GMT
I just fixed the problem.
I am using CDH3b2. Apparently Cloudera has its own Pig distribution.
There are some major patches in their version of Pig 0.7:
0011-PIG-1452-to-remove-hadoop20.jar-from-lib-and-use-had.patch
0012-CLOUDERA-BUILD.-Build-pig-against-CDH3b3-snapshot.patch

Now I am really confused about which version to use from now on.

Thanks for the help.

Felix
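
A rough way to check Daniel's point (quoted below) that hadoop20.jar holds more than hadoop-core.jar is to compare which packages each jar contains. The following is a minimal Python sketch, not part of Pig or Hadoop. The function names, the depth parameter and the script name in the usage note are hypothetical, chosen only for illustration. It relies on the standard zipfile module, since a jar is a zip archive.

```python
import sys
import zipfile


def top_level_packages(jar_path, depth=3):
    """Return the package prefixes (at most `depth` components long)
    that contain at least one .class file in the given jar."""
    prefixes = set()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue  # skip manifests, resources, directory entries
            parts = name.split("/")[:-1]  # drop the class file name
            if parts:
                prefixes.add(".".join(parts[:depth]))
    return prefixes


def missing_packages(bundled_jar, candidate_jar, depth=3):
    """Sorted package prefixes present in bundled_jar but absent
    from candidate_jar."""
    return sorted(
        top_level_packages(bundled_jar, depth)
        - top_level_packages(candidate_jar, depth)
    )


if __name__ == "__main__":
    # Expects two jar paths: the bundled jar first, then the candidate.
    if len(sys.argv) == 3:
        for pkg in missing_packages(sys.argv[1], sys.argv[2]):
            print(pkg)
```

Saved under a hypothetical name such as jardiff.py, it would be run with the original hadoop20.jar as the first argument and the cluster's hadoop-core jar as the second. It only reports differences in jar contents; it does not by itself establish the cause of ERROR 2999.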


On Fri, Dec 10, 2010 at 11:30 AM, Daniel Dai <jianyong@yahoo-inc.com> wrote:

> hadoop20.jar is more than hadoop-core.jar; it includes all Hadoop classes
> and dependent libraries. Where did you get Hadoop? Is it from CDH? Which
> version is it?
>
>
> Daniel
>
> felix gao wrote:
>
>> Daniel,
>>
>> Here is what I did. The jar is already built by Cloudera, so I moved
>> hadoop-core-0.20.2+737.jar into Pig's lib dir as hadoop20.jar.
>>
>> Then I ran
>> java -Dfs.default.name=hdfs://localhost:8020 -Dmapred.job.tracker=localhost:8021 -jar pig-0.7.0-core.jar
>> 10/12/10 14:21:42 INFO pig.Main: Logging error messages to: /home/felix/pig-0.7.0/pig_1292008902688.log
>> 2010-12-10 14:21:43,014 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
>> 2010-12-10 14:21:43,275 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
>>
>> It seems that still doesn't fix my problem.
>>
>> Felix
>>
>>
>> On Fri, Dec 10, 2010 at 11:10 AM, Daniel Dai <jianyong@yahoo-inc.com>
>> wrote:
>>
>>
>>
>>> I haven't used the Cloudera distribution before. Pig bundles the Apache
>>> Hadoop 0.20.2 client library. If Cloudera made changes to Hadoop, that
>>> could be the issue.
>>>
>>> One thing you can try is to build hadoop20.jar yourself
>>> (http://behemoth.strlen.net/~alex/hadoop20-pig-howto.txt) and put it in
>>> lib, replacing the original hadoop20.jar.
>>>
>>> Daniel
>>>
>>>
>>> felix gao wrote:
>>>
>>>
>>>
>>>> Daniel,
>>>>
>>>> No, I am using 0.20.2 from Cloudera.
>>>> Here are all the jars under Pig's lib:
>>>> $ ls ~/pig-0.7.0/lib
>>>> automaton.jar  hadoop-LICENSE.txt  hadoop-lzo.jar  hadoop18.jar
>>>>  hadoop20.jar  hbase-0.20.0-test.jar  hbase-0.20.0.jar  jdiff
>>>>  zookeeper-hbase-1329.jar
>>>>
>>>> $ ls $HADOOP_HOME
>>>> CHANGES.txt  LICENSE.txt  NOTICE.txt  README.txt  bin  build.xml
>>>> cloudera  conf  contrib  example-confs
>>>> hadoop-0.20.2+737-ant.jar  hadoop-0.20.2+737-core.jar
>>>> hadoop-0.20.2+737-examples.jar  hadoop-0.20.2+737-test.jar
>>>> hadoop-0.20.2+737-tools.jar  hadoop-ant-0.20.2+737.jar  hadoop-ant.jar
>>>> hadoop-core-0.20.2+737.jar  hadoop-core.jar
>>>> hadoop-examples-0.20.2+737.jar  hadoop-examples.jar
>>>> hadoop-test-0.20.2+737.jar  hadoop-test.jar
>>>> hadoop-tools-0.20.2+737.jar  hadoop-tools.jar
>>>> ivy  ivy.xml  lib  logs  pids  webapps
>>>>
>>>>
>>>> $ ls  $HADOOP_HOME/lib
>>>> aspectjrt-1.6.5.jar  aspectjtools-1.6.5.jar  commons-cli-1.2.jar
>>>> commons-codec-1.4.jar  commons-daemon-1.0.1.jar  commons-el-1.0.jar
>>>> commons-httpclient-3.0.1.jar  commons-logging-1.0.4.jar
>>>> commons-logging-api-1.0.4.jar  commons-net-1.4.1.jar  core-3.1.1.jar
>>>> hadoop-fairscheduler-0.20.2+737.jar  hadoop-lzo-0.4.6.jar
>>>> hsqldb-1.8.0.10.LICENSE.txt  hsqldb-1.8.0.10.jar
>>>> jackson-core-asl-1.5.2.jar  jackson-mapper-asl-1.5.2.jar
>>>> jasper-compiler-5.5.12.jar  jasper-runtime-5.5.12.jar  jdiff
>>>> jets3t-0.6.1.jar  jetty-6.1.14.jar  jetty-util-6.1.14.jar  jsp-2.1
>>>> junit-4.5.jar  kfs-0.2.2.jar  kfs-0.2.LICENSE.txt  log4j-1.2.15.jar
>>>> mockito-all-1.8.2.jar  mysql-connector-java-5.0.8-bin.jar  native
>>>> oro-2.0.8.jar  servlet-api-2.5-6.1.14.jar  slf4j-api-1.4.3.jar
>>>> slf4j-log4j12-1.4.3.jar  xmlenc-0.52.jar
>>>>
>>>> Please tell me how to get this working with Pig.
>>>>
>>>> Thanks,
>>>>
>>>> Felix
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Dec 10, 2010 at 12:20 AM, Daniel Dai <daijyc@gmail.com> wrote:
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> It looks like the Hadoop client jar does not match the server-side
>>>>> version. Are you using Hadoop 0.20.2 from Apache?
>>>>>
>>>>> Daniel
>>>>>
>>>>> -----Original Message----- From: felix gao
>>>>> Sent: Thursday, December 09, 2010 5:48 PM
>>>>> To: pig-user@hadoop.apache.org
>>>>> Subject: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to create DataStorage
>>>>>
>>>>>
>>>>> I keep seeing a "Failed to create DataStorage" error when I try to run Pig:
>>>>>
>>>>> $ java -cp pig-0.7.0-core.jar:$HADOOP_CONF_DIR org.apache.pig.Main -x mapreduce
>>>>> 10/12/09 20:35:31 INFO pig.Main: Logging error messages to: /home/testpig/pig-0.7.0/pig_1291944931735.log
>>>>> 2010-12-09 20:35:31,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
>>>>> 2010-12-09 20:35:32,333 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
>>>>>
>>>>> $ cat pig_1291944931735.log
>>>>> Error before Pig is launched
>>>>> ----------------------------
>>>>> ERROR 2999: Unexpected internal error. Failed to create DataStorage
>>>>>
>>>>> java.lang.RuntimeException: Failed to create DataStorage
>>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
>>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
>>>>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:216)
>>>>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:126)
>>>>> at org.apache.pig.impl.PigContext.connect(PigContext.java:184)
>>>>> at org.apache.pig.PigServer.<init>(PigServer.java:184)
>>>>> at org.apache.pig.PigServer.<init>(PigServer.java:173)
>>>>> at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
>>>>> at org.apache.pig.Main.main(Main.java:354)
>>>>> Caused by: java.io.IOException: Call to localhost/127.0.0.1:8020 failed on local exception: java.io.EOFException
>>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>>>>> at org.apache.hadoop.ipc.Client.call(Client.java:743)
>>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>>> at $Proxy0.getProtocolVersion(Unknown Source)
>>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>>>>> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
>>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
>>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
>>>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
>>>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
>>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
>>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
>>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
>>>>> ... 8 more
>>>>> Caused by: java.io.EOFException
>>>>> at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>>> at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>>>>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>>>>>
>>>>> If I run the command
>>>>> java -cp pig-0.7.0-core.jar org.apache.pig.Main -x mapreduce
>>>>> instead, I can at least see the grunt shell.
>>>>>
>>>>> However, when using hadoop commands
>>>>> $ hadoop fs -ls
>>>>> Found 1 items
>>>>> -rw-r--r--   1 testpig supergroup     454557 2010-12-09 19:31 /user/testpig/access_log.2010-08-30-23-01.lzo
>>>>>
>>>>> everything seems to be fine connecting to hdfs.
>>>>>
>>>>> My environment has the following settings:
>>>>> PIG_HOME=/home/testpig/pig-0.7.0
>>>>> HADOOP_HOME=/usr/lib/hadoop-0.20    (cloudera distribution)
>>>>> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20/conf
>>>>> JAVA_HOME=/usr/java/default
>>>>>
>>>>> pig-env.sh has the following settings:
>>>>> export PIG_OPTS="$PIG_OPTS -Djava.library.path=$HADOOP_HOME/lib/native/Linux-amd64-64"
>>>>> export PIG_CLASSPATH=$PIG_CLASSPATH:/home/testpig/hadoop-lzo.jar:/home/testpig/elephant-bird.jar:/home/testpig/elephant-bird/lib/*
>>>>> export PIG_HADOOP_VERSION=20
>>>>>
>>>>>
>>>>> What is going on there?
>>>>>
>>>>> Thanks a lot.
>>>>>
>>>>> Felix
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>
