pig-user mailing list archives

From Daniel Dai <jiany...@yahoo-inc.com>
Subject Re: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to create DataStorage
Date Fri, 10 Dec 2010 19:30:17 GMT
hadoop20.jar is more than hadoop-core.jar: it includes all the Hadoop
classes plus their dependent libraries, so renaming hadoop-core.jar is not
enough. Where did you get Hadoop? Is it from CDH? Which version is it?
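
One way to see the difference between the two jars is to list which packages each one actually contains. The following is only a sketch: the helper name `package_summary` and the `depth` parameter are made up for illustration, and it relies on nothing beyond the fact that a jar is a zip archive. A core-only jar would be expected to show just Hadoop's own packages, while a bundled jar should also show the packages of the dependent libraries.

```python
import zipfile
from collections import Counter


def package_summary(jar_path, depth=3):
    """Count .class files per package prefix in a jar.

    `depth` is the number of leading package components to keep,
    e.g. depth=3 turns org/apache/hadoop/fs/Path.class into
    the prefix "org.apache.hadoop".
    """
    counts = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for name in jar.namelist():
            if not name.endswith(".class"):
                continue  # skip manifests, resources, directories
            parts = name.split("/")[:-1]  # drop the class file name
            prefix = ".".join(parts[:depth]) or "(default package)"
            counts[prefix] += 1
    return counts
```

Calling `package_summary` on the renamed Cloudera jar and on the hadoop20.jar that ships with Pig, then comparing the two sets of prefixes, should show whether the dependent libraries are missing from the renamed one.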

Daniel

felix gao wrote:
> Daniel,
>
> Here is what I did. The jar is already built by Cloudera, so I did
> mv hadoop-core-0.20.2+737.jar hadoop20.jar (into Pig's lib dir)
>
> then I did
> java -Dfs.default.name=hdfs://localhost:8020 -Dmapred.job.tracker=localhost:8021 -jar pig-0.7.0-core.jar
> 10/12/10 14:21:42 INFO pig.Main: Logging error messages to: /home/felix/pig-0.7.0/pig_1292008902688.log
> 2010-12-10 14:21:43,014 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
> 2010-12-10 14:21:43,275 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
>
> It seems that still doesn't fix my problem.
>
> Felix
>
>
> On Fri, Dec 10, 2010 at 11:10 AM, Daniel Dai <jianyong@yahoo-inc.com> wrote:
>
>   
>> I haven't used the Cloudera distribution before. Pig bundles the Apache
>> Hadoop 0.20.2 client library. If Cloudera made changes to Hadoop, that
>> could be the issue.
>>
>> One thing you can try is to build hadoop20.jar yourself (
>> http://behemoth.strlen.net/~alex/hadoop20-pig-howto.txt) and put it in lib,
>> replacing the original hadoop20.jar.
>>
>> Daniel
>>
>>
>> felix gao wrote:
>>
>>     
>>> Daniel,
>>>
>>> No, I am using 0.20.2 from Cloudera.
>>> Here are all the jars under Pig's lib:
>>> $ ls ~/pig-0.7.0/lib
>>> automaton.jar  hadoop-LICENSE.txt  hadoop-lzo.jar  hadoop18.jar
>>> hadoop20.jar  hbase-0.20.0-test.jar  hbase-0.20.0.jar  jdiff
>>> zookeeper-hbase-1329.jar
>>>
>>> $ ls $HADOOP_HOME
>>> CHANGES.txt  build.xml  hadoop-0.20.2+737-ant.jar  hadoop-ant-0.20.2+737.jar  hadoop-examples.jar  ivy  webapps
>>> LICENSE.txt  cloudera  hadoop-0.20.2+737-core.jar  hadoop-ant.jar  hadoop-test-0.20.2+737.jar  ivy.xml
>>> NOTICE.txt  conf  hadoop-0.20.2+737-examples.jar  hadoop-core-0.20.2+737.jar  hadoop-test.jar  lib
>>> README.txt  contrib  hadoop-0.20.2+737-test.jar  hadoop-core.jar  hadoop-tools-0.20.2+737.jar  logs
>>> bin  example-confs  hadoop-0.20.2+737-tools.jar  hadoop-examples-0.20.2+737.jar  hadoop-tools.jar  pids
>>>
>>>
>>> $ ls  $HADOOP_HOME/lib
>>> aspectjrt-1.6.5.jar  commons-logging-api-1.0.4.jar  jackson-mapper-asl-1.5.2.jar  junit-4.5.jar  servlet-api-2.5-6.1.14.jar
>>> aspectjtools-1.6.5.jar  commons-net-1.4.1.jar  jasper-compiler-5.5.12.jar  kfs-0.2.2.jar  slf4j-api-1.4.3.jar
>>> commons-cli-1.2.jar  core-3.1.1.jar  jasper-runtime-5.5.12.jar  kfs-0.2.LICENSE.txt  slf4j-log4j12-1.4.3.jar
>>> commons-codec-1.4.jar  hadoop-fairscheduler-0.20.2+737.jar  jdiff  log4j-1.2.15.jar  xmlenc-0.52.jar
>>> commons-daemon-1.0.1.jar  hadoop-lzo-0.4.6.jar  jets3t-0.6.1.jar  mockito-all-1.8.2.jar
>>> commons-el-1.0.jar  hsqldb-1.8.0.10.LICENSE.txt  jetty-6.1.14.jar  mysql-connector-java-5.0.8-bin.jar
>>> commons-httpclient-3.0.1.jar  hsqldb-1.8.0.10.jar  jetty-util-6.1.14.jar  native
>>> commons-logging-1.0.4.jar  jackson-core-asl-1.5.2.jar  jsp-2.1  oro-2.0.8.jar
>>>
>>> Please tell me how to get this working with Pig.
>>>
>>> Thanks,
>>>
>>> Felix
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Dec 10, 2010 at 12:20 AM, Daniel Dai <daijyc@gmail.com> wrote:
>>>
>>>
>>>
>>>       
>>>> It looks like the Hadoop client jar does not match the server-side
>>>> version. Are you using Hadoop 0.20.2 from Apache?
>>>>
>>>> Daniel
>>>>
>>>> -----Original Message----- From: felix gao
>>>> Sent: Thursday, December 09, 2010 5:48 PM
>>>> To: pig-user@hadoop.apache.org
>>>> Subject: Strange problem with Pig 0.7.0 and Hadoop 0.20.2 and Failed to
>>>> create DataStorage
>>>>
>>>>
>>>> I keep seeing the "Failed to create DataStorage" error when I try to run Pig:
>>>>
>>>> $ java -cp pig-0.7.0-core.jar:$HADOOP_CONF_DIR org.apache.pig.Main -x mapreduce
>>>> 10/12/09 20:35:31 INFO pig.Main: Logging error messages to: /home/testpig/pig-0.7.0/pig_1291944931735.log
>>>> 2010-12-09 20:35:31,997 [main] INFO org.apache.pig.backend.hadoop.executionengine.HExecutionEngine - Connecting to hadoop file system at: hdfs://localhost:8020
>>>> 2010-12-09 20:35:32,333 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
>>>>
>>>> $ cat pig_1291944931735.log
>>>> Error before Pig is launched
>>>> ----------------------------
>>>> ERROR 2999: Unexpected internal error. Failed to create DataStorage
>>>>
>>>> java.lang.RuntimeException: Failed to create DataStorage
>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:75)
>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.<init>(HDataStorage.java:58)
>>>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:216)
>>>> at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.init(HExecutionEngine.java:126)
>>>> at org.apache.pig.impl.PigContext.connect(PigContext.java:184)
>>>> at org.apache.pig.PigServer.<init>(PigServer.java:184)
>>>> at org.apache.pig.PigServer.<init>(PigServer.java:173)
>>>> at org.apache.pig.tools.grunt.Grunt.<init>(Grunt.java:54)
>>>> at org.apache.pig.Main.main(Main.java:354)
>>>> Caused by: java.io.IOException: Call to localhost/127.0.0.1:8020 failed on local exception: java.io.EOFException
>>>> at org.apache.hadoop.ipc.Client.wrapException(Client.java:775)
>>>> at org.apache.hadoop.ipc.Client.call(Client.java:743)
>>>> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>> at $Proxy0.getProtocolVersion(Unknown Source)
>>>> at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>>>> at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
>>>> at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
>>>> at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
>>>> at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
>>>> at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>> at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
>>>> at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:95)
>>>> at org.apache.pig.backend.hadoop.datastorage.HDataStorage.init(HDataStorage.java:72)
>>>> ... 8 more
>>>> Caused by: java.io.EOFException
>>>> at java.io.DataInputStream.readInt(DataInputStream.java:375)
>>>> at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:501)
>>>> at org.apache.hadoop.ipc.Client$Connection.run(Client.java:446)
>>>>
>>>> If I run java -cp pig-0.7.0-core.jar org.apache.pig.Main -x mapreduce
>>>> (without $HADOOP_CONF_DIR on the classpath), I can at least see the
>>>> grunt shell.
>>>>
>>>> However, when I use hadoop commands directly
>>>> $ hadoop fs -ls
>>>> Found 1 items
>>>> -rw-r--r--   1 testpig supergroup     454557 2010-12-09 19:31 /user/testpig/access_log.2010-08-30-23-01.lzo
>>>>
>>>> everything seems to be fine connecting to HDFS.
>>>>
>>>> My environment has the following settings:
>>>> PIG_HOME=/home/testpig/pig-0.7.0
>>>> HADOOP_HOME=/usr/lib/hadoop-0.20    (cloudera distribution)
>>>> HADOOP_CONF_DIR=/usr/lib/hadoop-0.20/conf
>>>> JAVA_HOME=/usr/java/default
>>>>
>>>> pig-env.sh has the following settings:
>>>> export PIG_OPTS="$PIG_OPTS -Djava.library.path=$HADOOP_HOME/lib/native/Linux-amd64-64"
>>>> export PIG_CLASSPATH=$PIG_CLASSPATH:/home/testpig/hadoop-lzo.jar:/home/testpig/elephant-bird.jar:/home/testpig/elephant-bird/lib/*
>>>> export PIG_HADOOP_VERSION=20
>>>>
>>>>
>>>> What is going on here?
>>>>
>>>> Thanks a lot.
>>>>
>>>> Felix
>>>>
>>>>
>>>>
>>>>         

