incubator-cassandra-user mailing list archives

From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject Re: pig + hadoop
Date Wed, 20 Apr 2011 03:29:18 GMT
Just as an example:

  <property>
    <name>cassandra.thrift.address</name>
    <value>10.12.34.56</value>
  </property>
  <property>
    <name>cassandra.thrift.port</name>
    <value>9160</value>
  </property>
  <property>
    <name>cassandra.partitioner.class</name>
    <value>org.apache.cassandra.dht.RandomPartitioner</value>
  </property>
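
For what it's worth, here is a rough sketch of why a missing port shows up as "NumberFormatException: null" in the trace quoted below. This is not the actual ConfigHelper code: the class name RpcPortLookup, both method names, and the env-before-property precedence are assumptions made for illustration. Only the variable and property names come from the README. The point is that when neither source supplies a value, the lookup returns null, and Integer.parseInt(null) throws NumberFormatException.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hypothetical stand-in, not Cassandra's ConfigHelper.
public class RpcPortLookup {

    // Assumed precedence: the environment variable wins, otherwise the
    // Hadoop-style property is used. Returns null when neither is set.
    public static String lookup(Map<String, String> env, Map<String, String> conf,
                                String envKey, String confKey) {
        String fromEnv = env.get(envKey);
        return fromEnv != null ? fromEnv : conf.get(confKey);
    }

    // Integer.parseInt(null) throws NumberFormatException, which is what
    // surfaces when the port was never configured for the running task.
    public static int rpcPort(Map<String, String> env, Map<String, String> conf) {
        return Integer.parseInt(lookup(env, conf, "PIG_RPC_PORT", "cassandra.thrift.port"));
    }

    public static void main(String[] args) {
        Map<String, String> env = new HashMap<>();   // stands in for the task's environment
        Map<String, String> conf = new HashMap<>();  // stands in for the job configuration

        try {
            rpcPort(env, conf);
        } catch (NumberFormatException e) {
            System.out.println("port not set: " + e.getClass().getName());
        }

        // Once the property is present, the lookup succeeds.
        conf.put("cassandra.thrift.port", "9160");
        System.out.println(rpcPort(env, conf)); // prints 9160
    }
}
```

In mapreduce mode the lookup happens inside the task on the cluster, so a variable exported only in the shell that launches pig may not be visible there, whereas a property in the job configuration travels with the job.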


On Apr 19, 2011, at 10:28 PM, Jeremy Hanna wrote:

> Oh yeah, that's what's going on. On the machine that I run the pig script from, I set the PIG_CONF variable to my HADOOP_HOME/conf directory, and in the mapred-site.xml file found there I set the three variables.
> 
> I don't use environment variables when I run against a cluster.
> 
> On Apr 19, 2011, at 9:54 PM, Jeffrey Wang wrote:
> 
>> Did you set PIG_RPC_PORT in your hadoop-env.sh? I was seeing this error for a while before I added that.
>> 
>> -Jeffrey
>> 
>> From: pob [mailto:peterob333@gmail.com] 
>> Sent: Tuesday, April 19, 2011 6:42 PM
>> To: user@cassandra.apache.org
>> Subject: Re: pig + hadoop
>> 
>> Hey Aaron,
>> 
>> I read it, and all three environment variables were exported. The results are the same.
>> 
>> Best,
>> P
>> 
>> 2011/4/20 aaron morton <aaron@thelastpickle.com>
>> I'm guessing, but here goes. It looks like the Cassandra RPC port is not set. Did you follow these steps in contrib/pig/README.txt?
>> 
>> Finally, set the following as environment variables (uppercase,
>> underscored), or as Hadoop configuration variables (lowercase, dotted):
>> * PIG_RPC_PORT or cassandra.thrift.port : the port thrift is listening on 
>> * PIG_INITIAL_ADDRESS or cassandra.thrift.address : initial address to connect to
>> * PIG_PARTITIONER or cassandra.partitioner.class : cluster partitioner
>> 
>> Hope that helps. 
>> Aaron
>> 
>> 
>> On 20 Apr 2011, at 11:28, pob wrote:
>> 
>> 
>> Hello, 
>> 
>> I configured the cluster following http://wiki.apache.org/cassandra/HadoopSupport. When I run pig example-script.pig -x local, everything is fine and I get correct results.
>> 
>> The problem occurs with -x mapreduce.
>> 
>> I'm getting these errors:
>> 
>> 
>> 2011-04-20 01:24:21,791 [main] ERROR org.apache.pig.tools.pigstats.PigStats - ERROR: java.lang.NumberFormatException: null
>> 2011-04-20 01:24:21,792 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
>> 2011-04-20 01:24:21,793 [main] INFO  org.apache.pig.tools.pigstats.PigStats - Script Statistics: 
>> 
>> Input(s):
>> Failed to read data from "cassandra://Keyspace1/Standard1"
>> 
>> Output(s):
>> Failed to produce result in "hdfs://ip:54310/tmp/temp-1383865669/tmp-1895601791"
>> 
>> Counters:
>> Total records written : 0
>> Total bytes written : 0
>> Spillable Memory Manager spill count : 0
>> Total bags proactively spilled: 0
>> Total records proactively spilled: 0
>> 
>> Job DAG:
>> job_201104200056_0005   ->      null,
>> null    ->      null,
>> null
>> 
>> 
>> 2011-04-20 01:24:21,793 [main] INFO  org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Failed!
>> 2011-04-20 01:24:21,803 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias topnames. Backend error : java.lang.NumberFormatException: null
>> 
>> 
>> 
>> ====
>> That's from the job tasks web management page; this is the error from the task directly:
>> 
>> java.lang.RuntimeException: java.lang.NumberFormatException: null
>> at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:123)
>> at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.initialize(PigRecordReader.java:176)
>> at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:418)
>> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:620)
>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>> at org.apache.hadoop.mapred.Child.main(Child.java:170)
>> Caused by: java.lang.NumberFormatException: null
>> at java.lang.Integer.parseInt(Integer.java:417)
>> at java.lang.Integer.parseInt(Integer.java:499)
>> at org.apache.cassandra.hadoop.ConfigHelper.getRpcPort(ConfigHelper.java:233)
>> at org.apache.cassandra.hadoop.ColumnFamilyRecordReader.initialize(ColumnFamilyRecordReader.java:105)
>> ... 5 more
>> 
>> 
>> 
>> Any suggestions as to where the problem might be?
>> 
>> Thanks,
>> 
>> 
>> 
> 

