incubator-cassandra-user mailing list archives

From Jeremy Hanna <jeremy.hanna1...@gmail.com>
Subject Re: pig integration & NoClassDefFoundError TypeParser
Date Mon, 20 Jun 2011 18:44:37 GMT
Try running with the cdh3u0 version of Pig and see if it has the same problem.  Cloudera backported
the patch that adds the updated Jackson dependency for Avro.  The patch itself targets Pig 0.9, which
should be out in time for the Hadoop Summit next week.  The download URL is http://archive.cloudera.com/cdh/3/pig-0.8.0-cdh3u0.tar.gz

Alternatively, I believe Brisk beta 2 will be out today, and it has Pig integrated.  I'm not sure
whether that would work for your current environment, though.

See if that works.

On Jun 20, 2011, at 1:09 PM, Sasha Dolgy wrote:

> I've been trying for the past little while to get the Pig integration
> working with Cassandra 0.8.0.
> 
> 1.  Downloaded the source for 0.8.0 and ran ant build.
> 2.  Went into contrib/pig and ran ant, which gives me
> /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/build/cassandra_storage.jar,
> which is copied into the lib/ directory.
> 3.  Downloaded pig-0.8.1, modified ivy/libraries.properties so
> that it uses Jackson 1.8.2, and ran ant.  It compiles and gives me
> two jars:  pig-0.8.1-SNAPSHOT-core.jar and pig-0.8.1-SNAPSHOT.jar
> ----- I did try to run it with Jackson 1.4 as
> contrib/pig/README.txt suggested, but that failed.  The referenced
> JIRA ticket (PIG-1863) suggests 1.6.0, which still produces the same
> results.
> 
> Environment variables are set:
> java version "1.6.0_24"
> 
> PIG_INITIAL_ADDRESS=localhost
> PIG_HOME=/usr/local/src/pig-0.8.1
> PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner
> PIG_RPC_PORT=9160
> CASSANDRA_HOME=/usr/local/src/apache-cassandra-0.8.0-src
> 
> I then start up Cassandra with no issues.  I connect and create a new
> keyspace called foo with two column families, bar and foo.
> Inside the CF bar, I create 4 rows with random columns.
> 
> From contrib/pig I run bin/pig_cassandra -x local and immediately
> get the error:
> 
> [: 45: /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar: unexpected operator
> 
> -- this is a reference to this line:  if [ ! -e $PIG_JAR ]; then
> 
> *** The problem is that $PIG_JAR expands to two files,
> pig-0.8.1-core.jar and pig.jar ...
> 
> Changing line 44 to PIG_JAR=$PIG_HOME/pig*core*.jar fixes this (as does
> referencing $PIG_HOME/build/pig*core*.jar, or just pig.jar).
> 
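
The "unexpected operator" failure quoted above is what `[ ! -e ... ]` does when an unquoted glob expands to more than one file. A minimal sketch of a more defensive check follows. The helper name `first_match` is hypothetical and is not part of the shipped bin/pig_cassandra script; it simply returns the first existing path among its arguments.

```shell
#!/bin/sh
# Hypothetical helper (not from pig_cassandra): print the first argument
# that exists on disk. The caller's glob is expanded by the shell before
# the call, so each matching file arrives as its own argument.
first_match() {
    for candidate in "$@"; do
        if [ -e "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # Nothing matched: an unmatched glob stays literal and fails the -e test.
    return 1
}
```

It could be used as `PIG_JAR=$(first_match "$PIG_HOME"/pig*core*.jar "$PIG_HOME"/pig.jar)`, so that `$PIG_JAR` always names a single file no matter how many jars the build leaves in `$PIG_HOME`.
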
> I run bin/pig_cassandra -x local again and everything loads up nicely:
> 
> 2011-06-21 02:07:23,671 [main] INFO  org.apache.pig.Main - Logging
> error messages to:
> /usr/local/src/apache-cassandra-0.8.0-src/contrib/pig/pig_1308593243668.log
> 2011-06-21 02:07:23,778 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> Connecting to hadoop file system at: file:///
> grunt> register /usr/local/src/pig-0.8.1/pig-0.8.1-core.jar; register
> /usr/local/src/pig-0.8.1/pig.jar; register
> /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-fixes.jar;
> register /usr/local/src/apache-cassandra-0.8.0-src/lib/avro-1.4.0-sources-fixes.jar;
> register /usr/local/src/apache-cassandra-0.8.0-src/lib/libthrift-0.6.jar;
> grunt>
> grunt> rows = LOAD 'cassandra://foo/bar' USING CassandraStorage();
> grunt> STORE rows into 'cassandra://foo/foo' USING CassandraStorage();
> 2011-06-21 02:04:53,271 [main] INFO
> org.apache.pig.tools.pigstats.ScriptState - Pig features used in the
> script: UNKNOWN
> 2011-06-21 02:04:53,271 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> pig.usenewlogicalplan is set to true. New logical plan will be used.
> 2011-06-21 02:04:53,324 [main] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Initializing JVM Metrics
> with processName=JobTracker, sessionId=
> 2011-06-21 02:04:53,447 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.HExecutionEngine -
> (Name: rows: Store(cassandra://foo/foo:CassandraStorage) - scope-1
> Operator Key: scope-1)
> 2011-06-21 02:04:53,458 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler
> - File concatenation threshold: 100 optimistic? false
> 2011-06-21 02:04:53,477 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size before optimization: 1
> 2011-06-21 02:04:53,477 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer
> - MR plan size after optimization: 1
> 2011-06-21 02:04:53,480 [main] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:04:53,494 [main] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:04:53,494 [main] INFO
> org.apache.pig.tools.pigstats.ScriptState - Pig script settings are
> added to the job
> 2011-06-21 02:04:53,556 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - mapred.job.reduce.markreset.buffer.percent is not set, set to
> default 0.3
> 2011-06-21 02:04:59,700 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler
> - Setting up single store job
> 2011-06-21 02:04:59,718 [main] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:04:59,719 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 1 map-reduce job(s) waiting for submission.
> 2011-06-21 02:04:59,948 [Thread-5] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:04:59,960 [Thread-5] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:04:59,980 [Thread-5] INFO
> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
> input paths (combined) to process : 1
> 2011-06-21 02:05:00,220 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 0% complete
> 2011-06-21 02:05:00,322 [Thread-14] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:05:00,340 [Thread-14] INFO
> org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total
> input paths (combined) to process : 1
> 2011-06-21 02:05:00,372 [Thread-14] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:05:00,374 [Thread-14] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:05:00,378 [Thread-14] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:05:00,381 [Thread-14] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> 2011-06-21 02:05:00,491 [Thread-14] WARN
> org.apache.hadoop.mapred.LocalJobRunner - job_local_0001
> java.lang.NoClassDefFoundError: org/apache/cassandra/db/marshal/TypeParser
>        at org.apache.cassandra.hadoop.pig.CassandraStorage.getDefaultMarshallers(Unknown
> Source)
>        at org.apache.cassandra.hadoop.pig.CassandraStorage.columnToTuple(Unknown
> Source)
>        at org.apache.cassandra.hadoop.pig.CassandraStorage.getNext(Unknown
> Source)
>        at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:187)
>        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:423)
>        at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
>        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
>        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
>        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
>        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
> Caused by: java.lang.ClassNotFoundException:
> org.apache.cassandra.db.marshal.TypeParser
>        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
>        at java.security.AccessController.doPrivileged(Native Method)
>        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
>        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
>        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
>        ... 10 more
> 2011-06-21 02:05:00,818 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - HadoopJobId: job_local_0001
> 2011-06-21 02:05:05,408 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - job job_local_0001 has failed! Stop running all dependent jobs
> 2011-06-21 02:05:05,411 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - 100% complete
> 2011-06-21 02:05:05,412 [main] ERROR
> org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s)
> failed!
> 2011-06-21 02:05:05,412 [main] INFO
> org.apache.pig.tools.pigstats.PigStats - Detected Local mode. Stats
> reported below may be incomplete
> 2011-06-21 02:05:05,413 [main] INFO
> org.apache.pig.tools.pigstats.PigStats - Script Statistics:
> 
> HadoopVersion   PigVersion      UserId  StartedAt       FinishedAt      Features
> 0.20.2  0.8.1   root    2011-06-21 02:04:53     2011-06-21 02:05:05     UNKNOWN
> 
> Failed!
> 
> Failed Jobs:
> JobId   Alias   Feature Message Outputs
> job_local_0001  rows    MAP_ONLY        Message: Job failed!
> cassandra://foo/foo,
> 
> Input(s):
> Failed to read data from "cassandra://foo/bar"
> 
> Output(s):
> Failed to produce result in "cassandra://foo/foo"
> 
> Job DAG:
> job_local_0001
> 
> 
> 2011-06-21 02:05:05,413 [main] INFO
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
> - Failed!
> 2011-06-21 02:05:05,416 [main] INFO
> org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM
> Metrics with processName=JobTracker, sessionId= - already initialized
> grunt>
> 
> 
> Any help or insight is appreciated ....
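
For a NoClassDefFoundError like the one in the quoted trace, one quick check is whether any jar on hand actually contains the missing class. The sketch below is only an illustration: the helper names `class_to_path` and `find_class_jar` are hypothetical, and it assumes `unzip` is installed.

```shell
#!/bin/sh
# Convert a fully qualified class name to its entry path inside a jar.
class_to_path() {
    printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Hypothetical helper: list the jars under a directory that contain the class.
# Assumes unzip is available; jars that cannot be read are skipped silently.
find_class_jar() {
    entry=$(class_to_path "$1")
    find "$2" -name '*.jar' -type f | while IFS= read -r jar; do
        # grep -F treats the entry path as a fixed string, not a regex.
        if unzip -l "$jar" 2>/dev/null | grep -F -q "$entry"; then
            printf '%s\n' "$jar"
        fi
    done
}
```

For example, `find_class_jar org.apache.cassandra.db.marshal.TypeParser "$CASSANDRA_HOME"` prints each jar that contains the class. If it prints nothing, that would suggest the class is missing from the build itself rather than just from the classpath.
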

