incubator-cassandra-user mailing list archives

From Alex McLintock <a...@owal.co.uk>
Subject Reading Cassandra Data From Pig/Hadoop
Date Fri, 30 May 2014 16:50:16 GMT
I am reasonably experienced with Hadoop and Pig but less so with Cassandra.
I have been banging my head against the wall because all the documentation
assumes I already know something that I don't.

I am using Apache's tarball of Cassandra 1.something and I see that it
includes some example Pig scripts and a shell script to run them with the
Cassandra jars.

What I don't understand is how you tell the Pig script which Cassandra
cluster to talk to. You only specify the keyspace, right? That roughly
corresponds to the database/table, but it doesn't say which cluster.

Can you tell what I have missed? Do the Hadoop nodes HAVE to be on the
same machines as the Cassandra nodes?

I am using CqlStorage, I think.

e.g.

-- CqlStorage
libdata = LOAD 'cql://libdata/libout' USING CqlStorage();
book_by_mail = FILTER libdata BY C_OUT_TY == 'BM';
etc etc


Thanks all...
