incubator-cassandra-user mailing list archives

From John Lumby <johnlu...@hotmail.com>
Subject RE: cassandra hadoop reducer writing to CQL3 - primary key - must it be text type?
Date Wed, 09 Oct 2013 13:40:06 GMT
My original post got truncated for some reason, so let me try again:

    Software versions: apache-cassandra-2.0.1, hadoop-2.1.0-beta

I have been experimenting with using Hadoop for a map/reduce operation on Cassandra,
writing the output through CqlOutputFormat.class.
I based my first program closely on the WordCount example in
examples/hadoop_cql3_word_count,
except that I gave my output column family a bigint primary key:

CREATE TABLE archive_recordids ( recordid bigint , count_num bigint, PRIMARY KEY (recordid))

and simply tried setting this column as one of the keys in the output map:

         keys.put("recordid", ByteBufferUtil.bytes(recordid.longValue()));

but it always failed with this error:

java.io.IOException: InvalidRequestException(why:Key may not be empty)

After trying to make my program more similar to WordCount,
I eventually realized that the one remaining difference was the data type
of the primary key of the output column family:
WordCount uses text, whereas I had bigint.

I changed mine to text:

CREATE TABLE archive_recordids ( recordid text , count_num bigint, PRIMARY KEY (recordid))

and set the primary key *twice* in the reducer:
       keys.put("recordid", ByteBufferUtil.bytes(String.valueOf(recordid)));
       context.getConfiguration().set(PRIMARY_KEY,String.valueOf(recordid));

and it then worked perfectly.
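
For reference, here is a minimal sketch of how the two key encodings differ at the byte level. It uses only java.nio and assumes that ByteBufferUtil.bytes(long) yields the 8-byte big-endian form used for CQL bigint and that ByteBufferUtil.bytes(String) yields the UTF-8 bytes used for CQL text. The class and method names are hypothetical stand-ins, not part of Cassandra, and the sketch does not by itself explain the error above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not a Cassandra class: mimics the assumed behaviour
// of ByteBufferUtil.bytes(long) and ByteBufferUtil.bytes(String).
final class KeyEncodingSketch {

    // bigint form: 8 bytes, big-endian (ByteBuffer's default byte order).
    static ByteBuffer encodeBigint(long value) {
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putLong(0, value); // absolute put: position stays at 0, so 8 bytes remain readable
        return buf;
    }

    // text form: the UTF-8 bytes of the string.
    static ByteBuffer encodeText(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        long recordid = 42L; // illustrative value only
        System.out.println("bigint key bytes: " + encodeBigint(recordid).remaining());
        System.out.println("text key bytes:   " + encodeText(String.valueOf(recordid)).remaining());
    }
}
```

Printing remaining() on each buffer before putting it into the keys map is a cheap way to confirm that neither buffer is actually empty when it is handed to the output format.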

Is there a restriction in cassandra-hadoop-cql support that
the output colfamily's primary key(s) must be text?
And does that also apply to DELETE?
Or am I doing it wrong?
Or is there some other OutputFormat that I could use that would work?

Cheers, John