incubator-cassandra-user mailing list archives

From mck <...@apache.org>
Subject (newbie) ColumnFamilyOutputFormat only writes one column (per key)
Date Sun, 21 Nov 2010 18:43:26 GMT
(I'm new here so forgive any mistakes or mis-presumptions...)

I've set up a cassandra-0.7.0-beta3 and populated it with
thrift-serialised objects via a scribe server. This seems a great way to
get thrift beans out of the application asap and have them sitting in
cassandra for later processing.

I then went to write a MapReduce job that deserialises the thrift objects
and aggregates the data into a new column family. But what I've found is
that ColumnFamilyOutputFormat will only write out one column per key.

Alex Burkoff also reported this nearly two months ago, but nobody ever
replied...
 http://article.gmane.org/gmane.comp.db.cassandra.user/9325

Has anyone any ideas?
Should it be possible to write multiple columns out?

This is very easy to reproduce. Use the contrib/wordcount example, with
OUTPUT_REDUCER=cassandra and in WordCount.java add at line 132

>              results.add(getMutation(key, sum));
> +            results.add(getMutation(new Text("doubled"), sum*2));

Only the last mutation for any key seems to be written.
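
To make the symptom concrete, here is a plain-Java model of what I see
versus what I expected. It is only a sketch and does not use any
Cassandra or Hadoop classes: the class name MutationBatchSketch, both
method names, and the "row key / column" string pairs are hypothetical,
and I am only assuming (not asserting) that something in the write path
behaves like the first method.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not Cassandra code: each write is {rowKey, column}.
public class MutationBatchSketch {

    // Models the observed symptom: one slot per row key, so a later
    // write for the same key replaces the earlier one.
    static Map<String, String> lastWriteWins(String[][] writes) {
        Map<String, String> batch = new LinkedHashMap<>();
        for (String[] w : writes) {
            batch.put(w[0], w[1]);
        }
        return batch;
    }

    // Models the expected behaviour: every column written for a row key
    // is kept, in the order it was written.
    static Map<String, List<String>> accumulate(String[][] writes) {
        Map<String, List<String>> batch = new LinkedHashMap<>();
        for (String[] w : writes) {
            batch.computeIfAbsent(w[0], k -> new ArrayList<>()).add(w[1]);
        }
        return batch;
    }

    public static void main(String[] args) {
        // Two columns for the same row key, as in the wordcount change above.
        String[][] writes = { {"row1", "word=3"}, {"row1", "doubled=6"} };
        System.out.println("observed: " + lastWriteWins(writes));
        System.out.println("expected: " + accumulate(writes));
    }
}
```

With the two writes in main, the first method keeps only the "doubled"
column for row1, while the second keeps both.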


~mck

-- 
echo '[q]sa[ln0=aln256%
Pln256/snlbx]sb3135071790101768542287578439snlbxq'|dc 

| www.semb.wever.org | www.sesat.no 
| www.finn.no        | http://xss-http-filter.sf.net

