incubator-cassandra-user mailing list archives

From James Campbell <ja...@breachintelligence.com>
Subject BulkOutputFormat and CQL3
Date Tue, 22 Apr 2014 13:43:48 GMT
Hi Cassandra Users-

I have a Hadoop job that follows the pattern in Cassandra 2.0.6's hadoop_cql3_word_count example
to load data from HDFS into Cassandra.  Having read that BulkOutputFormat may significantly
increase write throughput from Hadoop to Cassandra, I am considering testing that approach
(http://www.datastax.com/dev/blog/improved-hadoop-output,
http://shareitexploreit.blogspot.com/2012/03/bulkloadto-cassandra-with-hadoop.html).

Is it possible/supported/recommended to use the BulkOutputFormat to load data from Hadoop
to a CQL3 table in Cassandra?

I see several examples of building composite keys using Hector (e.g. http://www.datastax.com/dev/blog/introduction-to-composite-columns-part-1,
http://brianoneill.blogspot.com/2012/09/composite-keys-connecting-dots-between.html ), but
the changes made to support CQL3 have left a lot of documentation out there for different
versions, so it's not clear to me what the "proper" way is to build the (ByteBuffer,
List<Mutation>) pairs that ColumnFamilyOutputFormat (and so BulkOutputFormat) needs.

James
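
For context on the ByteBuffer question above, here is a minimal sketch of how a composite column name can be packed by hand, using only java.nio. It assumes the CompositeType layout (for each component: a 2-byte big-endian length, the value bytes, then a single end-of-component byte of 0), and it assumes that a non-compact CQL3 table stores each cell under a composite of the clustering column values followed by the CQL column name. The class name, helper names, and the example table are hypothetical, and this is not the word_count example's own code; it only illustrates the byte layout that the Thrift-style Mutation column names would need to follow.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CompositeNameSketch {

    // Packs the components using the assumed CompositeType layout:
    // [2-byte big-endian length][value bytes][1-byte end-of-component = 0]
    public static ByteBuffer compose(ByteBuffer... components) {
        int size = 0;
        for (ByteBuffer c : components) {
            size += 2 + c.remaining() + 1;
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        for (ByteBuffer c : components) {
            out.putShort((short) c.remaining());
            out.put(c.duplicate()); // duplicate so the caller's buffer position is untouched
            out.put((byte) 0);      // end-of-component marker
        }
        out.flip(); // make the buffer readable from the start
        return out;
    }

    // Convenience helper: UTF-8 encode a string into a ByteBuffer.
    public static ByteBuffer utf8(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // Hypothetical table: PRIMARY KEY (word, source) with a regular column "count".
        // The partition key "word" would be the ByteBuffer row key; the cell name
        // would then be the composite (source value, "count").
        ByteBuffer cellName = compose(utf8("book1.txt"), utf8("count"));
        System.out.println("cell name length in bytes: " + cellName.remaining());
    }
}
```

The resulting buffer would be used as the column name inside each Mutation, with the partition key serialized separately as the ByteBuffer row key. Whether BulkOutputFormat accepts mutations built this way for a given CQL3 schema should be checked against the Cassandra version in use.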




