cassandra-user mailing list archives

From Johan Oskarsson
Subject Re: Cassandra + Hadoop + BMT
Date Tue, 01 Sep 2009 19:48:09 GMT
I have slapped together a basic Hadoop 0.18 CassandraOutputFormat based 
on the code Chris put up.


conf.set(CassandraOutputFormat.CONF_COLUMN_FAMILY_NAME, "columnfamilyname");
conf.set(CassandraOutputFormat.CONF_KEYSPACE, "keyspacename");

DistributedCache.addCacheFile(new URI("uri_to_storage-conf.xml"), conf);

plus your job-specific settings.

Then, after the job has finished, call this method: CassandraOutputFormat.forceFlush
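
A note on the "uri_to_storage-conf.xml" placeholder in the settings above: DistributedCache.addCacheFile takes a java.net.URI, so one way to fill it in is to point at a copy of storage-conf.xml on HDFS. The rough sketch below only builds such a URI with the standard library. The host, port and path are made-up example values, and the StorageConfUri class is a hypothetical helper, not part of the code linked in this message. The "#storage-conf.xml" fragment assumes your Hadoop version uses the URI fragment as the local symlink name when cache symlinks are enabled, so check that before relying on it.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper, not part of CassandraOutputFormat or Hadoop.
final class StorageConfUri {

    // Builds hdfs://host:port/path#filename for a file to be placed in the
    // distributed cache. The fragment is set to the file's own name.
    static URI cacheUri(String host, int port, String path) throws URISyntaxException {
        // Last path segment, e.g. "storage-conf.xml".
        String fileName = path.substring(path.lastIndexOf('/') + 1);
        // Arguments: scheme, userInfo, host, port, path, query, fragment.
        return new URI("hdfs", null, host, port, path, null, fileName);
    }

    public static void main(String[] args) throws URISyntaxException {
        // Example values only; substitute your own namenode and path.
        URI uri = cacheUri("namenode.example.com", 9000, "/cassandra/storage-conf.xml");
        System.out.println(uri);
    }
}
```

The resulting URI is what you would pass as the first argument to DistributedCache.addCacheFile in place of new URI("uri_to_storage-conf.xml").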

Source code here:

Big thanks to Chris for figuring out the mystery that is BinaryMemtable.


Chris Goffinet wrote:
> Hi Guys
> This is long overdue, but I have posted a very rough example (with 
> Digg stuff removed) for getting BMT working with Cassandra. Patches are 
> coming next up for the JIRA tickets. I'll try to get a more generic 
> map/reduce job finished by end of the week that integrates Hive output.
> -Chris
