cassandra-user mailing list archives

From Benoit Mathieu
Subject hadoop map join with ColumnFamilyInputFormat
Date Thu, 01 Mar 2012 10:45:09 GMT
Hi all,

I want to write a MapReduce job whose Map task takes its data from 2
CFs. Those 2 CFs have the same row keys and are in the same keyspace, so
they are partitioned the same way across my cluster, and it would be
nice if the Map task could read both column families locally.

In the Hadoop package org.apache.hadoop.mapred.join, there is a
CompositeInputFormat class, which seems to do what I want, but it
seems to be tied to HDFS files, as its "compose" method takes "Path" args.
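
To make the goal concrete, here is a minimal sketch in plain Java (no
Hadoop or Cassandra classes) of the kind of map-side inner join I have in
mind. Everything in it is hypothetical and for illustration only: the
class MapSideJoinSketch, the Joined holder and the innerJoin method are
made-up names, and the two TreeMaps merely stand in for the two CFs,
assuming both can be iterated in the same row-key order.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration: not part of Hadoop or Cassandra.
public class MapSideJoinSketch {

    /** One joined record: the shared row key plus the value from each source. */
    public static final class Joined {
        public final String key;
        public final String left;
        public final String right;

        Joined(String key, String left, String right) {
            this.key = key;
            this.left = left;
            this.right = right;
        }

        @Override
        public String toString() {
            return key + " -> (" + left + ", " + right + ")";
        }
    }

    /** Returns the next entry of the iterator, or null when it is exhausted. */
    private static Map.Entry<String, String> next(Iterator<Map.Entry<String, String>> it) {
        return it.hasNext() ? it.next() : null;
    }

    /**
     * Merge-joins two sources sorted by row key and keeps only the keys
     * present in both (an inner join). Each source is read exactly once.
     */
    public static List<Joined> innerJoin(TreeMap<String, String> cfA,
                                         TreeMap<String, String> cfB) {
        List<Joined> out = new ArrayList<>();
        Iterator<Map.Entry<String, String>> itA = cfA.entrySet().iterator();
        Iterator<Map.Entry<String, String>> itB = cfB.entrySet().iterator();
        Map.Entry<String, String> a = next(itA);
        Map.Entry<String, String> b = next(itB);

        while (a != null && b != null) {
            int cmp = a.getKey().compareTo(b.getKey());
            if (cmp == 0) {
                // Same row key on both sides: emit the pair and advance both.
                out.add(new Joined(a.getKey(), a.getValue(), b.getValue()));
                a = next(itA);
                b = next(itB);
            } else if (cmp < 0) {
                a = next(itA); // key only in the first source, skip it
            } else {
                b = next(itB); // key only in the second source, skip it
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TreeMap<String, String> cfA = new TreeMap<>();
        TreeMap<String, String> cfB = new TreeMap<>();
        cfA.put("row1", "a1");
        cfA.put("row2", "a2");
        cfB.put("row2", "b2");
        cfB.put("row3", "b3");
        for (Joined j : innerJoin(cfA, cfB)) {
            System.out.println(j);
        }
    }
}
```

In the real job, the two sorted sources would be the record readers for
the two CFs over the same token range, so that each Map task only joins
rows stored on its own node.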

Has anyone ever written a CompositeColumnFamilyInputFormat, or
does anyone have any insight about it?