cassandra-user mailing list archives

From "Stu Hood" <>
Subject RE: Map/Reduce Cassandra Output
Date Mon, 19 Apr 2010 18:20:08 GMT
If you used that snippet of code as-is, every connection would go through the same seed
node. The input code does additional work to determine which nodes hold particular key
ranges, and then connects to those nodes directly.


For outputting from Hadoop to Cassandra, you may want to consider using a Java client like
Hector, which will handle the load balancing for you.
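
As a rough illustration of what client-side load balancing means here, the sketch below
rotates through a list of node addresses so that each reducer does not open every
connection against a single seed. It uses only the Java standard library. The class
name RoundRobinHosts, its constructor arguments, and the idea of seeding the rotation
with a task number are all assumptions made for this example; this is not Hector's API
and not code from Cassandra or Hadoop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of Cassandra, Hadoop, or Hector.
final class RoundRobinHosts {
    private final List<String> hosts;
    private final AtomicInteger counter;

    // startOffset lets each task begin at a different host (assumed here to be
    // something like the reducer's task number).
    RoundRobinHosts(List<String> hosts, int startOffset) {
        if (hosts == null || hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<String>(hosts); // defensive copy
        this.counter = new AtomicInteger(Math.floorMod(startOffset, hosts.size()));
    }

    // Returns the next host in rotation; safe to call from multiple threads.
    String next() {
        int i = counter.getAndUpdate(n -> (n + 1) % hosts.size());
        return hosts.get(i);
    }
}
```

A reducer could build one RoundRobinHosts from a configured list of node addresses and
call next() each time it opens a connection. A real client would also need to handle
failed nodes and retries, which this sketch leaves out.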


-----Original Message-----
From: "Sonny Heer" <>
Sent: Monday, April 19, 2010 11:29am
Subject: Map/Reduce Cassandra Output

Unlike the wordcount example, my input source is a directory, and I
have a split class and a record reader defined.

Also unlike wordcount, I need to insert into Cassandra during the
reduce step.  I notice that for its input, wordcount obtains a handle
on a Cassandra client like this:

        TSocket socket = new
        TBinaryProtocol binaryProtocol = new TBinaryProtocol(socket, false, false);
        Cassandra.Client client = new Cassandra.Client(binaryProtocol);

Would all Hadoop nodes go to the same seed if I use this code to
insert data, with no load balancing?  Has this already been done
somewhere in the Cassandra code?
