incubator-cassandra-user mailing list archives

From Sonny Heer <sonnyh...@gmail.com>
Subject Re: Map/Reduce Cassandra Output
Date Mon, 19 Apr 2010 21:38:12 GMT
Thanks Stu.  I will take a look at Hector.  Do you know where the
input code does the additional work?



On Mon, Apr 19, 2010 at 11:20 AM, Stu Hood <stu.hood@rackspace.com> wrote:
> If you used that snippet of code, all connections would go through the same seed: the
> input code does additional work to determine which nodes are holding particular key ranges,
> and then connects directly.
>
> ----
>
> For outputting from Hadoop to Cassandra, you may want to consider using a Java client
> like Hector, which will handle the load balancing for you.
>
> http://github.com/rantav/hector
>
> Thanks,
> Stu
>
> -----Original Message-----
> From: "Sonny Heer" <sonnyheer@gmail.com>
> Sent: Monday, April 19, 2010 11:29am
> To: cassandra-user@incubator.apache.org
> Subject: Map/Reduce Cassandra Output
>
> Unlike the wordcount example, my input source is a directory, and I
> have a split class and a record reader defined.
>
> Also unlike wordcount, I need to insert into Cassandra during the
> reduce step.  I notice that the wordcount input code gets a handle on
> a Cassandra client like this:
>
>        TSocket socket = new TSocket(
>                DatabaseDescriptor.getSeeds().iterator().next().getHostAddress(),
>                DatabaseDescriptor.getThriftPort());
>        TBinaryProtocol binaryProtocol = new TBinaryProtocol(socket, false, false);
>        Cassandra.Client client = new Cassandra.Client(binaryProtocol);
>
> Would all Hadoop nodes go to the same seed if I use this code to
> insert data, with no load balancing?  Has this already been done
> somewhere in the Cassandra code?
>
>
>
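
To illustrate the concern in the thread (every reducer connecting to the first seed), here is a minimal sketch of rotating over a list of node addresses. It is not Hector's API and not what the Cassandra input code does: HostRotation is a hypothetical helper, the addresses are placeholders, and the rotation is plain round-robin with no knowledge of key ranges. The value returned by next() would stand in for the seed address passed to TSocket in the quoted snippet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out node addresses in rotation. */
final class HostRotation {
    private final List<String> hosts;
    private final AtomicInteger cursor;

    /**
     * @param hosts       candidate node addresses (assumed to be known up front)
     * @param startOffset where to start in the list, e.g. a reducer's task
     *                    number, so that different tasks start on different nodes
     */
    HostRotation(List<String> hosts, int startOffset) {
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("at least one host is required");
        }
        this.hosts = new ArrayList<>(hosts);
        // floorMod keeps the index in range even for negative offsets
        this.cursor = new AtomicInteger(Math.floorMod(startOffset, hosts.size()));
    }

    /** Returns the current host and advances to the next one, wrapping around. */
    String next() {
        int index = cursor.getAndUpdate(c -> (c + 1) % hosts.size());
        return hosts.get(index);
    }

    public static void main(String[] args) {
        // Placeholder addresses, not taken from the thread
        List<String> nodes = Arrays.asList("10.0.0.1", "10.0.0.2", "10.0.0.3");
        HostRotation rotation = new HostRotation(nodes, 4);
        for (int i = 0; i < 4; i++) {
            System.out.println(rotation.next());
        }
    }
}
```

This only spreads connections evenly. It does not send each write to a node that owns the key, which is the additional work Stu describes for the input side.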
