cassandra-user mailing list archives

From Jonathan Ellis <jbel...@gmail.com>
Subject Re: Ingesting from Hadoop to Cassandra
Date Thu, 21 May 2009 14:44:59 GMT
Have you benchmarked the batch insert APIs?  If they are "fast enough",
then that is by far the simplest way to go.
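
A minimal sketch of such a benchmark, assuming rows arrive as an
iterable and are pushed in fixed-size batches. The names batches,
benchmark and fake_send_batch are hypothetical and are not part of any
Cassandra client; fake_send_batch is a placeholder to be replaced with
a real call to whichever batch insert API is being measured.

```python
import time
from itertools import islice


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def benchmark(rows, send_batch, batch_size):
    """Send all rows in batches; return (rows_sent, rows_per_second)."""
    sent = 0
    start = time.perf_counter()
    for chunk in batches(rows, batch_size):
        send_batch(chunk)
        sent += len(chunk)
    elapsed = time.perf_counter() - start
    rate = sent / elapsed if elapsed > 0 else float("inf")
    return sent, rate


def fake_send_batch(chunk):
    # Hypothetical stand-in: swap in the real client batch insert call here.
    pass


if __name__ == "__main__":
    # Synthetic rows: (key, {column: value}) pairs, sized only for illustration.
    rows = [("row%d" % i, {"col%d" % c: "v" for c in range(10)})
            for i in range(10000)]
    for size in (10, 100, 1000):
        sent, rate = benchmark(rows, fake_send_batch, size)
        print("batch_size=%d rows=%d rows/sec=%.0f" % (size, sent, rate))
```

Trying a few batch sizes against a real cluster, with rows shaped like
the real data, gives a throughput figure to compare against what the
job needs.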

Otherwise you'll have to use the BinaryMemtable code, which is
undocumented and not exposed as a client API (I think you basically
write a custom "loader" version of Cassandra to use it).  Facebook used
this for their own bulk loading, so it works at some level, but clearly
there is some assembly required.

-Jonathan

On Thu, May 21, 2009 at 2:28 AM, Alexandre Linares <linares@ymail.com> wrote:
> Hi all,
>
> I'm trying to find the best way to ingest my content from Hadoop into
> Cassandra.  Assuming I have figured out the table representation for this
> content, what is the best way to go about pushing it from my cluster?  Which
> Cassandra client batch APIs do you suggest I use to push to Cassandra?  I'm
> sure this is a common pattern, and I'm curious to see how it has been
> implemented.  Assume millions of rows and 1000s of columns.
>
> Thanks in advance,
> -Alex
>
>
