cassandra-user mailing list archives

From Mubarak Seyed <>
Subject Re: CassandraBulkLoader
Date Tue, 13 Jul 2010 17:31:52 GMT
Thanks Torsten.

Jonathan's "Fact vs. Fiction" blog post says:

Fact: It has always been straightforward to send the output of Hadoop jobs
to Cassandra, and Facebook, Digg, and others have been using Hadoop like
this as a Cassandra bulk-loader for over a year.

Could anyone from Facebook or Digg share details on how they use Hadoop as a
Cassandra bulk loader?

I found some details in Arin's presentation on Cassandra @ Digg about
loading data from MySQL -> Hadoop -> Cassandra.

Can someone please help me?


On Tue, Jul 13, 2010 at 1:27 AM, Torsten Curdt <> wrote:

> On Tue, Jul 13, 2010 at 04:35, Mubarak Seyed <> wrote:
> > Where can I find the documentation for BinaryMemTable (bmt_example in
> > contrib) to use CassandraBulkLoader? What is the input to be supplied to
> > CassandraBulkLoader? How do I form the input data, and what is its
> > format?
>
> The code is the documentation, I fear.
> I'll see if I get permission to get our updated code contributed.
> We added command line fu and are using it to import large TSVs.
>
> > Do I need HDFS to store my storage-conf.xml?
>
> Why HDFS?
> The machine running the bulk loader joins the Cassandra ring kind of
> like a temporary node.
> So you will need the storage-conf.xml on that machine.
>
> cheers
> --
> Torsten

Mubarak Seyed.
