COTS and open-source ETL tools exist to do this (Talend, Pentaho, CloverETL, etc.).
With those, you should be able to do this without writing any code.

All of these tools can read from a SQL database, so you just need to push the data into Cassandra. Many of the ETL tools support web services, which is why I suggested a REST layer for Cassandra might be handy: using the ETL tool, you could push the data into Cassandra as JSON over REST. (If you want, give Virgil a try.)
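To make the JSON-over-REST idea concrete, here is a minimal Python sketch of what the ETL tool would be doing on your behalf. The URL layout (base URL / keyspace / column family / row key) and the use of PUT are placeholders I made up for illustration, not Virgil's documented API, so check your REST layer's docs for the real paths.

```python
import json
import urllib.request


def build_put(base_url, keyspace, column_family, key_column, row):
    """Build (but do not send) a PUT request carrying one row as JSON.

    The URL scheme here is hypothetical; adapt it to your REST layer.
    """
    url = f"{base_url}/{keyspace}/{column_family}/{row[key_column]}"
    body = json.dumps(row).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def push_rows(rows, base_url, keyspace, column_family, key_column):
    """Send each row dict to the REST endpoint, one request per row.

    urlopen raises urllib.error.HTTPError on a 4xx/5xx response.
    Returns the number of rows sent.
    """
    sent = 0
    for row in rows:
        req = build_put(base_url, keyspace, column_family, key_column, row)
        with urllib.request.urlopen(req) as resp:
            resp.read()  # drain the response body
        sent += 1
    return sent
```

One request per row is the simplest thing that works; for large tables you would want whatever batching the REST layer offers.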

I haven't tried it, but you might also be able to coax the ETL tools into using CQL.

Some of the ETL tools are Map/Reduce friendly (more or less) and can distribute the job over a cluster. But if you have a lot of data, you may also want to look at Pig and/or Map/Reduce directly. If you stage the CSV/JSON file on HDFS, then a simple Map/Reduce job can load the data directly into Cassandra (using ColumnFamilyOutputFormat).

We are solving this problem right now, so I'll report back.


Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024

From: Maxim Potekhin <>
Organization: Brookhaven National Laboratory
Reply-To: <>
Date: Tue, 01 Nov 2011 14:18:00 -0400
To: <>
Subject: Re: Tool for SQL -> Cassandra data movement

Just a short comment -- we are going the CSV way as well because of its compactness and extreme portability.
The CSV files are kept in the cloud as backup, and they can also find other uses. JSON would work as well, but
it would be at least twice as large.
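For anyone who wants to check the size difference on their own data, here is a small Python sketch that serializes the same rows both ways and compares byte counts. The column names and values are invented for illustration; the actual ratio depends on how long your column names are relative to the values, since JSON repeats every name in every row while CSV states them once in the header.

```python
import csv
import io
import json


def csv_size(rows, columns):
    """Return the UTF-8 size in bytes of the rows written as CSV with a header."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row[c] for c in columns])
    return len(buf.getvalue().encode("utf-8"))


def json_lines_size(rows):
    """Return the UTF-8 size in bytes of the rows written as one JSON object per line."""
    return sum(len(json.dumps(row).encode("utf-8")) + 1 for row in rows)  # +1 for newline


# Made-up sample data, purely for illustration.
columns = ["id", "first_name", "last_name", "zip"]
rows = [
    {"id": i, "first_name": "Ann", "last_name": "Lee", "zip": "19406"}
    for i in range(1000)
]

print("CSV bytes: ", csv_size(rows, columns))
print("JSON bytes:", json_lines_size(rows))
```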


On 9/22/2011 1:25 PM, Nehal Mehta wrote:
We are trying to do the same thing, but instead of migrating via JSON, we are exporting to CSV and then importing the CSV into Cassandra.  Which DB are you currently using?

Nehal Mehta.

2011/9/22 Radim Kolar <>
I need a tool that can dump tables via JDBC into JSON format for Cassandra import. I am pretty sure somebody has already written one.

Are there tools that can do a direct JDBC -> Cassandra import?
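For reference, the dump half of this is small enough to sketch by hand. The Python below uses the standard DB-API with sqlite3 as a stand-in for a JDBC source (an assumption for the sake of a runnable example), and writes one JSON object per row. It does not try to reproduce whatever JSON layout Cassandra's own import tooling expects, so treat the output as an intermediate format only.

```python
import json
import sqlite3


def dump_table(conn, table, out):
    """Write every row of `table` to `out` as one JSON object per line.

    `conn` is any DB-API connection; `out` is any object with a write() method.
    Returns the number of rows written.
    """
    cur = conn.cursor()
    # The table name is interpolated directly, so it must come from a trusted source.
    cur.execute(f"SELECT * FROM {table}")
    columns = [d[0] for d in cur.description]
    count = 0
    for row in cur:
        # default=str falls back to str() for values json cannot encode natively
        # (dates, decimals and the like).
        out.write(json.dumps(dict(zip(columns, row)), default=str))
        out.write("\n")
        count += 1
    return count
```

Usage would be along the lines of `with open("users.json", "w") as f: dump_table(conn, "users", f)`, where the file name and table name are whatever fits your schema.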