hive-user mailing list archives

From Eric Djatsa
Subject Re: Recommendations on moving to Hadoop/Hive with Cassandra + RDBMS
Date Tue, 30 Aug 2011 12:05:07 GMT
Hi Tharindu, try having a look at Brisk: it integrates Hadoop with Cassandra
and ships with Hive for SQL analysis. You can then install Sqoop on top of
Hadoop to enable data import/export between Hadoop and MySQL.
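
To make the hourly task you describe a bit more concrete, here is a rough
sketch of its logic in plain Python. It is only an illustration under my own
assumptions: the dicts and the list stand in for column families X and Y and
for the MySQL table, "aggregate" is taken to mean a per-column sum, and all
function and variable names are hypothetical. None of this is the Brisk, Hive
or Sqoop API.

```python
def aggregate_rows(rows, columns=("A", "B", "C")):
    """Sum the given columns over all rows, visiting row keys in sorted order.

    Returns (totals, last_key); last_key is None when there are no rows.
    Assumption: "aggregate" means a per-column sum.
    """
    totals = {c: 0 for c in columns}
    last_key = None
    for key in sorted(rows):
        for c in columns:
            # Treat a missing column as 0 so sparse rows are tolerated.
            totals[c] += rows[key].get(c, 0)
        last_key = key
    return totals, last_key


def run_hourly_job(cf_x, cf_y, checkpoint_table, run_id):
    """One run: read X, write the aggregate to Y, record the last row."""
    totals, last_key = aggregate_rows(cf_x)
    cf_y[run_id] = totals  # stand-in for the write to column family Y
    # Stand-in for the insert into the MySQL table.
    checkpoint_table.append({"run_id": run_id, "last_row": last_key})
    return totals, last_key


# Hypothetical sample data standing in for column family X.
cf_x = {
    "row1": {"A": 1, "B": 2, "C": 3},
    "row2": {"A": 4, "B": 5, "C": 6},
    "row3": {"A": 7, "B": 8, "C": 9},
}
cf_y = {}
checkpoints = []
run_hourly_job(cf_x, cf_y, checkpoints, "2011-08-30T12")
```

With the setup above, the aggregation step would presumably become a Hive
query over the Cassandra data, and the MySQL step an export through Sqoop.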
Does this sound OK to you?

2011/8/29 Tharindu Mathew wrote:

> Hi,
> I have an already running system where I define a simple data flow (using a
> simple custom data flow language) and configure jobs to run against stored
> data. I use Quartz to schedule and run these jobs, and the data lives in
> various data stores (mainly Cassandra, but some data is in an RDBMS such as
> MySQL as well).
> Thinking about scalability and already existing support for standard data
> flow languages in the form of Pig and HiveQL, I plan to move my system to
> Hadoop.
> I've seen some efforts on the integration of Cassandra and Hadoop. I've
> been reading up and am still contemplating how to make this change.
> It would be great to hear the recommended approach of doing this on Hadoop
> with the integration of Cassandra and other RDBMS. For example, a sample
> task that already runs on the system is "once every hour, get rows from
> column family X, aggregate data in columns A, B and C, write it back to
> column family Y, and enter details of the last aggregated row into a table
> in MySQL".
> Thanks in advance.
> --
> Regards,
> Tharindu

*Eric Djatsa Yota*
*Double degree MSc Student in Computer Science Engineering and Communication
Télécom ParisTech (FRANCE) - Politecnico di Torino (ITALY)*
*Intern at AMADEUS S.A.S Sophia Antipolis*
*Tel : 0601791859*
