cassandra-commits mailing list archives

From: "Jonathan Ellis (JIRA)"
Subject: [jira] Commented: (CASSANDRA-913) Add Hive support
Date: Tue, 30 Mar 2010 15:45:27 GMT


Jonathan Ellis commented on CASSANDRA-913:

Starting points:

The Cassandra inputformat for Hadoop is in org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
the record reader and input split are in the same package.  There's an example of using these
in contrib/word_count, and Pig integration in contrib/pig.
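
For readers new to the Hadoop input model, the division of labour between an input format, its input splits, and its record reader can be sketched as below. This is a toy model in plain Python, not the Hadoop or Cassandra API: `ToyInputFormat`, `InputSplit`, `create_record_reader`, and `word_count` are hypothetical names, and the in-memory dict only stands in for a column family. The real classes are Java, and their splits are assumed to cover ranges of the ring rather than list indices.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

# Hypothetical stand-in for a column family: row key -> {column name: value}.
Row = Dict[str, str]


@dataclass(frozen=True)
class InputSplit:
    """A contiguous slice [start, end) of the sorted row keys."""
    start: int
    end: int


class ToyInputFormat:
    """Plays the input format role: decides how the data is divided up."""

    def __init__(self, rows: Dict[str, Row], split_size: int) -> None:
        if split_size < 1:
            raise ValueError("split_size must be at least 1")
        self.rows = rows
        self.keys = sorted(rows)
        self.split_size = split_size

    def get_splits(self) -> List[InputSplit]:
        # One split per chunk of keys; each split would feed one map task.
        return [
            InputSplit(i, min(i + self.split_size, len(self.keys)))
            for i in range(0, len(self.keys), self.split_size)
        ]

    def create_record_reader(self, split: InputSplit) -> Iterator[Tuple[str, Row]]:
        # Plays the record reader role: yields (key, columns) for one split only.
        for key in self.keys[split.start:split.end]:
            yield key, self.rows[key]


def word_count(input_format: ToyInputFormat, column: str) -> Dict[str, int]:
    """Count words in one column, visiting the data split by split."""
    counts: Dict[str, int] = {}
    for split in input_format.get_splits():
        for _key, columns in input_format.create_record_reader(split):
            for word in columns.get(column, "").split():
                counts[word] = counts.get(word, 0) + 1
    return counts
```

A Hive storage backend would presumably sit on top of the same three roles, which is why the input format is a natural place to start reading.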

You can look at the .7 patch to HIVE-705 to see how HBase support was added.  Unfortunately,
the patch is not split into "Hive infrastructure refactoring" and "HBase support"; the two
are mixed together.

> Add Hive support
> ----------------
>                 Key: CASSANDRA-913
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Contrib
>            Reporter: Jonathan Ellis
> Hive is a project that runs SQL queries against Hadoop map/reduce clusters.  (For
> analytics; it is too high-latency to run applications against Hive directly.)
> HIVE-705 added support for backends other than HDFS, with HBase as the first.
> Cassandra support should be doable now too.
> The Hive storage backends are described in
and the HBase backend specifically in
> I also note that John Sichi, author of the HBase backend, seems like a helpful guy and
I imagine would be totally cool with answering questions about implementation details.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
