cassandra-commits mailing list archives

From "Stu Hood (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1368) Add output support for Hadoop Streaming
Date Wed, 25 Aug 2010 22:06:17 GMT


Stu Hood commented on CASSANDRA-1368:

> Is this likely to come up in practice or can we get rid of it?
Ack... I don't think it is actually implemented in this patch yet. Without adding it, changing
the Avro client API will break Hadoop Streaming clients.

I should fix that before we commit.

> Add output support for Hadoop Streaming
> ---------------------------------------
>                 Key: CASSANDRA-1368
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7 beta 2
>         Attachments: 0001-Switch-to-Cloudera-s-Distribution-of-Hadoop.patch, 0002-Add-an-Avro-OutputReader-and-Resolver-for-Hadoop-Str.patch,
0003-Apply-the-deprecated-OutputFormat-interface-to-allow.patch, 0004-Add-Streaming-example-shell-scripts.patch
> Hadoop Streaming is a framework that allows mapreduce jobs to be written in languages
other than Java, by performing simple IPC on stdin/stdout.
> Adding output support for Hadoop Streaming to Cassandra would mean that users could write
very simple scripts in dynamic languages to load data into Cassandra. Once our Hadoop OutputFormat
has stabilized a bit, we might also be able to use this code to provide scalable bulk loading.
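
For readers unfamiliar with the stdin/stdout IPC described above, here is a minimal sketch of a Hadoop Streaming mapper in Python. It assumes Streaming's default text protocol (one record per line, key and value separated by a tab), not the Avro-based output that the attached patches add for Cassandra. The function names `map_lines` and `format_record` are hypothetical and used only for illustration.

```python
import sys
from typing import Iterable, Iterator, Tuple


def map_lines(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (word, 1) for every whitespace-separated word in the input."""
    for line in lines:
        for word in line.split():
            yield word, 1


def format_record(key: str, value: int) -> str:
    # Assumption: the default Streaming text protocol, i.e.
    # key, a tab character, value, then a newline.
    return f"{key}\t{value}\n"


def main(stdin=sys.stdin, stdout=sys.stdout) -> None:
    # The framework feeds input records on stdin and reads the
    # mapper's output records from stdout.
    for key, value in map_lines(stdin):
        stdout.write(format_record(key, value))


if __name__ == "__main__":
    main()
```

A script like this would typically be passed to the Streaming jar through its `-mapper` option. Writing the results into Cassandra would additionally depend on the output support proposed in this ticket.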

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
