cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1368) Add output support for Hadoop Streaming
Date Wed, 25 Aug 2010 21:09:17 GMT


Jonathan Ellis commented on CASSANDRA-1368:

bq. If the client might be using an alternate Avro schema, they can specify it using the OUTPUT_SCHEMA_KEY

Is this likely to come up in practice or can we get rid of it?

> Add output support for Hadoop Streaming
> ---------------------------------------
>                 Key: CASSANDRA-1368
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7 beta 2
>         Attachments: 0001-Switch-to-Cloudera-s-Distribution-of-Hadoop.patch, 0002-Add-an-Avro-OutputReader-and-Resolver-for-Hadoop-Str.patch,
0003-Apply-the-deprecated-OutputFormat-interface-to-allow.patch, 0004-Add-Streaming-example-shell-scripts.patch
> Hadoop Streaming is a framework that allows mapreduce jobs to be written in languages
other than Java, by performing simple IPC on stdin/stdout.
> Adding output support for Hadoop Streaming to Cassandra would mean that users could write
very simple scripts in dynamic languages to load data into Cassandra. Once our Hadoop OutputFormat
has stabilized a bit, we might also be able to use this code to provide scalable bulk loading.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
