cassandra-commits mailing list archives

From "Jonathan Ellis (JIRA)" <>
Subject [jira] Commented: (CASSANDRA-1368) Add output support for Hadoop Streaming
Date Tue, 24 Aug 2010 22:23:17 GMT


Jonathan Ellis commented on CASSANDRA-1368:

Or, even simpler: allow specifying separator characters for rows and columns (if I am not
mistaken, this is what regular Hadoop Streaming does).
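
To illustrate the separator idea, here is a minimal sketch of how separator-delimited script output could be turned into rows and columns. The function name `parse_rows`, the default separators, and the record layout (row key first, then alternating column name and value) are assumptions made for this example; they are not taken from the attached patches.

```python
def parse_rows(stream_text, row_sep="\n", col_sep="\t"):
    """Split separator-delimited text into {row_key: {column_name: value}}.

    Hypothetical layout: each record is a row key followed by alternating
    column names and values, all joined by col_sep.
    """
    rows = {}
    for record in stream_text.split(row_sep):
        if not record:
            continue  # skip empty records, e.g. after a trailing separator
        key, *fields = record.split(col_sep)
        if len(fields) % 2 != 0:
            raise ValueError(f"row {key!r} has an unpaired column field")
        # Fields alternate name, value, name, value, ...
        rows.setdefault(key, {}).update(zip(fields[0::2], fields[1::2]))
    return rows
```

Because both separators are parameters, a script can emit whatever delimiters suit its data, for example `parse_rows(text, row_sep=";", col_sep=",")`.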

> Add output support for Hadoop Streaming
> ---------------------------------------
>                 Key: CASSANDRA-1368
>                 URL:
>             Project: Cassandra
>          Issue Type: New Feature
>          Components: Hadoop
>            Reporter: Stu Hood
>             Fix For: 0.7 beta 2
>         Attachments: 0001-Switch-to-Cloudera-s-Distribution-of-Hadoop.patch, 0002-Add-an-Avro-OutputReader-and-Resolver-for-Hadoop-Str.patch,
0003-Apply-the-deprecated-OutputFormat-interface-to-allow.patch, 0004-Add-Streaming-example-shell-scripts.patch
> Hadoop Streaming is a framework that allows mapreduce jobs to be written in languages
other than Java, by performing simple IPC on stdin/stdout.
> Adding output support for Hadoop Streaming to Cassandra would mean that users could write
very simple scripts in dynamic languages to load data into Cassandra. Once our Hadoop OutputFormat
has stabilized a bit, we might also be able to use this code to provide scalable bulk loading.
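
As a rough illustration of the stdin/stdout IPC described above, a streaming mapper can be as small as the sketch below. It reads lines from stdin and writes tab-separated key/value pairs to stdout, tab being the default field separator in Hadoop Streaming. The word-count logic and the name `map_stream` are assumptions chosen for the example, and nothing here is specific to the Cassandra output support proposed in this issue.

```python
import sys


def map_stream(lines):
    """Yield one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


if __name__ == "__main__":
    # Hadoop Streaming feeds input records on stdin and collects
    # the mapper's output records from stdout.
    for record in map_stream(sys.stdin):
        print(record)
```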

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
