cassandra-commits mailing list archives

From "Radovan Zvoncek (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-8367) Clash between Cassandra and Crunch mapreduce config
Date Mon, 24 Nov 2014 14:03:12 GMT
Radovan Zvoncek created CASSANDRA-8367:
------------------------------------------

             Summary: Clash between Cassandra and Crunch mapreduce config
                 Key: CASSANDRA-8367
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8367
             Project: Cassandra
          Issue Type: Bug
          Components: Hadoop
            Reporter: Radovan Zvoncek
            Priority: Minor


We would like to use Cassandra's (Cql)BulkOutputFormats to implement Resource IOs for Crunch.
We want to do this to allow Crunch users to write the results of their jobs directly to Cassandra
(thus skipping writing them to the file system).

In the process of doing this, we found a clash in the mapreduce job config. The
affected config key is 'mapreduce.output.basename'. Cassandra uses it [1] for a different
purpose than Crunch does [2]. This results in some obscure behavior that I personally don't
understand, but it causes the jobs to fail.

We went ahead and re-implemented the output format classes to use a different config key, but
we'd very much like to stop using these re-implemented classes.
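
To illustrate the kind of clash described above, here is a minimal sketch that
simulates a shared job config with a plain Java map. It does not use the Hadoop,
Cassandra or Crunch APIs. Only the key 'mapreduce.output.basename' comes from this
report; the class name, the second key and all values are hypothetical and chosen
purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigKeyClashSketch {
    // The key named in the report; both libraries store a value under it.
    static final String SHARED_KEY = "mapreduce.output.basename";
    // Hypothetical library-specific key, standing in for the workaround.
    static final String NAMESPACED_KEY = "example.cassandra.output.setting";

    // Simulates two libraries writing into one shared job config.
    // Library A writes under keyForLibraryA, library B always uses SHARED_KEY.
    static Map<String, String> configure(String keyForLibraryA) {
        Map<String, String> conf = new HashMap<>();
        conf.put(keyForLibraryA, "value-meant-for-library-a"); // hypothetical value
        conf.put(SHARED_KEY, "value-meant-for-library-b");     // hypothetical value
        return conf;
    }

    public static void main(String[] args) {
        // Same key: library B's value silently replaces library A's.
        Map<String, String> clashing = configure(SHARED_KEY);
        System.out.println("library A reads: " + clashing.get(SHARED_KEY));

        // Distinct key: library A's value survives.
        Map<String, String> separated = configure(NAMESPACED_KEY);
        System.out.println("library A reads: " + separated.get(NAMESPACED_KEY));
    }
}
```

With a shared key, whichever library writes last wins and the other one reads a
value it did not expect. Giving each library its own key, as in the workaround
described above, avoids the overwrite.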

[1] https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/hadoop/ConfigHelper.java#L54
[2] https://github.com/apache/crunch/blob/3f13ee65c9debcf6bd7366607f58beae6c73ffe2/crunch-core/src/main/java/org/apache/crunch/io/CrunchOutputs.java#L99




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
