cassandra-commits mailing list archives

From "Robbie Strickland (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-4208) ColumnFamilyOutputFormat should support writing to multiple column families
Date Thu, 03 May 2012 15:24:49 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robbie Strickland updated CASSANDRA-4208:
-----------------------------------------

    Attachment: trunk-4208-v2.txt

I've attached a patch that adds support for MultipleOutputs. Hadoop trunk now contains a new version
of MultipleOutputs that should support this out of the box, although I am also submitting a Hadoop
patch to fix an inconsistency that could cause future problems with non-file output formats.

The basic solution involves changing the config key for output CF to match the "basename"
key being written by MultipleOutputs. I had to make related changes to CassandraStorage and
TestRingCache, as well as some minor changes to ColumnFamilyInputFormat to account for some
interface changes in Hadoop trunk.
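
To illustrate the idea, here is a minimal sketch in plain Java. It is not the patch itself and uses no Hadoop or Cassandra classes: MultiCfSketch, Outputs, Writer, configFor and the key name "example.output.basename" are all hypothetical stand-ins. It only shows the shape of the approach, assuming the multiple-outputs layer stores the named output under a "basename" key and the writer reads its column family from that same key.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, stdlib-only model; none of these types are Hadoop or Cassandra classes.
class MultiCfSketch {
    // Placeholder for the "basename" key; the real key name is not given in this message.
    static final String BASENAME_KEY = "example.output.basename";

    // Stand-in for a per-output copy of the job configuration with the basename set.
    static Map<String, String> configFor(Map<String, String> jobConf, String namedOutput) {
        Map<String, String> copy = new HashMap<>(jobConf);
        copy.put(BASENAME_KEY, namedOutput);
        return copy;
    }

    // Stand-in for a record writer that resolves its column family from the basename key.
    static final class Writer {
        final String columnFamily;
        final List<String> rows = new ArrayList<>();

        Writer(Map<String, String> conf) {
            String cf = conf.get(BASENAME_KEY);
            if (cf == null) {
                throw new IllegalStateException("no output column family configured");
            }
            this.columnFamily = cf;
        }

        void write(String key, String value) {
            rows.add(key + "=" + value);
        }
    }

    // Stand-in for the multiple-outputs layer: one lazily created writer per named output.
    static final class Outputs {
        private final Map<String, String> jobConf;
        final Map<String, Writer> writers = new HashMap<>();

        Outputs(Map<String, String> jobConf) {
            this.jobConf = jobConf;
        }

        void write(String namedOutput, String key, String value) {
            writers.computeIfAbsent(namedOutput, n -> new Writer(configFor(jobConf, n)))
                   .write(key, value);
        }
    }

    public static void main(String[] args) {
        Outputs out = new Outputs(new HashMap<>());
        // One reducer writing a value and its index entry to two column families.
        out.write("Users", "user42", "name:alice");
        out.write("UsersByName", "alice", "id:user42");
        for (Writer w : out.writers.values()) {
            System.out.println(w.columnFamily + " " + w.rows);
        }
    }
}
```

Because the writer takes its column family from the same key the multiple-outputs layer sets, the reducer only has to name the output on each write, and the shared job configuration is never modified.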

The bottom line is that this will work if you run Hadoop trunk and Cassandra trunk with both patches
applied. The original patch (trunk-4208.txt) can be used as a temporary solution if needed.
                
> ColumnFamilyOutputFormat should support writing to multiple column families
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4208
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4208
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>    Affects Versions: 1.1.0
>            Reporter: Robbie Strickland
>         Attachments: trunk-4208-v2.txt, trunk-4208.txt
>
>
> It is not currently possible to output records to more than one column family in a single
> reducer. Considering that writing values to Cassandra often involves multiple column families
> (e.g. updating your index when you insert a new value), this seems overly restrictive. I
> am submitting a patch that moves the specification of the column family from the job configuration
> to the write() call in ColumnFamilyRecordWriter.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
