cassandra-commits mailing list archives

From "Robbie Strickland (JIRA)" <j...@apache.org>
Subject [jira] [Issue Comment Edited] (CASSANDRA-4208) ColumnFamilyOutputFormat should support writing to multiple column families
Date Thu, 10 May 2012 13:21:49 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-4208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272313#comment-13272313 ]

Robbie Strickland edited comment on CASSANDRA-4208 at 5/10/12 1:21 PM:
-----------------------------------------------------------------------

I've attached a patch that adds a setOutputColumnFamily() overload that takes both the keyspace
and the CF. The one outstanding issue, which I've commented on in CFOF (ColumnFamilyOutputFormat),
is that checkOutputSpecs() cannot currently ensure that a CF has been specified, either through
setOutputColumnFamily() or through MultipleOutputs.

Unfortunately MultipleOutputs.getNamedOutputsList(), which would be the right way to do this,
is currently private. So we either skip the check and let it throw an NPE at runtime, or we
duplicate the code in MultipleOutputs to read the values from the config ourselves. I'm not
sure which is the lesser of two evils.
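
To make the second option concrete, here is a minimal sketch of what duplicating the lookup could look like. It is not the attached patch. The job configuration is modelled as a plain Map so the example stands alone, and both key names are assumptions that should be checked against the Hadoop and Cassandra versions in use. The class name, method names, and exception type are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;

// Sketch only: the job configuration is stood in for by a Map<String, String>.
final class OutputSpecCheck {
    // ASSUMPTION: key under which MultipleOutputs stores its space-separated named outputs.
    static final String NAMED_OUTPUTS_KEY = "mapreduce.multipleoutputs";
    // ASSUMPTION: key under which the default output column family is stored.
    static final String OUTPUT_CF_KEY = "cassandra.output.columnfamily";

    // Re-implements the lookup that the private getNamedOutputsList() would provide.
    static List<String> namedOutputs(Map<String, String> conf) {
        List<String> names = new ArrayList<>();
        StringTokenizer tokens =
            new StringTokenizer(conf.getOrDefault(NAMED_OUTPUTS_KEY, ""), " ");
        while (tokens.hasMoreTokens()) {
            names.add(tokens.nextToken());
        }
        return names;
    }

    // Fails at job submission instead of with an NPE in the reducer.
    static void checkOutputSpecs(Map<String, String> conf) {
        String cf = conf.get(OUTPUT_CF_KEY);
        boolean hasDefaultCf = cf != null && !cf.trim().isEmpty();
        if (!hasDefaultCf && namedOutputs(conf).isEmpty()) {
            throw new IllegalStateException(
                "no output column family set and no named outputs configured");
        }
    }
}
```

The cost of this approach is the one already noted: the parsing logic and the key name are copied from MultipleOutputs, so they can silently drift if Hadoop changes them.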
                
> ColumnFamilyOutputFormat should support writing to multiple column families
> ---------------------------------------------------------------------------
>
>                 Key: CASSANDRA-4208
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4208
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>    Affects Versions: 1.1.0
>            Reporter: Robbie Strickland
>         Attachments: cassandra-1.1-4208-v2.txt, cassandra-1.1-4208.txt, trunk-4208-v2.txt, trunk-4208.txt
>
>
> It is not currently possible to output records to more than one column family in a single
> reducer. Considering that writing values to Cassandra often involves multiple column families
> (e.g. updating your index when you insert a new value), this seems overly restrictive. I
> am submitting a patch that moves the specification of the column family from the job
> configuration to the write() call in ColumnFamilyRecordWriter.
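
As an illustration of the idea in the description, the following sketch shows a writer whose target column family is named on each write rather than fixed once per job. It is a simplified stand-in, not ColumnFamilyRecordWriter itself: the class, the method signatures, and the string-based rows are all hypothetical, and nothing is sent to Cassandra.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a record writer that accepts the column family per write.
final class MultiCfWriterSketch {
    // Rows buffered per column family, in first-seen order.
    private final Map<String, List<String>> pending = new LinkedHashMap<>();

    // The column family is an argument of write(), not a job-wide setting.
    void write(String columnFamily, String rowKey, String value) {
        if (columnFamily == null || columnFamily.isEmpty()) {
            throw new IllegalArgumentException("column family must be given on every write");
        }
        pending.computeIfAbsent(columnFamily, cf -> new ArrayList<>())
               .add(rowKey + "=" + value);
    }

    Map<String, List<String>> pending() {
        return pending;
    }
}
```

With this shape, a single reducer could insert a value and update its index in the same pass, for example `write("users", "u1", "alice")` followed by `write("users_by_email", "alice@example.com", "u1")`.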
