cassandra-commits mailing list archives

From "Alex Liu (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-5867) The Pig CqlStorage/AbstractCassandraStorage classes don't handle collection types
Date Thu, 15 Aug 2013 16:53:01 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-5867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13741171#comment-13741171 ]

Alex Liu edited comment on CASSANDRA-5867 at 8/15/13 4:52 PM:
--------------------------------------------------------------

The keys of a Cassandra map are converted to strings when the map is put into a Pig map. We
need to decide whether to use a tuple of tuples or a map for the Cassandra map type. A tuple
of tuples is more general than a map, and a map is more specific. HBase uses a map to
represent its rows. So which one should we use for the map type: map or tuple of tuples?
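
To make the trade-off concrete, here is a minimal sketch in plain Java with no Pig dependency. The class and the helpers toPigStyleMap and toTupleOfTuples are hypothetical names used only for illustration, and java.util.Map and java.util.List stand in for Pig's map and tuple types. It assumes, as described above, that map keys are converted to strings in the map form.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: hypothetical names, stdlib collections stand in for Pig types.
final class MapRepresentationSketch {

    // Map form: every key is converted to a String, so the original key type is lost.
    static Map<String, Object> toPigStyleMap(Map<?, ?> cassandraMap) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<?, ?> entry : cassandraMap.entrySet()) {
            result.put(String.valueOf(entry.getKey()), entry.getValue());
        }
        return result;
    }

    // Tuple-of-tuples form: each entry becomes a (key, value) pair and the key keeps its type.
    static List<List<Object>> toTupleOfTuples(Map<?, ?> cassandraMap) {
        List<List<Object>> result = new ArrayList<>();
        for (Map.Entry<?, ?> entry : cassandraMap.entrySet()) {
            List<Object> pair = new ArrayList<>();
            pair.add(entry.getKey());
            pair.add(entry.getValue());
            result.add(pair);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> source = new LinkedHashMap<>();
        source.put(1, "one");
        source.put(2, "two");
        System.out.println(toPigStyleMap(source));    // keys are now Strings
        System.out.println(toTupleOfTuples(source));  // keys are still Integers
    }
}
```

With an int-keyed map, the map form can only be looked up by the string "1", while the tuple form still carries the Integer key. That is the sense in which the tuple of tuples is the more general representation.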
                
      was (Author: alexliu68):
    The key of Cassandra map has been convert to string to put into Pig map. We need to decide
whether use tuple of tuples vs map for Cassandra map type. Tuple of tuples is more general
than map, and map is more specific. HBase uses map to map its row. So which one to use for
map? Map vs Tuple of tuples?
                  
> The Pig CqlStorage/AbstractCassandraStorage classes don't handle collection types
> ---------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-5867
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-5867
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Jeremy Hanna
>            Assignee: Alex Liu
>              Labels: pig
>         Attachments: 5867-1.2-branch.txt, 5867-2-1.2-branch.txt
>
>
> The CqlStorage class gets the Pig data type for values from the AbstractCassandraStorage
> class, in the getPigType method.  If it isn't a known data type, it makes the value into a
> ByteArray.  Currently there aren't any cases there for lists, maps, and sets.
> https://github.com/apache/cassandra/blob/cassandra-1.2.8/src/java/org/apache/cassandra/hadoop/pig/AbstractCassandraStorage.java#L336
> See this describe output from the grunt shell:
> {code}
> grunt> describe listdata ;
> listdata: {id: (name: chararray,value: int),alist: (name: chararray,value: bytearray),amap: (name: chararray,value: bytearray),aset: (name: chararray,value: bytearray)}
> {code}
> where the cql data structures had this schema:
> {code}
> CREATE TABLE alltypes (
>   id int PRIMARY KEY,
>   alist list<text>,
>   amap map<text, text>,
>   aset set<text>
> );
> {code}
> It turns out that if you cast the map in grunt to a Pig map, then it sort of works, but
> I don't think we should use a Pig map.  Lists don't appear to work at all, as there
> is no Pig analogue.  I *think* you could probably just write a UDF to cast these things, but
> we already have all of the type information, so we just need to convert them to tuples or
> bags or whatever.
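
The missing cases described above can be sketched as follows. This is not the actual getPigType method: the real one works on Cassandra type objects and Pig's own type codes, whereas this stand-alone sketch takes a CQL type name as a string and uses a hypothetical PigType enum. Sending all three collection types to TUPLE is only one possible choice, assumed here for illustration, since the issue leaves open whether tuples, bags, or maps should be used.

```java
import java.util.Locale;

// Illustration only: hypothetical names, not the Cassandra or Pig API.
final class CollectionTypeSketch {

    // Hypothetical stand-in for Pig's type codes.
    enum PigType { CHARARRAY, INTEGER, TUPLE, BYTEARRAY }

    // Maps a CQL type name to a Pig type, falling back to BYTEARRAY for unknown types.
    static PigType pigTypeFor(String cqlType) {
        String t = cqlType.trim().toLowerCase(Locale.ROOT);
        if (t.equals("text")) {
            return PigType.CHARARRAY;
        }
        if (t.equals("int")) {
            return PigType.INTEGER;
        }
        // Assumed choice: collections become tuples instead of falling through to BYTEARRAY.
        if (t.startsWith("list<") || t.startsWith("set<") || t.startsWith("map<")) {
            return PigType.TUPLE;
        }
        return PigType.BYTEARRAY;
    }

    public static void main(String[] args) {
        for (String type : new String[] {"int", "list<text>", "map<text, text>", "set<text>", "blob"}) {
            System.out.println(type + " -> " + pigTypeFor(type));
        }
    }
}
```

Without the collection branch, the three collection columns of the alltypes table would fall through to BYTEARRAY, which matches the describe output quoted in the issue.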

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
