cassandra-commits mailing list archives

From "Jeremy Hanna (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-2658) Pig + CassandraStorage should work when trying to cast data after it's loaded
Date Mon, 16 May 2011 16:19:53 GMT
Pig + CassandraStorage should work when trying to cast data after it's loaded
-----------------------------------------------------------------------------

                 Key: CASSANDRA-2658
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2658
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.7.5
            Reporter: Jeremy Hanna
            Priority: Minor


We currently do a lot with Pig + Cassandra, but one thing I've found is that it's
very touchy about data that comes from Cassandra.  For example, if I try to do
a SUM over data that has not been validated as a LongType in Cassandra, it fails.  See this
schema script for Cassandra - https://github.com/jeromatron/pygmalion/blob/master/cassandra/example_data.txt
- then remove the validation on the num_heads data type and try to SUM over that data:
it gives data type errors.  (It breaks with the num_heads validation removed, with or without
the default_validation class being set.)
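
To illustrate the kind of failure described above, here is a minimal Python sketch (not Pig or CassandraStorage code). It assumes, and this report does not confirm it, that an unvalidated column value reaches the script as raw bytes holding an 8-byte big-endian long. The helper decode_long and the sample values are hypothetical.

```python
import struct

def decode_long(raw: bytes) -> int:
    """Interpret raw bytes as a signed 8-byte big-endian long (assumed encoding)."""
    if len(raw) != 8:
        raise ValueError(f"expected 8 bytes, got {len(raw)}")
    return struct.unpack(">q", raw)[0]

# Hypothetical stand-ins for unvalidated column values read back as raw bytes.
raw_values = [struct.pack(">q", n) for n in (1, 2, 3)]

# Summing the raw bytes directly is a type error, loosely analogous to
# SUM failing on data that has no declared numeric type.
try:
    sum(raw_values)
    raw_sum_failed = False
except TypeError:
    raw_sum_failed = True

# Decoding (the "cast") first makes the aggregation work.
total = sum(decode_long(v) for v in raw_values)
```

The point of the sketch is only that the cast has to succeed on untyped bytes before an aggregate can run; how CassandraStorage should do that is what this issue asks to have fixed.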

We currently do analysis over data that is either just String (UTF8) data or that we have
validated, so it works for us.  However, I've seen a couple of people trying to use Cassandra
with Pig who have had issues because of this.  One of the tenets of Pig is that it will
eat anything, and it goes against this if the load/store interferes with that.
So in essence, I think this is a big deal for those wanting to use Pig with Cassandra in
the ways that Pig is normally used.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
