pig-dev mailing list archives

From "David Arthur (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (PIG-2611) HBaseStorage not casting correctly
Date Fri, 23 Mar 2012 14:37:29 GMT
HBaseStorage not casting correctly
----------------------------------

                 Key: PIG-2611
                 URL: https://issues.apache.org/jira/browse/PIG-2611
             Project: Pig
          Issue Type: Bug
    Affects Versions: 0.9.2
         Environment: Ubuntu 11.10, Hadoop 0.20.2, HBase 0.92.0
            Reporter: David Arthur
            Priority: Minor


When storing data into HBase with HBaseStorage, I see unexpected behavior regarding the record
schema and casting.

Here is the relevant code snippet:
{code}
B = group A by (time_tuple, some_scalar);
C = foreach B {
	-- UDF to generate id (bytearray)
	generate id, flatten(group.$0), COUNT(A);
}
{code}

At this point the schema for C is unknown, so I declare a schema with a foreach statement:

{code}
D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int, $3 as date:int,
$4 as count:int;
{code}

Even though I've declared C.$4 as an int, the value is still a long (from the COUNT). When I go to
insert into HBase I get a ClassCastException, since the schema (int) does not match the actual
tuple value (long). I can fix this by explicitly casting when I declare the schema:

{code}
D = foreach C generate $0 as id:bytearray, $1 as year:int, $2 as month:int, $3 as date:int,
(int)$4 as count:int;
{code}
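
For illustration only, here is a minimal Java sketch of the kind of failure described above. It is not taken from the HBaseStorage source. It assumes the storer casts each field to the Java type implied by the declared schema, and the class and method names are hypothetical.

```java
public class SchemaCastSketch {

    // Hypothetical stand-in for a storer that trusts the declared schema (int)
    // and casts the field object directly. Throws ClassCastException for a Long.
    static int readAsDeclaredInt(Object field) {
        return (Integer) field;
    }

    // Explicit numeric conversion, analogous to writing (int)$4 in the script.
    static int convertToInt(Object field) {
        return ((Number) field).intValue();
    }

    // Returns true if reading the field as the declared int type fails.
    static boolean failsAsDeclaredInt(Object field) {
        try {
            readAsDeclaredInt(field);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // A count is carried as a Long even when the schema says int.
        Object count = Long.valueOf(3L);

        System.out.println(failsAsDeclaredInt(count)); // prints "true"
        System.out.println(convertToInt(count));       // prints "3"
    }
}
```

The direct cast fails because a boxed Long is never an Integer in Java, whatever the schema declares. The explicit conversion succeeds because it changes the value itself.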

Is this expected behavior? If not, is this an HBaseStorage issue, where it casts values
without first reconciling them with the declared schema?

Cheers,
David


        
