incubator-hcatalog-dev mailing list archives

From "Travis Crawford (JIRA)" <>
Subject [jira] [Commented] (HCATALOG-425) Pig cannot read/write SMALLINT/TINYINT columns
Date Thu, 30 Aug 2012 17:41:08 GMT


Travis Crawford commented on HCATALOG-425:

Ideally we can keep things simple, and avoid cases where HCatSchema/HCatRecord differ. Let's
walk through an example reading with Pig.

Initially, Pig is going to ask HCat for the schema of the relation being loaded. This means
querying the metastore, converting the table schema into an hcat schema, then converting the
hcat schema into a pig schema. If we implement conversions in the hive-->hcat schema layer,
pig always sees records in data types it has support for.
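
As a rough sketch of that schema path (not HCatalog's actual code), the snippet below widens
TINYINT/SMALLINT to INT in the hive-->hcat step, so the pig schema only ever contains types
Pig supports. The function names, the PROMOTIONS table, and the string type names are
hypothetical and chosen only for illustration.

```python
# Hypothetical promotion rules applied in the hive-->hcat schema layer.
PROMOTIONS = {"tinyint": "int", "smallint": "int"}

# Simplified hcat-->pig type mapping (only the types used in this example).
HCAT_TO_PIG = {"int": "int", "bigint": "long", "string": "chararray", "double": "double"}


def hive_to_hcat_schema(hive_columns, promote=True):
    """Convert (name, hive_type) pairs to an hcat-style schema.

    When promote is True, small integer types are widened to int.
    """
    return [(name, PROMOTIONS.get(t, t) if promote else t) for name, t in hive_columns]


def hcat_to_pig_schema(hcat_columns):
    """Convert an hcat-style schema to a pig-style schema."""
    return [(name, HCAT_TO_PIG[t]) for name, t in hcat_columns]


hive_table = [("id", "bigint"), ("age", "tinyint"), ("port", "smallint"), ("name", "string")]
hcat_schema = hive_to_hcat_schema(hive_table)
pig_schema = hcat_to_pig_schema(hcat_schema)
print(pig_schema)
```

Because the promotion happens before the hcat-->pig step, the pig mapping never needs an
entry for tinyint or smallint.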

Now pig reads a record through HCat. HCat reads a record from the hive serde, and converts
to an hcat record using whatever conversion rules have been enabled. This record is converted
to a pig tuple that matches the expected schema.
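
The read path could look roughly like the sketch below. It is an illustration only: ctypes
c_byte and c_short stand in for the narrow values a serde might hand back, and the converter
table and function names are hypothetical, not part of HCatalog.

```python
from ctypes import c_byte, c_short

# Hypothetical read-side conversion rules: widen narrow integers to a plain int.
READ_CONVERTERS = {
    "tinyint": lambda v: int(v.value),
    "smallint": lambda v: int(v.value),
}


def serde_row_to_hcat_record(hive_types, row):
    """Apply the enabled conversion rules column by column; other types pass through."""
    return [READ_CONVERTERS.get(t, lambda v: v)(v) for t, v in zip(hive_types, row)]


def hcat_record_to_pig_tuple(record):
    """The record already matches the promoted schema, so this step is a plain copy."""
    return tuple(record)


hive_types = ["tinyint", "smallint", "string"]
serde_row = [c_byte(7), c_short(300), "alice"]  # stand-ins for serde output
pig_tuple = hcat_record_to_pig_tuple(serde_row_to_hcat_record(hive_types, serde_row))
print(pig_tuple)
```

The point of the sketch is that the conversion happens once, when the hcat record is built,
so the later record-to-tuple step needs no knowledge of the original column types.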

Now let's write something. Pig will provide a tuple that we need to write into a table that
might have a different schema. When converting the pig tuple into an hcat record, we apply
conversion rules "on the way out" so that our hcat record and hcat schema match.
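
A minimal sketch of the write side follows, assuming the runtime range check suggested in
the issue description. The RANGES table and the function name are hypothetical; only the
TINYINT and SMALLINT bounds themselves are standard.

```python
# Value ranges of the narrow integer column types.
RANGES = {"tinyint": (-128, 127), "smallint": (-32768, 32767)}


def pig_tuple_to_hcat_record(table_types, pig_tuple):
    """Narrow pig values "on the way out", rejecting values that do not fit the column."""
    if len(table_types) != len(pig_tuple):
        raise ValueError("tuple arity does not match table schema")
    record = []
    for col_type, value in zip(table_types, pig_tuple):
        bounds = RANGES.get(col_type)
        if bounds is not None:
            low, high = bounds
            if not low <= value <= high:
                raise ValueError(f"{value} is out of range for {col_type}")
        record.append(value)
    return record


table_types = ["tinyint", "smallint", "string"]
record = pig_tuple_to_hcat_record(table_types, (7, 300, "alice"))
print(record)
```

A value such as 128 written to a tinyint column raises ValueError here instead of being
silently truncated.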

I believe that if we follow this approach, the schema and records will always match, and we
can avoid having to keep track of original data types, whether fields have been converted,
etc. I do agree that if we need lots of these, a "conversion strategy impl" would start to
make sense. I'm not sure we'll get to that place though; there are just a handful of
conversions I know about.

> Pig cannot read/write SMALLINT/TINYINT columns
> ----------------------------------------------
>                 Key: HCATALOG-425
>                 URL:
>             Project: HCatalog
>          Issue Type: Bug
>          Components: pig
>    Affects Versions: 0.4
>            Reporter: Thejas M Nair
>            Assignee: Travis Crawford
>             Fix For: 0.5
>         Attachments: HCATALOG-425_small_tiny_int.1.patch, HCATALOG-425_small_tiny_int.2.patch,
> Currently this throws an exception. We can always allow reads, and on the write side we
> can do an out-of-bounds check at runtime.
> This issue, described in HCATALOG-168, has not been fixed. It still throws an exception.
