pig-dev mailing list archives

From "Alan Gates (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2889) HBaseAvroStorage UDF
Date Thu, 23 Aug 2012 16:05:43 GMT

    [ https://issues.apache.org/jira/browse/PIG-2889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13440406#comment-13440406

Alan Gates commented on PIG-2889:

Hive has an AvroSerDe, which I believe can read the schema on the fly like this.  This should just
work with HCat.  In theory it should work with HBase as well, since SerDes are independent
of the InputFormat/OutputFormat and storage handlers in Hive/HCat.  This would all need to be tested.

All that said, there's nothing to prevent you from doing it as you propose in Pig without
> HBaseAvroStorage UDF
> --------------------
>                 Key: PIG-2889
>                 URL: https://issues.apache.org/jira/browse/PIG-2889
>             Project: Pig
>          Issue Type: New Feature
>          Components: data, piggybank
>    Affects Versions: 0.11
>            Reporter: Russell Jurney
>            Assignee: Russell Jurney
>             Fix For: 0.11
> I want to use HBaseStorage without specifying the schema. Storing data in Avro format
> in HBase is a very common practice. I would like to create a UDF, HBaseAvroStorage that works
> just like the internal HBaseStorage UDF, but loads the Avro schema metadata so that specifying
> a schema is unnecessary.
> I haven't thought through all the particulars, so if you have - please chime in :)
> I am also not sure if this isn't sort of handled some place in HCatalog?
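
To illustrate the idea in the issue description (deriving the load schema from Avro metadata rather than typing it by hand), here is a minimal sketch in plain Python. It is not Pig or Avro library code: the function names and the type mapping are assumptions made for illustration, and it handles only flat records with primitive or nullable-primitive fields. An actual UDF would presumably do the equivalent in Java inside the loader.

```python
import json

# Assumed mapping from Avro primitive type names to Pig type names.
AVRO_TO_PIG = {
    "string": "chararray",
    "bytes": "bytearray",
    "int": "int",
    "long": "long",
    "float": "float",
    "double": "double",
    "boolean": "boolean",
}


def pig_type(avro_type):
    """Return a Pig type name for a primitive or nullable-primitive Avro type."""
    # A union such as ["null", "string"] is treated as its single non-null branch.
    if isinstance(avro_type, list):
        non_null = [t for t in avro_type if t != "null"]
        if len(non_null) != 1:
            raise ValueError("unsupported union: %r" % (avro_type,))
        return pig_type(non_null[0])
    if isinstance(avro_type, str) and avro_type in AVRO_TO_PIG:
        return AVRO_TO_PIG[avro_type]
    # Nested records, arrays, maps, etc. are out of scope for this sketch.
    raise ValueError("unsupported Avro type: %r" % (avro_type,))


def avro_record_to_pig_schema(avro_schema_json):
    """Turn an Avro record schema (JSON text) into a Pig-style 'name:type' list."""
    schema = json.loads(avro_schema_json)
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    return ", ".join(
        "%s:%s" % (field["name"], pig_type(field["type"]))
        for field in schema["fields"]
    )


if __name__ == "__main__":
    # Hypothetical schema, used only to show the conversion.
    example = json.dumps({
        "type": "record",
        "name": "User",
        "fields": [
            {"name": "id", "type": "long"},
            {"name": "email", "type": ["null", "string"]},
        ],
    })
    print(avro_record_to_pig_schema(example))
```

The resulting string has the same shape as the schema a user would otherwise write after AS in a LOAD statement, which is the part the proposed UDF would make unnecessary.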


