pig-dev mailing list archives

From "Jarek Jarcec Cecho (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-3390) Make pig working with HBase 0.95
Date Tue, 23 Jul 2013 17:28:48 GMT

     [ https://issues.apache.org/jira/browse/PIG-3390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jarek Jarcec Cecho updated PIG-3390:
------------------------------------

    Attachment: PIG-3390.patch

I'm attaching a preliminary patch that gets basic support for HBase 0.95 working. I'm not changing
the status to {{Patch available}} as the patch is not yet ready for commit. Nevertheless, I would
appreciate any feedback.

I've tweaked the {{ivy}} configuration to include two HBase profiles, one for HBase 0.94- and a second for
0.95+. It seems the transitive dependencies of 0.95 are not currently resolved properly, so I
had to temporarily specify all of them manually (this seems to be tracked by HBASE-8488).

For the missing APIs:

* {{Scan.write(DataOutput)}}: It seems that we used this to manually serialize the {{Scan}}
into the MapReduce job. I've used {{TableInputFormat}} to do that for us. This approach seems to work
for both 0.94- and 0.95+.
* {{Mutation.setWriteToWAL(Boolean)}} was superseded by {{Mutation.setDurability(Durability)}}.
Unfortunately, I did not find a clean way to work around this API change. The current patch uses
reflection to detect the HBase version and to call the proper API.
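
A minimal sketch of the reflection approach described in the second bullet, not the code from the attached patch. It probes the runtime class for {{setDurability}} and falls back to {{setWriteToWAL}}. The classes {{OldMutation}}, {{NewMutation}} and the two-value {{Durability}} enum are hypothetical stand-ins so the example runs without HBase on the classpath; {{DurabilityCompat}} and {{skipWal}} are illustrative names as well.

```java
import java.lang.reflect.Method;

final class DurabilityCompat {

    // Hypothetical stand-in for a 0.94-style mutation exposing the boolean setter.
    public static class OldMutation {
        public boolean writeToWAL = true;
        public void setWriteToWAL(boolean write) { this.writeToWAL = write; }
    }

    // Hypothetical stand-in for the 0.95-style durability enum (reduced to two values).
    public enum Durability { SKIP_WAL, SYNC_WAL }

    // Hypothetical stand-in for a 0.95-style mutation exposing the enum setter.
    public static class NewMutation {
        public Durability durability = Durability.SYNC_WAL;
        public void setDurability(Durability d) { this.durability = d; }
    }

    // Disables WAL writes through whichever setter the runtime class offers
    // and returns the name of the method that was invoked.
    static String skipWal(Object mutation) {
        try {
            for (Method m : mutation.getClass().getMethods()) {
                if (m.getName().equals("setDurability") && m.getParameterTypes().length == 1) {
                    // Resolve the enum constant by name so this code never links
                    // against the Durability type at compile time.
                    Class<?> durabilityType = m.getParameterTypes()[0];
                    m.invoke(mutation, enumConstant(durabilityType, "SKIP_WAL"));
                    return "setDurability";
                }
            }
            // Older API: fall back to the boolean setter.
            Method old = mutation.getClass().getMethod("setWriteToWAL", boolean.class);
            old.invoke(mutation, false);
            return "setWriteToWAL";
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("No known WAL setter on " + mutation.getClass(), e);
        }
    }

    // Looks up an enum constant by name on a type known only at runtime.
    private static Object enumConstant(Class<?> enumType, String name) {
        Object[] constants = enumType.getEnumConstants();
        if (constants != null) {
            for (Object c : constants) {
                if (((Enum<?>) c).name().equals(name)) {
                    return c;
                }
            }
        }
        throw new IllegalStateException(enumType + " has no constant " + name);
    }

    public static void main(String[] args) {
        System.out.println(skipWal(new OldMutation()));
        System.out.println(skipWal(new NewMutation()));
    }
}
```

Probing for the method rather than parsing a version string keeps the check tied to the capability that actually matters. In a real storage class the {{Method}} lookup would likely be done once and cached instead of repeated per mutation.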

To test it out, you can use the following commands:

{code}
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=95
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=94 # default
{code}

Using Hadoop 2 won't currently work as the HBase artifacts for Hadoop 2 are not published,
but in the future it should work the following way:

{code}
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=95 -Dhadoopversion=23 -Dhbasecompat=2
ant clean test -Dtestcase=TestHBaseStorage -Dhbaseversion=94 -Dhadoopversion=23 -Dhbasecompat=2
{code}

I'll be more than happy to hear any feedback on my approach!
                
> Make pig working with HBase 0.95
> --------------------------------
>
>                 Key: PIG-3390
>                 URL: https://issues.apache.org/jira/browse/PIG-3390
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jarek Jarcec Cecho
>            Assignee: Jarek Jarcec Cecho
>         Attachments: PIG-3390.patch
>
>
> HBase 0.95 changed its API in an incompatible way. The following APIs that {{HBaseStorage}} in Pig uses are no longer available:
> * {{Mutation.setWriteToWAL(Boolean)}}
> * {{Scan.write(DataOutput)}}
> In addition, HBase is no longer available as one monolithic archive with the entire functionality, but was broken down into smaller pieces such as {{hbase-client}}, {{hbase-server}}, ...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
