hadoop-pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-660) Integration with Hadoop 0.20
Date Wed, 05 Aug 2009 02:44:15 GMT

     [ https://issues.apache.org/jira/browse/PIG-660?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Dmitriy V. Ryaboy updated PIG-660:

    Attachment: pig_660_shims.patch

The attached patch, pig_660_shims.patch, introduces a compatibility layer similar to the one in
https://issues.apache.org/jira/browse/HIVE-487 . HadoopShims.java contains wrappers that hide
interface differences between Hadoop 18 and 20; when an interface change affects Pig, a shim
is added to this class and used by Pig.

Separate versions of the shims are maintained for different Hadoop versions.
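
To illustrate the general idea of a shim layer, here is a minimal sketch. It is not taken from pig_660_shims.patch: every name in it (ShimSketch, Hadoop18Shims, Hadoop20Shims, forVersion, the shim.hadoop.version property) is hypothetical. It also picks the shim at runtime so that it fits in one file, whereas the patch selects the shim version at build time through an ant property.

```java
// Hypothetical sketch of a shim layer; none of these names come from the patch.
final class ShimSketch {

    /** One method per Hadoop interface difference that the caller cares about. */
    interface HadoopShims {
        String hadoopVersion();
        boolean supportsZebraStorage();
    }

    /** Shim for Hadoop 18: newer features such as Zebra storage are unavailable. */
    static final class Hadoop18Shims implements HadoopShims {
        public String hadoopVersion() { return "18"; }
        public boolean supportsZebraStorage() { return false; }
    }

    /** Shim for Hadoop 20. */
    static final class Hadoop20Shims implements HadoopShims {
        public String hadoopVersion() { return "20"; }
        public boolean supportsZebraStorage() { return true; }
    }

    /** Runtime selection, used here only to keep the sketch self-contained. */
    static HadoopShims forVersion(String version) {
        switch (version) {
            case "18": return new Hadoop18Shims();
            case "20": return new Hadoop20Shims();
            default:
                throw new IllegalArgumentException("No shim for Hadoop version " + version);
        }
    }

    public static void main(String[] args) {
        // "shim.hadoop.version" is an assumed property name, not one defined by Pig.
        String version = System.getProperty("shim.hadoop.version", "20");
        HadoopShims shims = forVersion(version);
        System.out.println("Hadoop " + shims.hadoopVersion()
                + ", Zebra storage available: " + shims.supportsZebraStorage());
    }
}
```

Calling code depends only on the HadoopShims interface, so supporting another Hadoop version means adding one more implementation rather than patching the callers.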

This way, Pig users can compile against either Hadoop 18 or Hadoop 20 by simply changing an
ant property, either via the -D flag or in build.properties, instead of having to go through
the process of patching.

There has been discussion of officially moving Pig to 0.20; this way, we sidestep the whole
question, and only need to worry about version compatibility when using specific Hadoop APIs.

I propose that we use this mechanism until Pig is moved to use the new, future-proofed API.

Pig compiled against 18 won't be able to use some of the newest features, such as Zebra storage.
Ant can be configured not to build Zebra if the Hadoop version is < 20.

> Integration with Hadoop 0.20
> ----------------------------
>                 Key: PIG-660
>                 URL: https://issues.apache.org/jira/browse/PIG-660
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.2.0
>         Environment: Hadoop 0.20
>            Reporter: Santhosh Srinivasan
>            Assignee: Santhosh Srinivasan
>             Fix For: 0.4.0
>         Attachments: PIG-660-for-branch-0.3.patch, PIG-660.patch, PIG-660_1.patch, PIG-660_2.patch,
PIG-660_3.patch, PIG-660_4.patch, PIG-660_5.patch, pig_660_shims.patch
> With Hadoop 0.20, it will be possible to query the status of each map and reduce in a
map reduce job. This will allow better error reporting. Some of the other items that could
be on Hadoop's feature requests/bugs are documented here for tracking.
> 1. Hadoop should return objects instead of strings when exceptions are thrown
> 2. The JobControl should handle all exceptions and report them appropriately. For example,
when the JobControl fails to launch jobs, it should handle exceptions appropriately and should
support APIs that query this state, i.e., failure to launch jobs.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
