pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2713) Pig query planner throwing parse error on Joins
Date Wed, 06 Jun 2012 05:33:23 GMT

    [ https://issues.apache.org/jira/browse/PIG-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13289960#comment-13289960 ]

Dmitriy V. Ryaboy commented on PIG-2713:

Harsh: "." is an operator that dereferences inside a tuple (or generates a projection, for
bags). "::" is not an operator; it is just two colons that form part of a field name. If you
do a "describe" on the relation, you will see that the field names after a join are made up
of the name of the original relation, "::", and the original field name.
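
For illustration, here is a minimal sketch of what that looks like. The relations A and B,
their file names, and their schemas are hypothetical and are not taken from the ticket.

```pig
-- Hypothetical inputs; names and schemas are made up for this example.
A = LOAD 'a.txt' AS (id:int, name:chararray);
B = LOAD 'b.txt' AS (id:int, value:int);

-- Join the two relations on their "id" fields.
J = JOIN A BY id, B BY id;

-- Each field of J is now named <original relation>::<original field>.
DESCRIBE J;
-- Should print something like:
-- J: {A::id: int,A::name: chararray,B::id: int,B::value: int}
```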

Pig does some disambiguation when possible, so that if you say "bar" and there is a field
called "foo::bar", it will know what you are talking about -- as long as there is only one
bar, of course.
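
A small sketch of that disambiguation, again using hypothetical relations A and B that both
happen to have an "id" field:

```pig
-- Hypothetical inputs; names and schemas are made up for this example.
A = LOAD 'a.txt' AS (id:int, name:chararray);
B = LOAD 'b.txt' AS (id:int, value:int);
J = JOIN A BY id, B BY id;

-- "name" and "value" each match exactly one field (A::name, B::value),
-- so the short names are enough.
short_names = FOREACH J GENERATE name, value;

-- "id" matches both A::id and B::id, so it is ambiguous on its own
-- and has to be written with its "::" prefix.
qualified = FOREACH J GENERATE A::id AS id, name, value;
```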

The issue is muddied by the fact that Pig allows you to treat relations as scalars if they
have only one row. That means that while a "foreach" operator normally works on only one
relation, you can actually say "foreach some_relation generate some_field / some_other_relation.some_other_field".
In that case, Pig assumes "some_other_relation" is a single-row relation, pulls
"some_other_field" out of it, and uses the value here. It's a convenience syntax, since most
people don't think of doing a replicated join to do the same thing.

Sadly, after this feature was released, we discovered that people often make syntax mistakes
like the one you and Bejoy are making. Before the feature, such mistakes failed fast, which
quickly led to a fix. Now they fail in odd ways, because Pig thinks you are trying to treat
some other relation as a scalar rather than realizing you are just referencing fields
incorrectly. We have a ticket open to fix this problem and revert to a fail-fast mode.
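
To make the distinction concrete, here is a rough sketch of the intended scalar usage next to
the mistake. The relations, file names, and schemas are hypothetical, and the exact failure
you get from the commented-out line may differ between Pig versions.

```pig
-- Hypothetical inputs; names and schemas are made up for this example.
A = LOAD 'a.txt' AS (id:int, name:chararray);
B = LOAD 'b.txt' AS (id:int, value:int);

-- Intended use of the scalar feature: "totals" has exactly one row,
-- so "totals.total" can be used as a single value inside another foreach.
grouped = GROUP B ALL;
totals  = FOREACH grouped GENERATE SUM(B.value) AS total;
shares  = FOREACH B GENERATE id, (double)value / (double)totals.total AS share;

-- The mistake: after a join, "A.id" does not mean "the id field that came
-- from A". Pig reads it as "relation A used as a scalar, field id".
J = JOIN A BY id, B BY id;
-- wrong = FOREACH J GENERATE A.id, B.value;

-- What was meant: reference the joined fields by their "::" names.
right = FOREACH J GENERATE A::id, B::value;
```
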
> Pig query planner throwing parse error on Joins 
> ------------------------------------------------
>                 Key: PIG-2713
>                 URL: https://issues.apache.org/jira/browse/PIG-2713
>             Project: Pig
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.8.1, 0.9.2
>         Environment: CentOS 6
>            Reporter: Bejoy KS
> The Pig parser throws an exception when two columns in a table have the same name and
> are used in a projection after a join.
> Error message
> ERROR 1103: Merge join/Cogroup only supports Filter, Foreach, filter and Load as its
> predecessor. Found :
> The error is thrown for a common join as well.


