hadoop-pig-dev mailing list archives

From "Santhosh Srinivasan" <...@yahoo-inc.com>
Subject RE: Determining the group-by column
Date Tue, 17 Feb 2009 05:03:51 GMT
Cogroup has inner plans that compute the group-by attributes. Instead of
looking at the predecessor(s), you should navigate the inner plans of
the cogroup. Check out the visit(LOCogroup ...) method in
src/org/apache/pig/impl/logicalLayer/validators/TypeCheckingVisitor.java
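
To illustrate the idea, here is a minimal, self-contained sketch. It does
not use the real Pig classes: Operator, Project, InnerPlan, Cogroup and
groupByColumns are hypothetical stand-ins invented for this example, and
the actual accessors should be taken from the visit(LOCogroup ...) method
mentioned above. The point is only that the Project operators are held in
plans owned by the cogroup (one set per input), so they are reached
through the cogroup itself and not through getPredecessors on the outer
plan.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InnerPlanSketch {

    /** Hypothetical stand-in for a logical operator; only a name is needed here. */
    static class Operator {
        final String name;
        Operator(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for a Project operator living inside an inner plan. */
    static class Project extends Operator {
        final List<Integer> columns;
        Project(List<Integer> columns) { super("Project"); this.columns = columns; }
    }

    /** Hypothetical stand-in for an inner plan: just the operators it contains. */
    static class InnerPlan {
        final List<Operator> operators = new ArrayList<>();
    }

    /** Hypothetical stand-in for Cogroup: each input maps to its group-by inner plans. */
    static class Cogroup extends Operator {
        final Map<Operator, List<InnerPlan>> groupByPlans = new LinkedHashMap<>();
        Cogroup() { super("Cogroup"); }
    }

    /** Collects, per input operator name, the columns projected by its inner plans. */
    static Map<String, List<Integer>> groupByColumns(Cogroup cogroup) {
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        for (Map.Entry<Operator, List<InnerPlan>> entry : cogroup.groupByPlans.entrySet()) {
            List<Integer> columns = new ArrayList<>();
            for (InnerPlan plan : entry.getValue()) {
                for (Operator op : plan.operators) {
                    if (op instanceof Project) {
                        columns.addAll(((Project) op).columns);
                    }
                }
            }
            result.put(entry.getKey().name, columns);
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors the explain output in the question: a Cogroup over one
        // ForEach input, grouping on a projection of field 0.
        Operator forEach = new Operator("ForEach");
        InnerPlan plan = new InnerPlan();
        plan.operators.add(new Project(Arrays.asList(0)));

        Cogroup cogroup = new Cogroup();
        cogroup.groupByPlans.put(forEach, Arrays.asList(plan));

        System.out.println(groupByColumns(cogroup)); // prints {ForEach=[0]}
    }
}
```

In this toy model the outer plan only ever sees the ForEach as a
predecessor of the Cogroup, which matches what getPredecessors reports in
the question; the projection is found by walking the plans the cogroup
holds for that input.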

Santhosh

-----Original Message-----
From: Dmitriy Ryaboy [mailto:dvryaboy@gmail.com] 
Sent: Sunday, February 15, 2009 9:07 PM
To: pig-dev@hadoop.apache.org
Subject: Determining the group-by column

We are working on the Pig Logical Optimizer, and running into some
difficulty navigating the plan.

If we run explain on a query with a CoGroup, we get something like:

Cogroup
|    |
|    |-- Project [0]
|
|------ ForEach
            |  <etc>

What we want to do is determine that this particular Cogroup operates on
a projection of field 0.

If we create a new LogicalTransformer that is applied to Cogroup
operators, and call

mPlan.getPredecessors(ourCogroupOperator), we only get the ForEach.
Calling getSuccessors results in a null being returned (Cogroup is
indeed the root).

How do we find the Project operator above? What is its relationship,
plan-wise, with the Cogroup operator?

Thanks a lot,

Dmitriy Ryaboy, Ashutosh Chauhan, Tejal Desai
