hadoop-pig-dev mailing list archives

From "Antonio Magnaghi (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-32) Abstraction Layer to decouple Pig from Back-End
Date Thu, 31 Jan 2008 23:27:07 GMT

     [ https://issues.apache.org/jira/browse/PIG-32?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antonio Magnaghi updated PIG-32:
--------------------------------

    Attachment: TEST.LOG
                PATCH.2008.01.31

I have isolated the problem.

During the compilation of MR jobs, in some instances (for example, when a logical operator
is an LOEval; TestPigSplit has a long chain of 500 LOEvals) the copy method
is called on the compiled input. The copy method copies the input MR job by serializing
and then deserializing it.

In the current tree representation that we are using, each physical operator contains a pointer
to the global table of physical operators that defines the operator tree. In the initial implementation,
the copy method in the Abstraction Layer patch did not avoid a needless serialization/deserialization
of the opTable.

In this specific test case, this was causing a significant time overhead.
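
To illustrate the idea (this is not the code in the attached patch), here is a minimal Java sketch of a copy done through serialization that skips the shared table. The names CopySketch, Operator, key and copy are hypothetical and are not Pig classes; only opTable is taken from the description above. The sketch assumes the table can be marked transient and re-attached to the clone after deserialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class CopySketch {

    /** Hypothetical stand-in for a physical operator; not Pig's actual class. */
    public static class Operator implements Serializable {
        private static final long serialVersionUID = 1L;

        public final String key;

        // transient: Java serialization skips this field, so the shared
        // table is not written out and read back on every copy.
        public transient Map<String, Operator> opTable;

        public Operator(String key, Map<String, Operator> opTable) {
            this.key = key;
            this.opTable = opTable;
        }
    }

    /** Copies an operator via serialization, then re-attaches the shared table. */
    public static Operator copy(Operator original)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        Operator clone;
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            clone = (Operator) in.readObject();
        }
        // The transient field is null after deserialization; point the clone
        // at the same global table instead of a deserialized duplicate.
        clone.opTable = original.opTable;
        return clone;
    }

    public static void main(String[] args) throws Exception {
        // Build a table of 500 operators, mirroring the chain length
        // mentioned for TestPigSplit.
        Map<String, Operator> table = new HashMap<>();
        for (int i = 0; i < 500; i++) {
            String key = "eval-" + i;
            table.put(key, new Operator(key, table));
        }
        Operator clone = copy(table.get("eval-0"));
        System.out.println(clone.key + " shares table: "
                + (clone.opTable == table));
    }
}
```

Without the transient marker, every call to copy would serialize the whole table along with the one operator being copied, which is the kind of overhead described above for a chain of 500 operators.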

I have attached a patch that fixes the problem.

The unit tests pass and the unit test logs attached show execution times that seem to be in
line with the execution times before the AL patch.

I have also checked that the regression tests still pass:
=== Regression test results ===
tail /tmp/miners_test_harness_log_1201817146

[...]
Results so far, PASSED: 102 FAILED: 0 ABORTED: 0 FAILED DEPENDENCY: 0
Final results, PASSED: 102 FAILED: 0 ABORTED: 0 FAILED DEPENDENCY: 0
Finished test run at 1201821034


> Abstraction Layer to decouple Pig from Back-End
> -----------------------------------------------
>
>                 Key: PIG-32
>                 URL: https://issues.apache.org/jira/browse/PIG-32
>             Project: Pig
>          Issue Type: New Feature
>          Components: impl
>            Reporter: Antonio Magnaghi
>            Assignee: Antonio Magnaghi
>         Attachments: 2008.01.29.patch, DataStorage.diff, DataStorage20071212.diff, patch.2008.01.16.merge_w_trunk,
patch.2008.01.23.diff, PATCH.2008.01.31, patch2007_12_26.diff, patch2007_12_26_II.diff, patch2007_12_27.diff,
pig.jar, pig.jar.2008.01.16, TEST.LOG, TEST.LOGS
>
>
> I'm opening a new issue to track the development work to support an abstraction layer
for Pig as defined at http://wiki.apache.org/pig/PigAbstractionLayer

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

