pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Pi Song (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-114) store one alias/logicalPlan twice leads to instantiation of StoreFunc as LoadFunc
Date Wed, 09 Apr 2008 12:35:24 GMT

    [ https://issues.apache.org/jira/browse/PIG-114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12587169#action_12587169
] 

Pi Song commented on PIG-114:
-----------------------------

Alan, 

That block is basically just a caching hook-up, if the function is not reversible then it
will fall through to below which is actually compiling the operator and use it without reading
the cached output. My opinion toward the new plan compilation is to move something like this
to the optimization stage as an optional filter.

Here is the block including its context:-
{code}
    public OperatorKey compile(OperatorKey logicalKey, 
                               Map<OperatorKey, LogicalOperator> logicalOpTable, 
                               HExecutionEngine execEngine) throws IOException {
        
        // check to see if we have materialized results for the logical tree to
        // compile, if so, re-use them...
        //
        Map<OperatorKey, MapRedResult> materializedResults = execEngine.getMaterializedResults();
        
        MapRedResult materializedResult = materializedResults.get(logicalKey);
        
        if ( (materializedResult != null) && 
             (PigContext.instantiateFuncFromSpec(materializedResult.outFileSpec.getFuncSpec())

                                                           instanceof ReversibleLoadStoreFunc)
 )  {
            	POMapreduce pom = new POMapreduce(logicalKey.getScope(),
                             	                 nodeIdGenerator.getNextNodeId(logicalKey.getScope()),
                                	              execEngine.getPhysicalOpTable(),
                                	              logicalKey,
                                	              pigContext);

           		pom.addInputFile(materializedResult.outFileSpec);
            	pom.mapParallelism = Math.max(pom.mapParallelism, materializedResult.parallelismRequest);

            	return pom.getOperatorKey();           
        }
        
        // first, compile inputs into MapReduce operators
        OperatorKey[] compiledInputs = new OperatorKey[logicalOpTable.get(logicalKey).getInputs().size()];
        
        for (int i = 0; i < logicalOpTable.get(logicalKey).getInputs().size(); i++)
            compiledInputs[i] = compile(logicalOpTable.get(logicalKey).getInputs().get(i),
                                        logicalOpTable,
                                        execEngine);
        
        // then, compile this operator; if possible, merge with previous MapReduce
        // operator rather than introducing a new one
        
        LogicalOperator lo = logicalOpTable.get(logicalKey);
        
        if (lo instanceof LOEval) {
            POMapreduce pom = ((POMapreduce)execEngine.getPhysicalOpTable().get(compiledInputs[0]))
                                .copy(nodeIdGenerator.getNextNodeId(logicalKey.getScope()));
// make a copy of the previous

        // More and more and more plan compilation here
{code}

> store one alias/logicalPlan twice leads to instantiation of StoreFunc as LoadFunc
> ---------------------------------------------------------------------------------
>
>                 Key: PIG-114
>                 URL: https://issues.apache.org/jira/browse/PIG-114
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.0.0
>            Reporter: Johannes Zillmann
>            Assignee: Pi Song
>             Fix For: 0.1.0
>
>         Attachments: PIG114_FixOptimize1.patch, PIG114_FixOptimize_Sample.patch, pigPatch-storeTwice-620665.patch
>
>
> Calling PigServer#store() twice for an alias results in following exception :
> {noformat}
> java.lang.RuntimeException: java.lang.ClassCastException: org.apache.pig.test.DummyStoreFunc
cannot be cast to org.apache.pig.LoadFunc
> 	at org.apache.pig.backend.local.executionengine.POLoad.<init>(POLoad.java:59)
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.doCompile(LocalExecutionEngine.java:167)
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.doCompile(LocalExecutionEngine.java:184)
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.doCompile(LocalExecutionEngine.java:184)
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.compile(LocalExecutionEngine.java:111)
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.compile(LocalExecutionEngine.java:90)
> 	at org.apache.pig.backend.local.executionengine.LocalExecutionEngine.compile(LocalExecutionEngine.java:1)
> 	at org.apache.pig.PigServer.store(PigServer.java:330)
> 	at org.apache.pig.PigServer.store(PigServer.java:317)
> 	at org.apache.pig.test.StoreTwiceTest.testIt(StoreTwiceTest.java:31)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:589)
> 	at junit.framework.TestCase.runTest(TestCase.java:164)
> 	at junit.framework.TestCase.runBare(TestCase.java:130)
> 	at junit.framework.TestResult$1.protect(TestResult.java:110)
> 	at junit.framework.TestResult.runProtected(TestResult.java:128)
> 	at junit.framework.TestResult.run(TestResult.java:113)
> 	at junit.framework.TestCase.run(TestCase.java:120)
> 	at junit.framework.TestSuite.runTest(TestSuite.java:228)
> 	at junit.framework.TestSuite.run(TestSuite.java:223)
> 	at org.junit.internal.runners.OldTestClassRunner.run(OldTestClassRunner.java:35)
> 	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:45)
> 	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
> 	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)
> Caused by: java.lang.ClassCastException: org.apache.pig.test.DummyStoreFunc cannot be
cast to org.apache.pig.LoadFunc
> 	at org.apache.pig.backend.local.executionengine.POLoad.<init>(POLoad.java:57)
> 	... 28 more
> {noformat}
> I will attach a patch with a test scenario for this. Basically the code is as follow:
> {noformat}PigServer pig = new PigServer(ExecType.LOCAL);
>         pig
>                 .registerQuery("A = LOAD 'test/org/apache/pig/test/StoreTwiceTest.java'
USING "
>                         + DummyLoadFunc.class.getName() + "();");
>         pig.registerQuery("B = FOREACH A GENERATE * ;");
>         File outputFile = new File("/tmp/testPigOutput");
>         outputFile.delete();
>         pig.store("A", outputFile.getAbsolutePath(), DummyStoreFunc.class
>                 .getName()
>                 + "()");
>         outputFile.delete();
>         pig.store("B", outputFile.getAbsolutePath(), DummyStoreFunc.class
>                 .getName()
>                 + "()");
>         outputFile.delete();
>         assertEquals(2, _storedTuples.size());
> {noformat}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message