pig-dev mailing list archives

From "Rohini Palaniswamy (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2433) Jython import module not working if module path is in classpath
Date Mon, 07 Jan 2013 02:16:13 GMT

    [ https://issues.apache.org/jira/browse/PIG-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545583#comment-13545583 ]

Rohini Palaniswamy commented on PIG-2433:

Based on the stack trace, I suspect that the following code is what fails for you.
However, the attached log does not show any error, and the code comment says it will fail silently.

    // attempt addition of schema decorator handler, fail silently
    interpreter.exec("def outputSchema(schema_def):\n"
            + "    def decorator(func):\n"
            + "        func.outputSchema = schema_def\n"
            + "        return func\n"
            + "    return decorator\n\n");
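
For reference, the Python source that the interpreter.exec call above passes to Jython can be written out on its own as below. This is only a plain-Python sketch of what the decorator does, run outside Pig; the square function and the printed values are illustrative and are not part of JythonScriptEngine.

```python
def outputSchema(schema_def):
    """Return a decorator that attaches a Pig schema string to a function."""
    def decorator(func):
        # The schema is stored as an attribute on the function object.
        func.outputSchema = schema_def
        return func
    return decorator


# Illustrative UDF, modelled on the square() function in the issue description.
@outputSchema("x:{t:(num:long)}")
def square(number):
    return number * number


print(square.outputSchema)
print(square(4))
```

If this definition fails to execute, the name outputSchema is never defined, so a script that uses @outputSchema would then hit an error at its first decorated function.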

The test ran fine for me on Mac and RHEL 5. I will try to reproduce the failure. Can you add
org.python.core.Options.verbose = Py.DEBUG; to the static block of JythonScriptEngine and
see whether that gives you any additional error messages?

> Jython import module not working if module path is in classpath
> ---------------------------------------------------------------
>                 Key: PIG-2433
>                 URL: https://issues.apache.org/jira/browse/PIG-2433
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.10.0
>            Reporter: Daniel Dai
>            Assignee: Rohini Palaniswamy
>             Fix For: 0.12
>         Attachments: PIG-2433.patch, TEST-org.apache.pig.test.TestScriptUDF.txt
> This is a gap left by PIG-1824. If the path of a Python module is on the classpath, the job dies with
> the message: could not instantiate 'org.apache.pig.scripting.jython.JythonFunction'.
> Here is my observation:
> If the path of the Python module is on the classpath, the fileEntry we get at JythonScriptEngine:236
> is __pyclasspath__/script$py.class instead of the script itself. Thus we cannot locate the
> script, and it is skipped in job.xml.
> For example:
> {code}
> register 'scriptB.py' using org.apache.pig.scripting.jython.JythonScriptEngine as pig;
> A = LOAD 'table_testPythonNestedImport' as (a0:long, a1:long);
> B = foreach A generate pig.square(a0);
> dump B;
> scriptB.py:
> #!/usr/bin/python
> import scriptA
> @outputSchema("x:{t:(num:double)}")
> def sqrt(number):
>  return (number ** .5)
> @outputSchema("x:{t:(num:long)}")
> def square(number):
>  return long(scriptA.square(number))
> scriptA.py:
> #!/usr/bin/python
> def square(number):
>  return (number * number)
> {code}
> When we register scriptB.py, we use the Jython library to figure out which modules
> scriptB depends on, in this case scriptA. However, if the current directory is on the classpath, instead
> of scriptA.py we get __pyclasspath__/scriptA.class. Then, when we try to put __pyclasspath__/script$py.class
> into job.jar, Pig complains that __pyclasspath__/script$py.class does not exist.
> This is exactly what TestScriptUDF.testPythonNestedImport is doing. In Hadoop 20.x, the test
> still succeeds because MiniCluster picks up the local classpath, so it can still find scriptA.py
> even if it is not in job.jar. However, the script will fail on a real cluster and on the MiniMRYarnCluster
> of Hadoop 23.
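
To make the path mismatch described above concrete, here is a minimal Python sketch of mapping a classpath-style entry back to a source file name. It assumes only the naming pattern quoted in the description (a `__pyclasspath__/` prefix and a `$py.class` suffix). The helper source_name_for is hypothetical and is not taken from the attached patch or from JythonScriptEngine.

```python
# Naming pattern quoted in the issue description.
PYCLASSPATH_PREFIX = "__pyclasspath__/"
COMPILED_SUFFIX = "$py.class"


def source_name_for(file_entry):
    """Hypothetical helper: turn '__pyclasspath__/mod$py.class' into 'mod.py'.

    Entries that do not match the pattern are returned unchanged.
    """
    if file_entry.startswith(PYCLASSPATH_PREFIX) and file_entry.endswith(COMPILED_SUFFIX):
        module_path = file_entry[len(PYCLASSPATH_PREFIX):-len(COMPILED_SUFFIX)]
        return module_path + ".py"
    return file_entry


print(source_name_for("__pyclasspath__/scriptA$py.class"))
print(source_name_for("scriptA.py"))
```

The returned name is relative, so a real fix would still have to resolve it against the classpath entries to find the actual file.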
