pig-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dmitriy V. Ryaboy (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2643) Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString, etc
Date Tue, 10 Apr 2012 19:29:13 GMT

    [ https://issues.apache.org/jira/browse/PIG-2643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250987#comment-13250987
] 

Dmitriy V. Ryaboy commented on PIG-2643:
----------------------------------------

I'm a bit confused about the math keyword stuff. I meant just generically being able to invoke
(or generate code that wraps) methods on objects, the way one would initialize an object inside
a UDF constructor and then call methods on that method during UDF exec methods.
                
> Use bytecode generation to make a performance replacement for InvokeForLong, InvokeForString,
etc
> -------------------------------------------------------------------------------------------------
>
>                 Key: PIG-2643
>                 URL: https://issues.apache.org/jira/browse/PIG-2643
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jonathan Coveney
>            Assignee: Jonathan Coveney
>            Priority: Minor
>              Labels: codegen
>             Fix For: 0.11, 0.10.1
>
>         Attachments: PIG-2643-0.patch
>
>
> This is basically to cut my teeth for much more ambitious code generation down the line,
but I think it could be performance and useful.
> the new syntax is:
> {code}a = load 'thing' as (x:chararray);
> define concat InvokerGenerator('java.lang.String','concat','String');
> define valueOf InvokerGenerator('java.lang.Integer','valueOf','String');
> define valueOfRadix InvokerGenerator('java.lang.Integer','valueOf','String,int');
> b = foreach a generate x, valueOf(x) as vOf;
> c = foreach b generate x, vOf, valueOfRadix(x, 16) as vOfR;
> d = foreach c generate x, vOf, vOfR, concat(concat(x, (chararray)vOf), (chararray)vOfR);
> dump d;
> {code}
> There are some differences between this version and Dmitriy's implementation:
> - it is no longer necessary to declare whether the method is static or not. This is gleaned
via reflection.
> - as per the above, it is no longer necessary to make the first argument be the type
of the object to invoke the method on. If it is not a static method, then the type will implicitly
be the type you need. So in the case of concat, it would need to be passed a tuple of two
inputs: one for the method to be called against (as it is not static), and then the 'string'
that was specified. In the case of valueOf, because it IS static, then the 'String' is the
only value.
> - The arguments are type sensitive. Integer means the Object Integer, whereas int (or
long, or float, or boolean, etc) refer to the primitive. This is necessary to properly reflect
the arguments. Values passed in WILL, however, be properly unboxed as necessary.
> - The return type will be reflected.
> This uses the ASM API to generate the bytecode, and then a custom classloader to load
it in. I will add caching of the generated code based on the input strings, etc, but I wanted
to get eyes and opinions on this. I also need to benchmark, but it should be native speed
(excluding a little startup time to make the bytecode, but ASM is really fast).
> Another nice benefit is that this bypasses the need for the JDK, though it adds a dependency
on ASM (which is a super tiny dependency).
> Patch incoming.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message