pig-dev mailing list archives

From "Jonathan Coveney (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2632) Create a SchemaTuple which generates efficient Tuples via code gen
Date Tue, 10 Apr 2012 17:19:14 GMT

    [ https://issues.apache.org/jira/browse/PIG-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250837#comment-13250837
] 

Jonathan Coveney commented on PIG-2632:
---------------------------------------

I agree 100% that it's time to iron out kinks. Just trying to figure out what those kinks
are, really.

As far as Varint goes, I went ahead and just included the logic directly. It's 8 functions
of ~3 lines each, not worth the dependency.
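
For readers unfamiliar with the encoding, here is a minimal sketch of what inlined varint
helpers of this kind could look like. It is not the code from the PIG-2632 patches: the
class name VarintSketch and all method names are hypothetical, and it assumes the common
base-128 scheme with zigzag encoding for signed values. Only the int variants are shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper class; names are illustrative, not taken from the patch.
final class VarintSketch {
    private VarintSketch() {}

    // Emit 7 bits per byte, low bits first; the high bit marks "more bytes follow".
    static void writeUnsignedVarInt(int value, DataOutput out) throws IOException {
        while ((value & 0xFFFFFF80) != 0) {
            out.writeByte((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.writeByte(value & 0x7F);
    }

    // Reassemble the 7-bit groups; an int needs at most 5 bytes.
    static int readUnsignedVarInt(DataInput in) throws IOException {
        int value = 0;
        int shift = 0;
        int b;
        while (((b = in.readByte()) & 0x80) != 0) {
            value |= (b & 0x7F) << shift;
            shift += 7;
            if (shift > 28) {
                throw new IOException("varint is too long for an int");
            }
        }
        return value | (b << shift);
    }

    // Zigzag maps small negative numbers to small unsigned ones (-1 -> 1, 1 -> 2).
    static void writeSignedVarInt(int value, DataOutput out) throws IOException {
        writeUnsignedVarInt((value << 1) ^ (value >> 31), out);
    }

    static int readSignedVarInt(DataInput in) throws IOException {
        int raw = readUnsignedVarInt(in);
        return (raw >>> 1) ^ -(raw & 1);
    }

    // Convenience wrappers for experimenting without checked exceptions.
    static byte[] encodeSigned(int value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            writeSignedVarInt(value, new DataOutputStream(bytes));
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static int decodeSigned(byte[] encoded) {
        try {
            return readSignedVarInt(new DataInputStream(new ByteArrayInputStream(encoded)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this sketch, VarintSketch.encodeSigned(300) yields two bytes rather than the four a
plain writeInt would use, and values from -64 to 63 fit in one byte. Adding the long
variants would roughly double the method count, which fits the scale described above.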
                
> Create a SchemaTuple which generates efficient Tuples via code gen
> ------------------------------------------------------------------
>
>                 Key: PIG-2632
>                 URL: https://issues.apache.org/jira/browse/PIG-2632
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Jonathan Coveney
>            Assignee: Jonathan Coveney
>             Fix For: 0.11
>
>         Attachments: PIG-2632-0.patch, PIG-2632-1.patch, PIG-2632-3.patch
>
>
> This work builds on Dmitriy's PrimitiveTuple work. The idea is that, knowing the Schema
> on the frontend, we can code generate Tuples which can be used for fun and profit. In
> rudimentary tests, the memory efficiency is 2-4x better, and it's ~15% smaller serialized
> (this depends very heavily on the data, though). Need to do get/set tests, but assuming
> that it's on par with (or even faster than) Tuple, the memory gain is huge.
> Need to clean up the code and add tests.
> Right now, it generates a SchemaTuple for every inputSchema and outputSchema given to
> UDFs. The next step is to make a SchemaBag, where I think the serialization savings will
> be really huge.
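
To make the idea in the issue description concrete, here is a rough sketch of the kind of
class a code generator might emit for a two-field schema (id:int, count:long). It is an
illustration only, not output from the patch: the class name GeneratedTupleSketch, the
field names, and the method set are all hypothetical, and null handling is left out. The
point is that fields live in primitive slots instead of a list of boxed Objects.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical generated class for the schema (id:int, count:long).
final class GeneratedTupleSketch {
    private int id;      // field 0, stored unboxed
    private long count;  // field 1, stored unboxed

    int size() {
        return 2;
    }

    // Generic accessor: boxes on the way out, as a Tuple-style get would.
    Object get(int fieldNum) {
        switch (fieldNum) {
            case 0: return id;
            case 1: return count;
            default: throw new IndexOutOfBoundsException("no field " + fieldNum);
        }
    }

    // Generic mutator: the schema is known, so each slot casts to its own type.
    void set(int fieldNum, Object value) {
        switch (fieldNum) {
            case 0: id = (Integer) value; break;
            case 1: count = (Long) value; break;
            default: throw new IndexOutOfBoundsException("no field " + fieldNum);
        }
    }

    // No per-field type tags are written: the reader knows the schema too.
    void write(DataOutput out) throws IOException {
        out.writeInt(id);
        out.writeLong(count);
    }

    // Convenience wrapper for measuring the serialized size.
    byte[] toBytes() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            write(new DataOutputStream(bytes));
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In this sketch the serialized form is exactly 12 bytes (4 for the int, 8 for the long),
since no type information has to travel with the data. How the real SchemaTuple lays out
fields, handles nulls, or encodes integers is not specified in this message.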
