pig-dev mailing list archives

From "Dmitriy V. Ryaboy (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PIG-2359) Support more efficient Tuples when schemas are known
Date Mon, 05 Dec 2011 01:15:40 GMT

    [ https://issues.apache.org/jira/browse/PIG-2359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162547#comment-13162547 ]

Dmitriy V. Ryaboy commented on PIG-2359:

Very rough (read: probably invalid) speed test: I modified SUM / LongSum to use a PLongTuple
and ran this script on excite-big.log from the tutorial:

l = load 'tutorial/data/excite-big.log' as (id:chararray, val:long, query:chararray);
x = foreach (group l all) generate SUM(l.val);

store x into '/tmp/foo';

Before optimization:

real	0m14.785s
user	0m22.516s
sys	0m1.203s

real	0m15.323s
user	0m22.605s
sys	0m1.182s

real	0m14.841s
user	0m22.600s
sys	0m1.176s

After optimization:

real	0m14.347s
user	0m20.442s
sys	0m1.095s

real	0m14.344s
user	0m20.241s
sys	0m1.064s

real	0m14.577s
user	0m20.671s
sys	0m1.087s
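
To put rough numbers on the runs above, here is a small sketch that averages them. It assumes the first three runs are the baseline and the last three use the PLongTuple change; that grouping is inferred from the drop in user time, not stated in the log. The class and method names are made up for this example.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper for summarizing the timings quoted in this message.
class TimingSummary {
    // Seconds copied from the `time` output above.
    // Assumption: runs 1-3 are before the change, runs 4-6 are after it.
    static final double[] REAL_BEFORE = {14.785, 15.323, 14.841};
    static final double[] REAL_AFTER  = {14.347, 14.344, 14.577};
    static final double[] USER_BEFORE = {22.516, 22.605, 22.600};
    static final double[] USER_AFTER  = {20.442, 20.241, 20.671};

    // Arithmetic mean of the samples; NaN for an empty array.
    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    // Relative reduction from `before` to `after`, in percent.
    static double percentDrop(double before, double after) {
        return 100.0 * (before - after) / before;
    }

    static void report(String label, double[] before, double[] after) {
        double b = mean(before);
        double a = mean(after);
        System.out.printf(Locale.ROOT, "%s: %.2fs -> %.2fs (%.1f%% lower)%n",
                label, b, a, percentDrop(b, a));
    }

    public static void main(String[] args) {
        report("real", REAL_BEFORE, REAL_AFTER);
        report("user", USER_BEFORE, USER_AFTER);
    }
}
```

Under that assumption the mean wall-clock time drops by roughly 3.7% and the mean user CPU time by roughly 9.4%. With three samples per side on a single machine, treat this as an indication at best.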
> Support more efficient Tuples when schemas are known
> ----------------------------------------------------
>                 Key: PIG-2359
>                 URL: https://issues.apache.org/jira/browse/PIG-2359
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: PIG-2359.1.patch, PIG-2359.2.patch
> Pig Tuples have significant overhead due to the fact that all the fields are Objects.
> When a Tuple only contains primitive fields (ints, longs, etc), it's possible to avoid
> this overhead, which would result in significant memory savings.
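
The overhead described in the issue can be illustrated with a minimal sketch. This is not Pig's Tuple interface and not the PLongTuple from the attached patches; BoxedTuple, PrimitiveLongTuple, and TupleSketch are hypothetical names used only to contrast Object fields with primitive fields.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: hypothetical classes, not Pig's API or the patch's code.
class TupleSketch {

    // Generic tuple: every field is an Object, so each long is stored as a
    // separate boxed Long on the heap and reached through a reference.
    static final class BoxedTuple {
        private final List<Object> fields = new ArrayList<>();

        void append(Object value) { fields.add(value); }
        Object get(int index) { return fields.get(index); }
        int size() { return fields.size(); }
    }

    // Schema-aware tuple: the fields are known to be longs, so they live in
    // one long[] with no per-field wrapper object.
    static final class PrimitiveLongTuple {
        private final long[] fields;

        PrimitiveLongTuple(int size) { fields = new long[size]; }
        void setLong(int index, long value) { fields[index] = value; }
        long getLong(int index) { return fields[index]; }
        int size() { return fields.length; }
    }

    static long sumBoxed(BoxedTuple tuple) {
        long total = 0;
        for (int i = 0; i < tuple.size(); i++) {
            total += (Long) tuple.get(i); // cast and unbox on every access
        }
        return total;
    }

    static long sumPrimitive(PrimitiveLongTuple tuple) {
        long total = 0;
        for (int i = 0; i < tuple.size(); i++) {
            total += tuple.getLong(i); // plain array read, no boxing
        }
        return total;
    }

    public static void main(String[] args) {
        BoxedTuple boxed = new BoxedTuple();
        PrimitiveLongTuple primitive = new PrimitiveLongTuple(3);
        for (int i = 0; i < 3; i++) {
            boxed.append(Long.valueOf(i + 1)); // stored as a boxed Long
            primitive.setLong(i, i + 1);       // stored as a raw long
        }
        System.out.println(sumBoxed(boxed) == sumPrimitive(primitive));
    }
}
```

Both variants compute the same sum; the difference is that the primitive version holds its values in a single array instead of one wrapper object per field.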


