hadoop-pig-dev mailing list archives

From "Stefan Groschupf (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-165) caching the byte array in DataAtom can improve performance
Date Thu, 20 Mar 2008 18:47:24 GMT

     [ https://issues.apache.org/jira/browse/PIG-165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Stefan Groschupf updated PIG-165:

    Attachment: PIG-165_r639015_v1.patch

This patch caches the byte array in the DataAtom. My performance tests show a 25% performance
improvement for reads and writes.
This is a big overall performance improvement for our application.
The higher memory usage is acceptable.
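
To illustrate the idea, here is a minimal sketch of byte-array caching, not the code from the attached patch. The class name CachedAtom and its methods are hypothetical stand-ins, and the UTF-8, length-prefixed wire format is an assumption made only for this example. The point is that a field which is read and then written unchanged never has to be decoded or re-encoded.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Objects;

// Hypothetical stand-in for a string-valued atom; NOT the real Pig DataAtom.
class CachedAtom {
    private String value; // decoded form, rebuilt lazily after a read
    private byte[] bytes; // cached encoded form, rebuilt lazily after a set

    CachedAtom() {
        this("");
    }

    CachedAtom(String value) {
        this.value = Objects.requireNonNull(value);
    }

    void setValue(String value) {
        this.value = Objects.requireNonNull(value);
        this.bytes = null; // the cached encoding is stale once the value changes
    }

    String getValue() {
        if (value == null) {
            // Decode only when a caller actually needs the string.
            value = new String(bytes, StandardCharsets.UTF_8);
        }
        return value;
    }

    // Assumed wire format: 4-byte length followed by the UTF-8 bytes.
    void write(DataOutput out) throws IOException {
        if (bytes == null) {
            bytes = value.getBytes(StandardCharsets.UTF_8);
        }
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    void readFields(DataInput in) throws IOException {
        int length = in.readInt();
        bytes = new byte[length];
        in.readFully(bytes);
        value = null; // keep the raw bytes; decode later only if asked
    }
}
```

In this sketch an atom that is read and written without getValue() or setValue() being called in between reuses the same byte array on output. The cost is that an atom can hold both the string and its encoded bytes at once, which matches the higher memory usage mentioned above.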

> caching the byte array in DataAtom can improve performance
> ----------------------------------------------------------
>                 Key: PIG-165
>                 URL: https://issues.apache.org/jira/browse/PIG-165
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Stefan Groschupf
>            Assignee: Stefan Groschupf
>            Priority: Critical
>         Attachments: PIG-165_r639015_v1.patch
> Many fields are passed through a processing step without changing their values, so Pig
> basically just reads and writes them.
> Read/write performance is therefore critical.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
