hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1501) need to investigate the impact of compression on pig performance
Date Wed, 01 Sep 2010 00:30:53 GMT

    [ https://issues.apache.org/jira/browse/PIG-1501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12904848#action_12904848 ]

Olga Natkovich commented on PIG-1501:
-------------------------------------

Ashutosh,

The reason it is off by default is that the default compression is gzip, which is really
slow and most of the time not what you want. Because of the licensing issue with lzo, users
need to set it up on their own. Once they have done the setup, they can enable the compression.

> need to investigate the impact of compression on pig performance
> ----------------------------------------------------------------
>
>                 Key: PIG-1501
>                 URL: https://issues.apache.org/jira/browse/PIG-1501
>             Project: Pig
>          Issue Type: Test
>            Reporter: Olga Natkovich
>            Assignee: Yan Zhou
>             Fix For: 0.8.0
>
>         Attachments: compress_perf_data.txt, compress_perf_data_2.txt, PIG-1501.patch, PIG-1501.patch, PIG-1501.patch
>
>
> We would like to understand how compressing map results as well as reducer output
> in a chain of MR jobs impacts performance. We can use PigMix queries for this investigation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

