hadoop-pig-dev mailing list archives

From "Thejas M Nair (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1447) Tune memory usage of InternalCachedBag
Date Thu, 19 Aug 2010 16:14:16 GMT

    [ https://issues.apache.org/jira/browse/PIG-1447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900332#action_12900332 ]

Thejas M Nair commented on PIG-1447:

bq. Did you see any perf improvement?
No, the query is the same and the performance is the same; only the number of records
reported earlier was incorrect. In fact, there was also a mistake in the calculation, which I have
fixed in the updated patch for PIG-1524.

I made further modifications to L15_modified.pig to use larger columns; the result is L15_modified2.pig
(attached). With this query, the number of records dumped is 17.5 million with 0.1f and 20
million with 0.2f for pig.cachedbag.memusage. The records are also much larger in size.
I see around 10% improvement with 0.2f.

Considering the issue in PIG-1544, and that multi-query optimized queries can have a large number
of bags, I think it is safer to leave the value at 10% for now. We can add documentation on
adjusting the value of this property so that users can tune it if they see a lot of records
being proactively spilled.
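
To illustrate why the number of bags matters when choosing this value, here is a rough
sketch. It assumes the configured fraction of the JVM heap is split evenly across all
active bags; that is an assumption made for illustration, not a description of the actual
InternalCachedBag code. The class name CachedBagBudget and the method perBagBytes are
hypothetical.

```java
public class CachedBagBudget {

    /**
     * Hypothetical helper: bytes a single bag may keep in memory before it
     * starts spilling, ASSUMING the heap fraction given by
     * pig.cachedbag.memusage is divided evenly among all active bags.
     */
    public static long perBagBytes(long maxHeapBytes, double memUsageFraction, int bagCount) {
        if (bagCount <= 0) {
            throw new IllegalArgumentException("bagCount must be positive");
        }
        if (memUsageFraction <= 0.0 || memUsageFraction > 1.0) {
            throw new IllegalArgumentException("memUsageFraction must be in (0, 1]");
        }
        return (long) (maxHeapBytes * memUsageFraction / bagCount);
    }

    public static void main(String[] args) {
        // Use the heap size of the JVM running this sketch as an example.
        long heap = Runtime.getRuntime().maxMemory();
        for (double fraction : new double[] {0.1, 0.2}) {
            for (int bags : new int[] {1, 10, 100}) {
                System.out.printf("fraction=%.1f bags=%d perBag=%d bytes%n",
                        fraction, bags, perBagBytes(heap, fraction, bags));
            }
        }
    }
}
```

Under that assumption, doubling the fraction doubles the in-memory budget of each bag, but
a query with many bags still leaves each one with a small share, so the total reserved
memory is the quantity to watch when raising the value.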

We should revisit this once PIG-1544 is fixed.

> Tune memory usage of InternalCachedBag
> --------------------------------------
>                 Key: PIG-1447
>                 URL: https://issues.apache.org/jira/browse/PIG-1447
>             Project: Pig
>          Issue Type: Improvement
>          Components: impl
>    Affects Versions: 0.7.0
>            Reporter: Daniel Dai
>            Assignee: Thejas M Nair
>             Fix For: 0.8.0
>         Attachments: L15_modified.pig, L15_modified2.pig
> We need to find a better value for "pig.cachedbag.memusage".
