hadoop-pig-dev mailing list archives

From "Olga Natkovich (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1102) Collect number of spills per job
Date Mon, 21 Dec 2009 21:27:18 GMT

    [ https://issues.apache.org/jira/browse/PIG-1102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12793369#action_12793369 ]

Olga Natkovich commented on PIG-1102:

A few questions/comments on the patch:

(1) I think the count should default to 0, not -1.
(2) Does the increment of the count have to be combined with the warn statement? Does this mean
that users will see this many warnings? If so, should we combine this with the spill message we already print?
(3) I thought we discussed having the increment happen per buffer, not per record, and approximating
that based on the buffer size. I did not see the code that does this.
(4) I don't think you correctly separated the bags that proactively spill from the bags that are
spilled by the memory manager. All the bags created by DefaultBagFactory get registered with SpillableMemoryManager
and belong to the second category.
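
To illustrate points (1) and (3), here is a minimal sketch of a count that starts at 0 and is incremented once per buffer rather than once per record, with the number of buffers approximated from the buffer size. The class SpillCounter, its constructor parameters, and the average-record-size estimate are all hypothetical; none of them come from the patch or from Pig itself.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper, not part of the PIG-1102 patch.
 * Counts spills in units of buffers instead of records.
 */
public class SpillCounter {
    // Approximate number of records that fit in one spill buffer (assumption).
    private final long recordsPerBuffer;

    // Point (1): the count starts at 0, not -1.
    private final AtomicLong spills = new AtomicLong(0);

    public SpillCounter(long bufferSizeBytes, long avgRecordSizeBytes) {
        if (bufferSizeBytes <= 0 || avgRecordSizeBytes <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.recordsPerBuffer = Math.max(1, bufferSizeBytes / avgRecordSizeBytes);
    }

    /**
     * Point (3): called once per spill with the number of records written.
     * Adds the approximate number of buffers those records occupy and
     * returns that number.
     */
    public long recordSpill(long records) {
        if (records <= 0) {
            return 0;
        }
        // Ceiling division: a partially filled buffer still counts as one.
        long buffers = (records + recordsPerBuffer - 1) / recordsPerBuffer;
        spills.addAndGet(buffers);
        return buffers;
    }

    public long get() {
        return spills.get();
    }
}
```

With a sketch like this, any warning would be logged once per call to recordSpill rather than once per record, which also bears on question (2).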

> Collect number of spills per job
> --------------------------------
>                 Key: PIG-1102
>                 URL: https://issues.apache.org/jira/browse/PIG-1102
>             Project: Pig
>          Issue Type: Improvement
>            Reporter: Olga Natkovich
>            Assignee: Sriranjan Manjunath
>             Fix For: 0.7.0
>         Attachments: PIG_1102.patch
> Memory shortage is one of the main performance issues in Pig. Knowing when we spill to
> the disk is useful for understanding query performance and also for seeing how certain changes
> in Pig affect that.
> Other interesting stats to collect would be average CPU usage and max mem usage, but I
> am not sure if this information is easily retrievable.
> Using Hadoop counters for this would make sense.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
