hadoop-mapreduce-issues mailing list archives

From "Mahadev konar (JIRA)" <j...@apache.org>
Subject [jira] Updated: (MAPREDUCE-1943) Implement limits on per-job JobConf, Counters, StatusReport, Split-Sizes
Date Wed, 14 Jul 2010 22:25:50 GMT

     [ https://issues.apache.org/jira/browse/MAPREDUCE-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mahadev konar updated MAPREDUCE-1943:

    Attachment: MAPREDUCE-1521-0.20-yahoo.patch

This patch imposes the following limits:

1) The number of counters per group is limited to 40. Counters beyond that limit
are dropped silently.
2) The number of counter groups is limited to 40. Again, groups beyond the limit
are dropped silently.
3) The length of a counter name is limited to 64 characters.
4) The length of a group name is limited to 128 characters.
5) The number of block locations returned by a split is limited to 100. This can be changed
with a configuration parameter.
6) The length of the reporter.setStatus() string is limited to 512 characters.
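
To make the limits above concrete, here is a minimal standalone sketch of how they could be enforced. It is not the code in the attached patch: the class JobLimitsSketch and all of its method names are hypothetical, and it assumes that over-long names and status strings are truncated (the message only says they are "restricted"). The configuration key for the block-location limit is not given in the message, so the limit is passed in as a plain argument.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the limits listed above; not the patch itself.
class JobLimitsSketch {
    static final int MAX_COUNTERS_PER_GROUP = 40;
    static final int MAX_GROUPS = 40;
    static final int MAX_COUNTER_NAME_LEN = 64;
    static final int MAX_GROUP_NAME_LEN = 128;
    static final int MAX_BLOCK_LOCATIONS = 100; // default; configurable per the message
    static final int MAX_STATUS_LEN = 512;

    // group name -> (counter name -> value)
    private final Map<String, Map<String, Long>> groups = new LinkedHashMap<>();

    // Assumption: strings over the limit are truncated rather than rejected.
    static String truncate(String s, int max) {
        return s.length() <= max ? s : s.substring(0, max);
    }

    // Returns true if the increment was recorded, false if it was silently dropped.
    boolean increment(String group, String counter, long amount) {
        String g = truncate(group, MAX_GROUP_NAME_LEN);
        String c = truncate(counter, MAX_COUNTER_NAME_LEN);
        Map<String, Long> counters = groups.get(g);
        if (counters == null) {
            if (groups.size() >= MAX_GROUPS) {
                return false; // too many groups: drop silently
            }
            counters = new LinkedHashMap<>();
            groups.put(g, counters);
        }
        if (!counters.containsKey(c) && counters.size() >= MAX_COUNTERS_PER_GROUP) {
            return false; // too many counters in this group: drop silently
        }
        counters.merge(c, amount, Long::sum);
        return true;
    }

    int groupCount() {
        return groups.size();
    }

    int counterCount(String group) {
        Map<String, Long> counters = groups.get(truncate(group, MAX_GROUP_NAME_LEN));
        return counters == null ? 0 : counters.size();
    }

    // Caps a task status string at 512 characters.
    static String limitStatus(String status) {
        return truncate(status, MAX_STATUS_LEN);
    }

    // Keeps at most `max` block locations for a split.
    static String[] limitLocations(String[] hosts, int max) {
        return hosts.length <= max ? hosts : Arrays.copyOf(hosts, max);
    }

    public static void main(String[] args) {
        JobLimitsSketch limits = new JobLimitsSketch();
        for (int i = 0; i < 1000; i++) {
            limits.increment("my-group", "counter-" + i, 1L);
        }
        System.out.println("counters kept: " + limits.counterCount("my-group"));
    }
}
```

Dropping excess counters at the point where they are created keeps memory use bounded per job, which is the concern that motivated this issue.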

I haven't added tests yet; I will upload some shortly. Also, this patch is for the Yahoo 0.20 branch.
I will upload one for trunk shortly.

> Implement limits on per-job JobConf, Counters, StatusReport, Split-Sizes
> ------------------------------------------------------------------------
>                 Key: MAPREDUCE-1943
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1943
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Mahadev konar
>            Assignee: Mahadev konar
>         Attachments: MAPREDUCE-1521-0.20-yahoo.patch
> We have come across issues in production clusters wherein users abuse counters, status report
> messages, and split sizes. In one such case, a user had 100 million counters.
> This leads to the JobTracker running out of memory and becoming unresponsive. In this JIRA I propose
> to put sane limits on the status report length, the number of counters, and the number of block
> locations returned by the input split.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
