pig-dev mailing list archives

From "Cheolsoo Park (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (PIG-3288) Kill jobs if the number of output files is over a configurable limit
Date Mon, 22 Apr 2013 21:27:17 GMT

     [ https://issues.apache.org/jira/browse/PIG-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated PIG-3288:
-------------------------------

    Attachment: PIG-3288.patch

Attached is a patch that implements the following:
* I added a new property called {{pig.exec.hdfs.files.max.limit}}.
* When this property is enabled, MRLauncher periodically polls a counter ({{CREATED_FILES_COUTNER}}).
* Because the number of files a mapper/reducer creates is RecordWriter-specific, each storage
function is responsible for incrementing the counter properly. As a reference example, I added
code that increments the counter in {{PigLineRecordWriter}} for PigStorage.
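
The flow described in the bullets above can be sketched roughly as follows. This is a standalone simulation under stated assumptions, not the code in the patch: it uses no Hadoop or Pig classes, and every name in it ({{OutputFileLimitSketch}}, {{CreatedFilesCounter}}, {{CountingWriter}}, {{FakeJob}}, {{checkAndKill}}) is hypothetical. It also assumes that a non-positive limit means the check is disabled, which the patch may or may not do.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Standalone simulation of the counter-and-poll idea; all names are hypothetical. */
class OutputFileLimitSketch {

    /** Stand-in for a job-wide counter shared by all writers. */
    static final class CreatedFilesCounter {
        private final AtomicLong value = new AtomicLong();
        void increment(long delta) { value.addAndGet(delta); }
        long get() { return value.get(); }
    }

    /** Stand-in for a RecordWriter: it bumps the counter once per file it opens. */
    static final class CountingWriter {
        private final CreatedFilesCounter counter;
        CountingWriter(CreatedFilesCounter counter) { this.counter = counter; }
        void openNewFile() { counter.increment(1); }
    }

    /** Stand-in for a running job that the launcher is able to kill. */
    static final class FakeJob {
        private volatile boolean killed;
        void kill() { killed = true; }
        boolean isKilled() { return killed; }
    }

    /**
     * One polling step of the launcher: kill the job if the number of created
     * files has exceeded the limit. Assumption of this sketch: a limit that is
     * zero or negative disables the check.
     */
    static boolean checkAndKill(CreatedFilesCounter counter, long maxFiles, FakeJob job) {
        if (maxFiles > 0 && counter.get() > maxFiles) {
            job.kill();
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        CreatedFilesCounter counter = new CreatedFilesCounter();
        CountingWriter writer = new CountingWriter(counter);
        FakeJob job = new FakeJob();
        long maxFiles = 3; // illustrative value for the configured limit

        // The writer keeps creating files; the launcher polls after each one.
        for (int i = 0; i < 5 && !job.isKilled(); i++) {
            writer.openNewFile();
            if (checkAndKill(counter, maxFiles, job)) {
                System.out.println("Killed job after " + counter.get()
                        + " files (limit " + maxFiles + ")");
            }
        }
    }
}
```

In a real job the writers run in separate tasks and the launcher polls on a timer, so the job can overshoot the limit by however many files are created between two polls.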
                
> Kill jobs if the number of output files is over a configurable limit
> --------------------------------------------------------------------
>
>                 Key: PIG-3288
>                 URL: https://issues.apache.org/jira/browse/PIG-3288
>             Project: Pig
>          Issue Type: Wish
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 0.12
>
>         Attachments: PIG-3288.patch
>
>
> I ran into a situation where a Pig job tried to create too many files on HDFS and overloaded
> the NameNode (NN). To prevent such events, it would be nice if we could set an upper limit on
> the number of files that a Pig job can create.
> In fact, Hive has a property called "hive.exec.max.created.files". The idea is that each
> mapper/reducer increments a counter every time it creates a file. Then, MRLauncher periodically
> checks whether the number of files created so far has exceeded the upper limit. If so, we
> kill the running jobs and exit.

