hadoop-pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] Updated: (PIG-1427) Monitor and kill runaway UDFs
Date Wed, 26 May 2010 07:40:33 GMT

     [ https://issues.apache.org/jira/browse/PIG-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1427:
-----------------------------------

    Attachment: monitoredUdf.patch

The attached patch is a basic sketch of the proposed implementation. It uses the Guava library
( http://code.google.com/p/guava-libraries/ ). I tested with r03, but r04 is now out
and may be preferable. The final patch will include the appropriate Ivy changes, as well
as the Apache license headers and other niceties.

The idea is to create a @MonitoredUDF annotation that a UDF author can add to an EvalFunc.
If the annotation is present on the EvalFunc, its evaluation is wrapped in a Java Future,
executed in a separate thread, and monitored with a timeout.
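
To make the mechanism concrete, here is a minimal sketch of the Future-plus-timeout idea. It is
not the code in the attached patch: it uses only java.util.concurrent rather than Guava, and
the class name TimeoutSketch and the helper callWithTimeout are made up for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; names are not taken from the patch.
public class TimeoutSketch {

    // Runs the task on a separate thread and waits at most `timeout`.
    // Returns defaultValue if the task does not finish in time.
    public static <T> T callWithTimeout(Callable<T> task, long timeout,
                                        TimeUnit unit, T defaultValue) {
        // Daemon thread so a stuck task cannot keep the JVM alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true);               // interrupt the runaway task
            return defaultValue;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            future.cancel(true);
            return defaultValue;
        } catch (ExecutionException e) {
            // The task itself failed; surface the underlying cause.
            throw new RuntimeException(e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        String fast = callWithTimeout(() -> "done", 1, TimeUnit.SECONDS, null);
        String slow = callWithTimeout(() -> {
            Thread.sleep(5_000);               // stands in for a runaway evaluation
            return "late";
        }, 50, TimeUnit.MILLISECONDS, null);
        System.out.println(fast);
        System.out.println(slow);
    }
}
```

Note that cancel(true) only interrupts the worker thread. Code that never checks the interrupt
flag, such as a long-running regular expression match, may keep running after the caller has
already received the default value.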

The most basic usage is possible even now -- just add @MonitoredUDF to the class definitions
of EvalFuncs you expect might time out, and try it. For ease of testing, the timeout interval
can be set at millisecond granularity.
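
As a rough illustration of how such an annotation could be declared and detected at runtime,
consider the sketch below. The annotation members (duration, timeUnit), their defaults, and the
stand-in UDF classes are assumptions made for this example; they are not taken from the patch,
and the stand-in classes do not extend Pig's EvalFunc so that the example stays self-contained.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; member names and defaults are assumed.
public class AnnotationSketch {

    // Must be retained at runtime so it can be found via reflection.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    public @interface MonitoredUDF {
        long duration() default 10;                    // assumed default
        TimeUnit timeUnit() default TimeUnit.SECONDS;  // assumed default
    }

    // Stand-in for a UDF that the author expects might time out.
    @MonitoredUDF(duration = 5, timeUnit = TimeUnit.MILLISECONDS)
    public static class SlowRegexUdf { }

    // Stand-in for a UDF that is left unmonitored.
    public static class PlainUdf { }

    // Returns the configured timeout in milliseconds, or -1 if the class
    // carries no @MonitoredUDF annotation.
    public static long timeoutMillis(Class<?> udfClass) {
        MonitoredUDF m = udfClass.getAnnotation(MonitoredUDF.class);
        return m == null ? -1 : m.timeUnit().toMillis(m.duration());
    }

    public static void main(String[] args) {
        System.out.println(timeoutMillis(SlowRegexUdf.class));
        System.out.println(timeoutMillis(PlainUdf.class));
    }
}
```

The runtime would perform a check like timeoutMillis once per UDF class and only wrap
evaluation in a monitored thread when the annotation is present, so unannotated UDFs pay no
extra cost.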

This is based heavily on Florian Leibert's implementation of the same concept.

Please take a look and comment.


> Monitor and kill runaway UDFs
> -----------------------------
>
>                 Key: PIG-1427
>                 URL: https://issues.apache.org/jira/browse/PIG-1427
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.8.0
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: monitoredUdf.patch
>
>
> As a safety measure, it is sometimes useful to monitor UDFs as they execute. It is often
> preferable to return null or some other default value instead of timing out a runaway evaluation
> and killing a job. We have in the past seen complex regular expressions lead to job failures
> due to just half a dozen (out of millions) particularly obnoxious strings.
> It would be great to give Pig users a lightweight way of enabling UDF monitoring.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

