hadoop-pig-dev mailing list archives

From "Dmitriy V. Ryaboy (JIRA)" <j...@apache.org>
Subject [jira] Commented: (PIG-1427) Monitor and kill runaway UDFs
Date Fri, 28 May 2010 15:21:37 GMT

    [ https://issues.apache.org/jira/browse/PIG-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873008#action_12873008 ]

Dmitriy V. Ryaboy commented on PIG-1427:
----------------------------------------

Uhm, so my timing code had a bug in it -- I didn't actually monitor the monitored UDF. Hence
the earlier numbers indicating that there was no overhead at all.

Here are the real numbers, in milliseconds (recall that the function itself doesn't really do
any work, so we are seeing fairly pure timing of the overhead here):

{noformat}
Warming up.
Warmed up. Timing.
Reps: 1000 monitored: 20 unmonitored: 0
Reps: 10000 monitored: 242 unmonitored: 1
Reps: 100000 monitored: 1966 unmonitored: 5
Reps: 1000000 monitored: 19773 unmonitored: 59
Reps: 1000 monitored: 16 unmonitored: 0
Reps: 10000 monitored: 217 unmonitored: 0
Reps: 100000 monitored: 1924 unmonitored: 5
Reps: 1000000 monitored: 19139 unmonitored: 57
Reps: 1000 monitored: 27 unmonitored: 0
Reps: 10000 monitored: 248 unmonitored: 0
Reps: 100000 monitored: 2104 unmonitored: 6
Reps: 1000000 monitored: 18756 unmonitored: 57
Reps: 1000 monitored: 17 unmonitored: 1
Reps: 10000 monitored: 211 unmonitored: 1
Reps: 100000 monitored: 1940 unmonitored: 5
Reps: 1000000 monitored: 18619 unmonitored: 59
Reps: 1000 monitored: 21 unmonitored: 0
Reps: 10000 monitored: 230 unmonitored: 1
Reps: 100000 monitored: 2088 unmonitored: 7
Reps: 1000000 monitored: 18526 unmonitored: 57
{noformat}

As you can see, the cost is fairly consistent -- about 2 milliseconds per 100 invocations, or to
put it another way, about 20 seconds per million invocations.
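
For reference, the per-call figure can be worked out from any row of the run above. This is only a minimal sketch: the class and method names are made up for illustration, and it assumes the reported times are total milliseconds for the given number of reps.

```java
public class OverheadEstimate {
    /**
     * Overhead per call in microseconds, given the total monitored and
     * unmonitored times in milliseconds for the same number of reps.
     */
    public static double perCallMicros(long reps, long monitoredMs, long unmonitoredMs) {
        // Subtract the unmonitored baseline, convert ms to microseconds, divide by reps.
        return (monitoredMs - unmonitoredMs) * 1000.0 / reps;
    }

    public static void main(String[] args) {
        // Values copied from the first 1,000,000-rep row of the run above.
        System.out.println(perCallMicros(1_000_000, 19773, 59));
    }
}
```

For that row the result is just under 20 microseconds per call, which lines up with the 20 seconds per million invocations quoted above.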

> Monitor and kill runaway UDFs
> -----------------------------
>
>                 Key: PIG-1427
>                 URL: https://issues.apache.org/jira/browse/PIG-1427
>             Project: Pig
>          Issue Type: New Feature
>    Affects Versions: 0.8.0
>            Reporter: Dmitriy V. Ryaboy
>            Assignee: Dmitriy V. Ryaboy
>         Attachments: monitoredUdf.patch, monitoredUdf.patch
>
>
> As a safety measure, it is sometimes useful to monitor UDFs as they execute. It is often
> preferable to return null or some other default value instead of timing out a runaway evaluation
> and killing a job. We have in the past seen complex regular expressions lead to job failures
> due to just half a dozen (out of millions) particularly obnoxious strings.
> It would be great to give Pig users a lightweight way of enabling UDF monitoring.
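
To make the idea in the description concrete, here is a rough sketch of running an evaluation under a time limit and falling back to a default value. It is not the code in monitoredUdf.patch; the class name, method name, and parameters are hypothetical, and it uses only the standard java.util.concurrent classes.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeLimitedCall {
    // Daemon threads, so a stuck evaluation cannot keep the JVM alive.
    private static final ExecutorService POOL = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r, "monitored-call");
        t.setDaemon(true);
        return t;
    });

    /** Runs work on another thread; returns defaultValue if it exceeds timeoutMs. */
    public static <T> T callWithTimeout(Callable<T> work, long timeoutMs, T defaultValue) {
        Future<T> future = POOL.submit(work);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; this only stops work that responds to interruption.
            future.cancel(true);
            return defaultValue;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return defaultValue;
        } catch (ExecutionException e) {
            // The work itself failed; surface the original cause.
            throw new RuntimeException(e.getCause());
        }
    }

    public static void main(String[] args) {
        String fast = callWithTimeout(() -> "ok", 1000, null);
        String slow = callWithTimeout(() -> { Thread.sleep(5000); return "late"; }, 50, null);
        System.out.println(fast + " " + slow);
    }
}
```

One caveat with this approach: a computation that never checks for interruption, such as a pathological regular expression match, keeps running on its worker thread after the caller has moved on with the default value.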

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

