spark-issues mailing list archives

From "Sachin Aggarwal (JIRA)" <>
Subject [jira] [Updated] (SPARK-14597) Streaming Listener timing metrics should include time spent in JobGenerator's graph.generateJobs
Date Mon, 02 May 2016 11:48:12 GMT


Sachin Aggarwal updated SPARK-14597:
    Attachment: withSortByKey.png

> Streaming Listener timing metrics should include time spent in JobGenerator's graph.generateJobs
> ------------------------------------------------------------------------------------------------
>                 Key: SPARK-14597
>                 URL:
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, Streaming
>    Affects Versions: 1.6.1, 2.0.0
>            Reporter: Sachin Aggarwal
>            Priority: Minor
>         Attachments: WithOutSortByKey.png, withSortByKey.png
> While tuning our streaming application, the piece of information we were looking for
was the actual processing time per batch. The StreamingListener.onBatchCompleted event provides
a BatchInfo object that carries this information. It provides the following data:
>  - processingDelay
>  - schedulingDelay
>  - totalDelay
>  - submissionTime
>  The above are essentially calculated from the streaming JobScheduler clocking the processingStartTime
and processingEndTime for each JobSet. Another available metric is submissionTime, which is
when a JobSet was put on the streaming scheduler's queue.
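
The relationship between these timestamps and the reported delays can be sketched as follows. This is a minimal stand-alone model, not Spark code: the `BatchTimestamps` class, the function names, and the sample values are hypothetical, and modelling totalDelay as the sum of the other two delays is an assumption of the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchTimestamps:
    """Hypothetical stand-in for the per-JobSet timestamps (milliseconds)."""
    submission_time: int        # JobSet put on the JobScheduler queue
    processing_start_time: int  # JobSet starts running
    processing_end_time: int    # JobSet finishes running


def scheduling_delay(t: BatchTimestamps) -> int:
    # Time the JobSet waited in the scheduler queue.
    return t.processing_start_time - t.submission_time


def processing_delay(t: BatchTimestamps) -> int:
    # Time spent running the JobSet's jobs.
    return t.processing_end_time - t.processing_start_time


def total_delay(t: BatchTimestamps) -> int:
    # Assumption of this sketch: total delay is queue wait plus run time.
    return scheduling_delay(t) + processing_delay(t)


# Illustrative values only, not measurements.
batch = BatchTimestamps(
    submission_time=1000,
    processing_start_time=1200,
    processing_end_time=2700,
)
print(scheduling_delay(batch), processing_delay(batch), total_delay(batch))
```

Note that nothing before submission_time enters any of these values, which is why time spent before the JobSet reaches the scheduler queue is invisible in them.
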
> So we took processingDelay as our actual processing time per batch. However, to maintain
a stable streaming application, we found that our batch interval had to be a little less
than DOUBLE the processingDelay metric reported. (We are using a DirectKafkaInputDStream.)
On digging further, we found that processingDelay only clocks the time spent in the foreachRDD
closure of the streaming application, and that JobGenerator's graph.generateJobs
method takes significantly more time.
>  Thus a true reflection of processing time is the sum of:
>  a - Time spent in JobGenerator's job queue (JobGeneratorQueueDelay)
>  b - Time spent in JobGenerator's graph.generateJobs (JobSetCreationDelay)
>  c - Time spent in the JobScheduler queue for a JobSet (existing schedulingDelay metric)
>  d - Time spent in the JobSet's job run (existing processingDelay metric)
>  Additionally, a JobGeneratorQueueDelay (#a) could be due either to graph.generateJobs
taking longer than batchInterval or to other JobGenerator events, such as checkpointing, adding up
time. Thus it would be beneficial to report the time taken by the checkpointing job as well.
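
The four-part breakdown proposed above (a to d) can be illustrated with a small stand-alone model. Everything here is hypothetical: the `BatchTimeline` class, the `batch_time` and `generate_jobs_start` timestamps, and the sample numbers are assumptions chosen for illustration, not fields that BatchInfo exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchTimeline:
    """Hypothetical per-batch timestamps (milliseconds)."""
    batch_time: int             # batch becomes due in the JobGenerator
    generate_jobs_start: int    # graph.generateJobs begins for this batch
    submission_time: int        # JobSet put on the JobScheduler queue
    processing_start_time: int  # JobSet starts running
    processing_end_time: int    # JobSet finishes running


def breakdown(t: BatchTimeline) -> dict:
    # One entry per component a, b, c, d from the description.
    return {
        "JobGeneratorQueueDelay": t.generate_jobs_start - t.batch_time,
        "JobSetCreationDelay": t.submission_time - t.generate_jobs_start,
        "schedulingDelay": t.processing_start_time - t.submission_time,
        "processingDelay": t.processing_end_time - t.processing_start_time,
    }


def true_processing_time(t: BatchTimeline) -> int:
    # Sum of all four components.
    return sum(breakdown(t).values())


# Illustrative values only: job generation takes about as long as the job run.
timeline = BatchTimeline(
    batch_time=0,
    generate_jobs_start=50,
    submission_time=950,
    processing_start_time=1000,
    processing_end_time=2000,
)
parts = breakdown(timeline)
print(parts)
print(true_processing_time(timeline))
```

With these made-up numbers processingDelay is 1000 ms while the four components add up to 2000 ms, which is the kind of gap that would force a batch interval of roughly double the reported processingDelay.
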
