hive-user mailing list archives

From Mich Talebzadeh
Subject Spark Streaming, Batch interval, Windows length and Sliding Interval settings
Date Wed, 04 May 2016 20:45:51 GMT

Just wanted opinions on this.

In Spark Streaming, the parameter n in

val ssc = new StreamingContext(sparkConf, Seconds(n))

defines the batch (or sample) interval for the incoming stream.

In addition there is the window length:

// window length - The duration of the window, which must be a multiple of
// the batch interval n in StreamingContext(sparkConf, Seconds(n))

val windowLength = L

And finally the sliding interval:
// sliding interval - The interval at which the window operation is performed

val slidingInterval = I

OK, so as given, the windowLength L must be a multiple of n, and the
slidingInterval I has to be consistent with it to ensure that we can capture
the head and tail of the window.
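
To make the constraint concrete, here is a minimal sketch in plain Scala. The
helper name isValidWindowing is hypothetical and not part of Spark. It assumes
that "consistent" means the sliding interval, like the window length, should be
a whole multiple of the batch interval, which is how the Spark Streaming
programming guide describes the two window parameters. All values are in
seconds.

```scala
// Hypothetical helper, not part of the Spark API: checks that the window
// length and the sliding interval are positive whole multiples of the
// batch interval (all values in seconds).
object WindowCheck {
  def isValidWindowing(batchSecs: Int, windowSecs: Int, slideSecs: Int): Boolean =
    batchSecs > 0 &&
      windowSecs > 0 && windowSecs % batchSecs == 0 &&
      slideSecs > 0 && slideSecs % batchSecs == 0
}
```

For example, isValidWindowing(10, 30, 10) holds, while a window of 25 seconds
or a slide of 15 seconds over 10-second batches would be rejected.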

So as a heuristic, for a batch interval of, say, 10 seconds, I put the
window length at 3 times that = 30 seconds and make the
sliding interval = batch interval = 10 seconds.
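
The settings described here could be wired up roughly as below. This is only a
sketch under stated assumptions: the application name, the local[2] master and
the socket source on localhost:9999 are placeholders chosen for illustration,
and counting records per window stands in for whatever is actually being
measured.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object WindowSketch {
  def main(args: Array[String]): Unit = {
    // Assumed for illustration: app name and a local master with 2 threads
    val sparkConf = new SparkConf().setAppName("WindowSketch").setMaster("local[2]")

    val batchInterval   = Seconds(10) // n
    val windowLength    = Seconds(30) // L = 3 * n
    val slidingInterval = Seconds(10) // I = n

    val ssc = new StreamingContext(sparkConf, batchInterval)

    // Assumed source: a text stream on localhost:9999
    val lines = ssc.socketTextStream("localhost", 9999)

    // Every 10 seconds, look at the last 30 seconds of data
    val windowed = lines.window(windowLength, slidingInterval)
    windowed.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}
```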

Obviously these are subjective, depending on what is being measured.
However, I believe having sliding interval = batch interval makes sense?
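
One way to reason about this choice is to simulate which batches land in each
window. The sketch below is plain Scala with no Spark involved; the object and
method names are hypothetical, and it only lists windows that are completely
filled, which is a simplification of what a running stream would emit at
start-up.

```scala
// Plain-Scala simulation (no Spark needed): lists which batches fall into
// each window for a given window length and sliding interval.
object WindowSimulation {
  // Returns, for each window evaluation, the end times (in seconds) of the
  // batches it covers. Only completely filled windows are returned.
  def windows(batchSecs: Int, windowSecs: Int, slideSecs: Int, totalSecs: Int): Seq[Seq[Int]] = {
    val batchEnds = batchSecs to totalSecs by batchSecs
    // A window is evaluated every slideSecs, starting once the first full window exists
    (windowSecs to totalSecs by slideSecs).map { windowEnd =>
      batchEnds.filter(t => t > windowEnd - windowSecs && t <= windowEnd)
    }
  }

  def main(args: Array[String]): Unit = {
    println(windows(10, 30, 10, 60)) // sliding interval = batch interval
    println(windows(10, 30, 30, 60)) // sliding interval = window length
  }
}
```

With a 10-second slide, a window is evaluated after every batch and each batch
can contribute to up to three consecutive windows. With a 30-second slide, the
windows do not overlap and each batch is counted once. Which behaviour is
preferable depends on what is being measured.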

Appreciate any views on this.


Dr Mich Talebzadeh
