spark-dev mailing list archives

From Egor Pahomov
Subject Re: Get size of intermediate results
Date Fri, 21 Oct 2016 01:18:50 GMT
I needed the same thing for debugging, so I just added a "count" action,
enabled in debug mode, after every step I was interested in. It's very
time-consuming, but I don't debug very often.
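
A minimal sketch of that approach, in case it helps. `debug_count` is a hypothetical helper name, not a Spark API; it only assumes the object passed in has a `count()` method, as RDDs and DataFrames do. Note that `count()` reports the number of records, not the size in bytes, and that each call runs a full job over that step's lineage unless the step is cached.

```python
import logging
import time

log = logging.getLogger("debug_count")


def debug_count(name, dataset, enabled=False):
    """Return `dataset` unchanged; when `enabled`, force it with count() and log the result.

    `dataset` is assumed to be anything exposing count(), e.g. a Spark RDD or DataFrame.
    """
    if enabled:
        start = time.time()
        n = dataset.count()  # an action: triggers evaluation of this step
        elapsed = time.time() - start
        log.warning("%s: %d records (%.1fs)", name, n, elapsed)
    return dataset
```

Used as `step = debug_count("after join", step, enabled=DEBUG)` after each transformation of interest, where `DEBUG` is whatever flag your job already has; with the flag off, the helper adds no extra jobs.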

2016-10-20 2:17 GMT-07:00 Andreas Hechenberger:

> Hey awesome Spark devs :)
> I am new to Spark and have read a lot, but now I am stuck :( so please be
> kind if I ask silly questions.
> I want to analyze some algorithms and strategies in Spark, and for one
> experiment I want to know the size of the intermediate results between
> iterations/jobs. Some of them are written to disk and some are in the
> cache, I guess. I am not afraid of looking into the code (I already did),
> but it's complex and I have no clue where to start :( It would be nice if
> someone could point me in the right direction or tell me where I can find
> more information about the structure of Spark core development :)
> I have already set up the development environment and I can compile Spark.
> It was really awesome how smoothly the setup went :) Thanks for that.
> Servus
> Andy


*Sincerely yours, Egor Pakhomov*
