crunch-user mailing list archives

From Josh Wills
Subject Re: Question about Spark Job/Stage names
Date Tue, 29 Sep 2015 14:12:01 GMT
Hey Nithin,

I checked around about this-- apparently the stage name is hard-coded to be
the call-site of the code block that triggered the stage:

Right now, we pass the DoFn names to the RDDs we create via
RDD.setName, but that only sets the RDD name; it has no effect on the stage name.
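
To make the call-site behavior concrete, here is a toy model in plain Java. It is not Spark or Crunch code: CallSiteDemo, labelFor, setCallSite and clearCallSite are hypothetical names made up for this sketch. It assumes a label that defaults to "<operation> at <file>:<line>" of the caller and that a per-thread override replaces it until cleared, which is my understanding of how sparkContext.setCallSite(String) behaves. It shows why a single override gives every stage the same text, and how setting and clearing it around each operation would give per-DoFn labels. Whether Crunch's Spark runtime could do this per DoFn is untested.

```java
import java.util.ArrayList;
import java.util.List;

public class CallSiteDemo {
    // Hypothetical per-thread override, standing in for a user-supplied call site.
    private static final ThreadLocal<String> OVERRIDE = new ThreadLocal<>();

    public static void setCallSite(String label) {
        OVERRIDE.set(label);
    }

    public static void clearCallSite() {
        OVERRIDE.remove();
    }

    // Returns the override if one is set; otherwise builds
    // "<operation> at <file>:<line>" from the caller's stack frame.
    public static String labelFor(String operation) {
        String override = OVERRIDE.get();
        if (override != null) {
            return override;
        }
        // Frame 0 is labelFor itself, frame 1 is whoever called it.
        StackTraceElement caller = new Throwable().getStackTrace()[1];
        return operation + " at " + caller.getFileName() + ":" + caller.getLineNumber();
    }

    public static void main(String[] args) {
        List<String> stages = new ArrayList<>();

        // Default: the label comes from the call site.
        stages.add(labelFor("mapToPair"));

        // One override set up front: every later label is the same text.
        setCallSite("my-pipeline");
        stages.add(labelFor("mapToPair"));
        stages.add(labelFor("reduceByKey"));
        clearCallSite();

        // Override set and cleared around each operation: one label per DoFn.
        for (String doFnName : new String[] {"ParseFn", "CountFn"}) {
            setCallSite(doFnName);
            try {
                stages.add(labelFor("mapToPair"));
            } finally {
                clearCallSite();
            }
        }

        stages.forEach(System.out::println);
    }
}
```

The try/finally matters in the last loop: if the override is not cleared, it leaks into every operation that follows on the same thread.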


On Mon, Sep 28, 2015 at 5:46 PM, Nithin Asokan wrote:

> I'm fairly new to Spark, and would like to understand stage/job
> names when using Crunch on Spark. When I submit my Spark application, I see
> a set of stage names like *mapToPair at *. I
> would like to understand whether it is possible for user code to update these stage
> names dynamically. Perhaps it is possible to have DoFn names as stage
> names?
> I did a little bit of digging and the closest thing I can find to modify
> stage name is using
> sparkContext.setCallSite(String)
> However, this updates all stage and job names to the same text. I tried
> looking at MRPipeline's implementation to understand how job names are
> built, and I believe that for SparkPipeline, Crunch does not create a DAG and
> does not create a job name.
> But does anyone with Spark expertise know if it's possible in Crunch to
> create job/stage names based on DoFn names?
> Thank you!
> Nithin

Director of Data Science
Cloudera
Twitter: @josh_wills
