spark-user mailing list archives

From Steve Loughran <>
Subject Re: spark multi tenancy
Date Wed, 07 Oct 2015 09:34:05 GMT

> On 7 Oct 2015, at 09:26, Dominik Fries <> wrote:
> Hello Folks,
> We want to deploy several spark projects and want to use a unique project
> user for each of them. Only the project user should start the spark
> application and have the corresponding packages installed. 
> Furthermore a personal user, which belongs to a specific project, should
> start a spark application via the corresponding spark project user as proxy.
> (Development)
> The Application is currently running with ipython / pyspark. (HDP 2.3 -
> Spark 1.3.1)
> Is this possible or what is the best practice for a spark multi tenancy
> environment ?

Deploy on a kerberized YARN cluster. Each application instance then runs as a different
Unix user in the cluster, with the appropriate access to HDFS, so the applications are isolated.

The issue then becomes "do workloads clash with each other?". If you want to isolate dev and
production, using YARN node labels to keep dev work off the production nodes is the standard technique.
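
As a rough illustration of the two suggestions above, here is a minimal Python sketch that only
builds a spark-submit command line (it does not run it). The project user, queue, label and script
names are hypothetical. It assumes the submitting user holds a Kerberos ticket and that the cluster's
Hadoop proxy-user settings allow impersonating the project user. The node label property is, to my
knowledge, only honoured by Spark releases newer than the 1.3.1 mentioned in the question, so
treat that part as an assumption to check against your version.

```python
# Sketch only: assembles a spark-submit argument list, nothing is executed.
# All user, queue, label and file names passed in are hypothetical examples.

def build_submit_command(project_user, app_script, queue=None, node_label=None):
    """Return a spark-submit argument list that runs app_script as project_user."""
    cmd = [
        "spark-submit",
        "--master", "yarn-client",
        # Impersonate the project user; needs Hadoop proxy-user rules on the cluster.
        "--proxy-user", project_user,
    ]
    if queue:
        # YARN queue to submit to, e.g. a dedicated dev queue.
        cmd += ["--queue", queue]
    if node_label:
        # Assumption: supported only by Spark versions later than 1.3.1.
        cmd += ["--conf", "spark.yarn.executor.nodeLabelExpression=" + node_label]
    cmd.append(app_script)
    return cmd


if __name__ == "__main__":
    # Hypothetical dev submission kept off the production nodes.
    print(" ".join(build_submit_command("project_a", "job.py",
                                        queue="dev", node_label="dev")))
```

The returned list can be handed to subprocess.call once the names match your cluster.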