mesos-user mailing list archives

From John Omernik <j...@omernik.com>
Subject Re: Spark on Mesos Submitted from multiple users
Date Fri, 20 Feb 2015 16:09:59 GMT
Tim - on the Spark list your name was brought up in relation to
https://issues.apache.org/jira/browse/SPARK-5338. I asked this question
there, but I'll ask it here too: what can I do to help on this? I am
not a coder, unfortunately, but I am a user willing to try things :) This
looks really cool for what we would like to do with Spark and Mesos,
and I'd love to be able to contribute and/or get an understanding of a
(even tentative) timeline. I am not trying to be pushy; I understand
lots of things are likely on your agenda :)

John



On Tue, Feb 17, 2015 at 6:33 AM, John Omernik <john@omernik.com> wrote:
> Tim, thanks, that makes sense. The checking for ports and incrementing
> was new to me, so hearing about that helps. Next question: is it
> possible for a driver to be shared by the same user somehow? This
> would be desirable from the standpoint of running an IPython notebook
> server (JupyterHub). I have it set up so that every time a notebook is
> opened, the imports for Spark are run (the idea is that the
> environment is ready to go for analysis). However, if each user has 5
> notebooks open at any time, that would be a lot of Spark drivers! But
> I suppose before asking that, I should ask about the sequence of
> drivers: are they serial? I.e., can one driver serve only one query
> at a time? What is the optimal size for a driver (in memory), and what
> does the memory affect in the driver? I.e., is a driver with a smaller
> amount of memory limited in the number of results, etc.?
>
> Lots of questions here. If these are more Spark-related questions, let
> me know and I can hop over to the Spark users list, but since I am
> curious about Spark on Mesos, I figured I'd try here first.
>
> Thanks for your help!
>
>
>
> On Mon, Feb 16, 2015 at 10:30 AM, Tim Chen <tim@mesosphere.io> wrote:
>> Hi John,
>>
>> With Spark on Mesos, each client (spark-submit) starts a SparkContext, which
>> initializes its own SparkUI and framework. The Spark UI port defaults to
>> 4040, but if it's occupied Spark automatically tries ports
>> incrementally for you, so your next could be 4041 if it's available.
>>
>> The driver is not shared between users; each user creates their own driver.
>>
>> About the slowness, it's hard to say without any information. You need to
>> tell us your cluster setup, what mode you're running Mesos with, whether
>> there is anything else running in the cluster, the job, etc.
>>
>> Tim
>>
>> On Sat, Feb 14, 2015 at 5:06 PM, John Omernik <john@omernik.com> wrote:
>>>
>>> Hello all, I am running Spark on Mesos and I think I am in love, but I
>>> have some questions. I am running the Python shell via IPython
>>> Notebooks (Jupyter) and it works great, but I am trying to figure out
>>> how things are actually submitted. For example, when I submit
>>> the Spark app from the IPython notebook server, I am opening a new
>>> kernel and I see a new spark-submit (similar to the below) for each
>>> kernel. But how is that actually working on the cluster? I can
>>> connect to the Spark server UI on 4040, but shouldn't there be a
>>> different one for each driver? Is that causing conflicts? After a
>>> while things seem to run slowly; is this due to some weird conflicts?
>>> Should I be specifying unique ports for each server? Is the driver
>>> shared between users? What about between kernels for the same user?
>>> Curious if anyone has any insight.
>>>
>>> Thanks!
>>>
>>>
>>> java org.apache.spark.deploy.SparkSubmitDriverBootstrapper --master
>>> mesos://hadoopmapr3:5050 --driver-memory 1G --executor-memory 4096M
>>> pyspark-shell
>>
>>
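
For readers following the thread: the port behaviour Tim describes (start at 4040, move to the next port if it is occupied) can be illustrated with a small standalone sketch. This is not Spark's actual implementation; the function name `first_free_port` and its parameters are hypothetical, and the retry limit of 16 is only an assumed default chosen for illustration.

```python
import socket


def first_free_port(start=4040, max_retries=16, host="127.0.0.1"):
    """Return the first port at or above `start` that can be bound.

    Hypothetical helper: it mimics the "try 4040, then 4041, ..." pattern
    described in the thread, not Spark's own code.
    """
    # Never probe beyond the highest valid TCP port.
    last = min(start + max_retries, 65535)
    for port in range(start, last + 1):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is already taken (or not permitted); try the next one.
                continue
            return port
    raise RuntimeError(f"no free port found in {start}..{last}")


if __name__ == "__main__":
    # With one driver already holding 4040, a second driver on the same
    # host would typically end up on a later port such as 4041.
    print(first_free_port())
```

Under this pattern, each notebook kernel that starts its own driver on the same host ends up with its own UI port, so the UIs do not conflict; they just land on different ports.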
