spark-user mailing list archives

From Mayur Rustagi <mayur.rust...@gmail.com>
Subject Re: Sharing SparkContext
Date Tue, 11 Mar 2014 01:51:02 GMT
Which version of Spark are you using?


Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>



On Mon, Mar 10, 2014 at 6:49 PM, abhinav chowdary <abhinav.chowdary@gmail.com> wrote:

> For anyone interested in the job server from Ooyala: we started using it
> recently and it has been working great so far.
> On Feb 25, 2014 9:23 PM, "Ognen Duzlevski" <ognen@nengoiksvelzud.com>
> wrote:
>
>>  In that case, I must have misunderstood the following (from
>> http://spark.incubator.apache.org/docs/0.8.1/job-scheduling.html).
>> Apologies. Ognen
>>
>> "Inside a given Spark application (SparkContext instance), multiple
>> parallel jobs can run simultaneously if they were submitted from separate
>> threads. By “job”, in this section, we mean a Spark action (e.g. save,
>> collect) and any tasks that need to run to evaluate that action. Spark’s
>> scheduler is fully thread-safe and supports this use case to enable
>> applications that serve multiple requests (e.g. queries for multiple
>> users).
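(A minimal sketch of the multi-threaded pattern described above, written
spark-shell style; the master URL, app name, and data are placeholders, not
taken from this thread:)

  import org.apache.spark.SparkContext

  val sc = new SparkContext("spark://master:7077", "SharedContextDemo") // placeholder
  val data = sc.parallelize(1 to 1000000)

  // Each thread triggers its own action; since the scheduler is thread-safe,
  // both jobs can run concurrently inside this single application.
  val t1 = new Thread(new Runnable {
    def run() { println("sum   = " + data.map(_.toLong).reduce(_ + _)) }
  })
  val t2 = new Thread(new Runnable {
    def run() { println("evens = " + data.filter(_ % 2 == 0).count()) }
  })
  t1.start(); t2.start()
  t1.join(); t2.join()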
>>
>> By default, Spark’s scheduler runs jobs in FIFO fashion. Each job is
>> divided into “stages” (e.g. map and reduce phases), and the first job gets
>> priority on all available resources while its stages have tasks to launch,
>> then the second job gets priority, etc. If the jobs at the head of the
>> queue don’t need to use the whole cluster, later jobs can start to run
>> right away, but if the jobs at the head of the queue are large, then later
>> jobs may be delayed significantly.
>>
>> Starting in Spark 0.8, it is also possible to configure fair sharing
>> between jobs. Under fair sharing, Spark assigns tasks between jobs in a
>> “round robin” fashion, so that all jobs get a roughly equal share of
>> cluster resources. This means that short jobs submitted while a long job is
>> running can start receiving resources right away and still get good
>> response times, without waiting for the long job to finish. This mode is
>> best for multi-user settings.
>>
>> To enable the fair scheduler, simply set the spark.scheduler.mode to FAIR
>>  before creating a SparkContext:"
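(The docs' accompanying snippet is cut off above; a minimal sketch of that
configuration in the Spark 0.8.x system-property style, with a placeholder
master URL and app name:)

  // Set the scheduler mode before the SparkContext is constructed
  // (Spark 0.8.x reads configuration from Java system properties).
  System.setProperty("spark.scheduler.mode", "FAIR")
  val sc = new SparkContext("spark://master:7077", "FairSchedulingApp") // placeholders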
>> On 2/25/14, 12:30 PM, Mayur Rustagi wrote:
>>
The fair scheduler merely reorders tasks. I think he is looking to run
multiple pieces of code on a single context, on demand from customers... if
the code & order is decided, then the fair scheduler will ensure that all
tasks get equal cluster time :)
>>
>>
>>
>>  Mayur Rustagi
>> Ph: +919632149971
>> http://www.sigmoidanalytics.com
>> https://twitter.com/mayur_rustagi
>>
>>
>>
>> On Tue, Feb 25, 2014 at 10:24 AM, Ognen Duzlevski <ognen@nengoiksvelzud.com> wrote:
>>
>>>  Doesn't the fair scheduler solve this?
>>> Ognen
>>>
>>>
>>> On 2/25/14, 12:08 PM, abhinav chowdary wrote:
>>>
>>> Sorry for not being clear earlier.
>>> "how do you want to pass the operations to the spark context?"
>>> This is partly what I am looking for: how to access the active Spark
>>> context, and possible ways to pass operations to it.
>>>
>>>  Thanks
>>>
>>>
>>>
>>>  On Tue, Feb 25, 2014 at 10:02 AM, Mayur Rustagi <mayur.rustagi@gmail.com> wrote:
>>>
>>>> how do you want to pass the operations to the spark context?
>>>>
>>>>
>>>>  Mayur Rustagi
>>>> Ph: +919632149971
>>>> http://www.sigmoidanalytics.com
>>>> https://twitter.com/mayur_rustagi
>>>>
>>>>
>>>>
>>>> On Tue, Feb 25, 2014 at 9:59 AM, abhinav chowdary <abhinav.chowdary@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>        I am looking for ways to share the SparkContext, meaning I need
>>>>> to be able to perform multiple operations on the same Spark context.
>>>>>
>>>>>  Below is the code of a simple app I am testing:
>>>>>
>>>>>   import org.apache.spark.SparkContext
>>>>>
>>>>>   object SimpleApp {
>>>>>     def main(args: Array[String]) {
>>>>>       println("Welcome to example application!")
>>>>>
>>>>>       val sc = new SparkContext("spark://10.128.228.142:7077", "Simple App")
>>>>>
>>>>>       println("Spark context created!")
>>>>>       println("Creating RDD!")
>>>>>     }
>>>>>   }
>>>>>
>>>>>  Now once this context is created, I want to access it to submit
>>>>> multiple jobs/operations.
>>>>>
>>>>>  Any help is much appreciated
>>>>>
>>>>>  Thanks
>>>>>
>>>>>
>>>>>
>>>>>
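(For the question being asked here, the key point is that the sc value is
the handle: as long as it stays in scope and sc.stop() has not been called,
further actions can be submitted to it. A minimal sketch with made-up data:)

  val nums = sc.parallelize(1 to 100)

  // The same context serves any number of jobs until sc.stop() is called.
  println("sum   = " + nums.reduce(_ + _))          // job 1
  println("large = " + nums.filter(_ > 50).count()) // job 2
  val doubled = nums.map(_ * 2).collect()           // job 3
  println("first doubled = " + doubled.head)

  sc.stop() // only once the application is completely done with the context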
>>>>
>>>
>>>
>>>  --
>>> Warm Regards
>>> Abhinav Chowdary
>>>
>>>
>>>
>>
>>
