spark-dev mailing list archives

From Jakob Odersky <ja...@odersky.com>
Subject Re: spark job scheduling
Date Thu, 28 Jan 2016 04:48:58 GMT
Nitpick: the up-to-date version of said wiki page is
https://spark.apache.org/docs/1.6.0/job-scheduling.html (not sure how
much it changed though)
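
For question 2, per that page: with spark.scheduler.mode set to FAIR, jobs submitted without a pool land in the "default" pool. Fair sharing happens *between* pools, while scheduling *within* a pool is FIFO unless that pool is itself configured as FAIR in the allocation file. A minimal conf/fairscheduler.xml sketch (the pool names here are made up for illustration):

```xml
<?xml version="1.0"?>
<allocations>
  <!-- Override the built-in default pool so jobs inside it also share fairly -->
  <pool name="default">
    <schedulingMode>FAIR</schedulingMode>
  </pool>
  <!-- A hypothetical pool for short interactive jobs -->
  <pool name="interactive">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>1</minShare>
  </pool>
</allocations>
```

You would then set spark.scheduler.mode=FAIR and point spark.scheduler.allocation.file at this file in your SparkConf, and call sc.setLocalProperty("spark.scheduler.pool", "interactive") in the thread that submits the job.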

On Wed, Jan 27, 2016 at 7:50 PM, Chayapan Khannabha <chayapan@gmail.com> wrote:
> I would start at this wiki page
> https://spark.apache.org/docs/1.2.0/job-scheduling.html
>
> Although I'm sure this depends a lot on your cluster environment and the
> deployed Spark version.
>
> IMHO
>
> On Thu, Jan 28, 2016 at 10:27 AM, Niranda Perera <niranda.perera@gmail.com>
> wrote:
>>
>> Sorry, I made some typos. Let me rephrase:
>>
>> 1. As I understand, the smallest unit of work an executor can perform is
>> a 'task'. In 'FAIR' scheduler mode, let's say a job is submitted to the
>> spark ctx which has a considerable amount of work to do in a single task.
>> While such a 'big' task is running, can we still submit another smaller job
>> (from a separate thread) and get it done? Or does that smaller job have to
>> wait till the bigger task finishes and the resources are freed from the
>> executor?
>> (Essentially, what I'm asking is: in FAIR scheduler mode, jobs are
>> scheduled fairly, but at task granularity are they still FIFO?)
>>
>> 2. When a job is submitted without setting a scheduler pool, the 'default'
>> scheduler pool is assigned to it, which employs FIFO scheduling. But what
>> happens when we have spark.scheduler.mode set to FAIR, and I submit jobs
>> without specifying a scheduler pool (one which has FAIR scheduling)? Would
>> the jobs still run in FIFO mode within the default pool?
>> Essentially, to really get FAIR scheduling, do we have to assign a
>> FAIR scheduler pool to the job as well?
>>
>> On Thu, Jan 28, 2016 at 8:47 AM, Chayapan Khannabha <chayapan@gmail.com>
>> wrote:
>>>
>>> I think the smallest unit of work is a "Task", and an "Executor" is
>>> responsible for getting the work done? I would like to understand more
>>> about the scheduling system too. Scheduling strategies like FAIR or FIFO
>>> do have a significant impact on Spark cluster architecture design decisions.
>>>
>>> Best,
>>>
>>> Chayapan (A)
>>>
>>> On Thu, Jan 28, 2016 at 10:07 AM, Niranda Perera
>>> <niranda.perera@gmail.com> wrote:
>>>>
>>>> hi all,
>>>>
>>>> I have a few questions on spark job scheduling.
>>>>
>>>> 1. As I understand, the smallest unit of work an executor can perform is
>>>> a 'task'. In the 'fair' scheduler mode, let's say a job is submitted to
>>>> the spark ctx which has a considerable amount of work to do in a task.
>>>> While such a 'big' task is running, can we still submit another smaller
>>>> job (from a separate thread) and get it done? Or does that smaller job
>>>> have to wait till the bigger task finishes and the resources are freed
>>>> from the executor?
>>>>
>>>> 2. When a job is submitted without setting a scheduler pool, the default
>>>> scheduler pool is assigned to it, which employs FIFO scheduling. But what
>>>> happens when we have spark.scheduler.mode set to FAIR, and I submit jobs
>>>> without specifying a scheduler pool (one which has FAIR scheduling)?
>>>> Would the jobs still run in FIFO mode within the default pool?
>>>> Essentially, to really get FAIR scheduling, do we have to assign
>>>> a FAIR scheduler pool as well?
>>>>
>>>> best
>>>>
>>>> --
>>>> Niranda
>>>> @n1r44
>>>> +94-71-554-8430
>>>> https://pythagoreanscript.wordpress.com/
>>>
>>>
>>
>>
>>
>> --
>> Niranda
>> @n1r44
>> +94-71-554-8430
>> https://pythagoreanscript.wordpress.com/
>
>
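
Regarding the "separate thread" part of question 1: if I remember right, SparkContext.setLocalProperty stores the pool name in thread-local (inheritable) properties, so each submitting thread independently picks its own pool. A rough stand-alone Python analogy of that mechanism (plain threading.local, no Spark required; the property key is the real one, everything else here is illustrative):

```python
import threading

_local = threading.local()

def set_local_property(key, value):
    # Mirrors sc.setLocalProperty: the setting is visible only to this thread.
    props = getattr(_local, "props", None)
    if props is None:
        props = _local.props = {}
    props[key] = value

def get_local_property(key, default=None):
    return getattr(_local, "props", {}).get(key, default)

results = {}

def submit_job(name, pool):
    set_local_property("spark.scheduler.pool", pool)
    # A real job (e.g. rdd.count()) submitted here would be scheduled in `pool`.
    results[name] = get_local_property("spark.scheduler.pool")

t1 = threading.Thread(target=submit_job, args=("big", "batch"))
t2 = threading.Thread(target=submit_job, args=("small", "interactive"))
t1.start(); t2.start()
t1.join(); t2.join()
# Each thread saw only its own pool setting; the main thread saw none.
```

This is only a sketch of the thread-isolation idea, not how Spark is implemented internally; the point is that two threads sharing one SparkContext can target different pools concurrently.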

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org

