spark-dev mailing list archives

From Chayapan Khannabha <chaya...@gmail.com>
Subject Re: spark job scheduling
Date Thu, 28 Jan 2016 03:50:49 GMT
I would start with this documentation page:
https://spark.apache.org/docs/1.2.0/job-scheduling.html

That said, IMHO the details depend a lot on your cluster environment and the
deployed Spark version.
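
On question 1 in the quoted message, a toy model might make the distinction easier to discuss. The sketch below is plain Python, not Spark code, and every name in it (Job, simulate, make_jobs, the two-slot workload) is hypothetical. It assumes, as far as I understand the scheduler, that a running task keeps its slot until it finishes, and that the scheduling mode only decides which job's pending task gets the next free slot. The FAIR rule here (fewest running tasks first) is a simplification that ignores pools, weights and minShare.

```python
import heapq
from dataclasses import dataclass
from typing import Optional

FINISH, SUBMIT = 0, 1  # event kinds


@dataclass
class Job:
    name: str
    submit_time: int
    pending: list                      # durations of tasks that have not started yet
    running: int = 0                   # tasks currently occupying a slot
    first_start: Optional[int] = None  # time the job's first task got a slot
    finish_time: int = 0


def simulate(jobs, slots, mode):
    """Non-preemptive slot simulation; returns {name: (first_start, finish_time)}."""
    events = [(job.submit_time, SUBMIT, i) for i, job in enumerate(jobs)]
    heapq.heapify(events)
    submitted, free = [], slots
    while events:
        now = events[0][0]
        # Apply every event that happens at this instant before scheduling.
        while events and events[0][0] == now:
            _, kind, i = heapq.heappop(events)
            if kind == FINISH:
                jobs[i].running -= 1
                jobs[i].finish_time = now
                free += 1
            else:
                submitted.append(i)
        # Hand out free slots one at a time; running tasks are never interrupted.
        while free > 0:
            candidates = [i for i in submitted if jobs[i].pending]
            if not candidates:
                break
            if mode == "FIFO":
                # Assumed rule: the earliest-submitted job with pending tasks goes first.
                pick = min(candidates, key=lambda i: jobs[i].submit_time)
            else:
                # Assumed "FAIR" rule: favour the job with the fewest running tasks.
                pick = min(candidates,
                           key=lambda i: (jobs[i].running, jobs[i].submit_time))
            job = jobs[pick]
            duration = job.pending.pop(0)
            job.running += 1
            free -= 1
            if job.first_start is None:
                job.first_start = now
            heapq.heappush(events, (now + duration, FINISH, pick))
    return {job.name: (job.first_start, job.finish_time) for job in jobs}


def make_jobs():
    # Hypothetical workload: a long job, then a short job one time unit later.
    return [Job("big", 0, [10, 10, 10, 10]), Job("small", 1, [1, 1])]


if __name__ == "__main__":
    for mode in ("FIFO", "FAIR"):
        print(mode, simulate(make_jobs(), slots=2, mode=mode))
```

With two slots, the small job in this model cannot start before time 10 in either mode, because both slots are held by the big job's first two tasks. Under the FIFO rule it then waits for all of the big job's tasks and runs from 20 to 21; under the FAIR rule it gets a slot as soon as one frees up and runs from 10 to 12. Real behaviour will depend on the Spark version and pool configuration, so please treat this only as a way to frame the question.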

On Thu, Jan 28, 2016 at 10:27 AM, Niranda Perera <niranda.perera@gmail.com>
wrote:

> Sorry, I made some typos. Let me rephrase.
>
> 1. As I understand it, the smallest unit of work an executor can perform is
> a 'task'. In the 'FAIR' scheduler mode, let's say a job is submitted to the
> Spark context that has a considerable amount of work to do in a single task.
> While such a 'big' task is running, can we still submit another, smaller job
> (from a separate thread) and get it done? Or does that smaller job have to
> wait until the bigger task finishes and its resources are freed from the
> executor?
> (Essentially, what I'm asking is: in the FAIR scheduler mode, jobs are
> scheduled fairly, but at the task granularity are they still FIFO?)
>
> 2. When a job is submitted without setting a scheduler pool, the 'default'
> scheduler pool is assigned to it, which employs FIFO scheduling. But what
> happens when spark.scheduler.mode is set to FAIR and I submit jobs
> without specifying a scheduler pool (one that has FAIR scheduling)? Would the
> jobs still run in FIFO mode within the default pool?
> Essentially, to really get FAIR scheduling, do we also have to assign a
> FAIR scheduler pool to the job?
>
> On Thu, Jan 28, 2016 at 8:47 AM, Chayapan Khannabha <chayapan@gmail.com>
> wrote:
>
>> I think the smallest unit of work is a "Task", and an "Executor" is
>> responsible for getting the work done. I would like to understand more about
>> the scheduling system too. A scheduling strategy like FAIR or FIFO does have
>> a significant impact on Spark cluster architecture design decisions.
>>
>> Best,
>>
>> Chayapan (A)
>>
>> On Thu, Jan 28, 2016 at 10:07 AM, Niranda Perera <
>> niranda.perera@gmail.com> wrote:
>>
>>> hi all,
>>>
>>> I have a few questions on spark job scheduling.
>>>
>>> 1. As I understand it, the smallest unit of work an executor can perform
>>> is a 'task'. In the 'fair' scheduler mode, let's say a job is submitted to
>>> the Spark context that has a considerable amount of work to do in a task.
>>> While such a 'big' task is running, can we still submit another, smaller
>>> job (from a separate thread) and get it done? Or does that smaller job have
>>> to wait until the bigger task finishes and its resources are freed from the
>>> executor?
>>>
>>> 2. When a job is submitted without setting a scheduler pool, the default
>>> scheduler pool is assigned to it, which employs FIFO scheduling. But what
>>> happens when spark.scheduler.mode is set to FAIR and I submit jobs
>>> without specifying a scheduler pool (one that has FAIR scheduling)? Would
>>> the jobs still run in FIFO mode within the default pool?
>>> Essentially, to really get FAIR scheduling, do we have to assign
>>> a FAIR scheduler pool?
>>>
>>> best
>>>
>>> --
>>> Niranda
>>> @n1r44 <https://twitter.com/N1R44>
>>> +94-71-554-8430
>>> https://pythagoreanscript.wordpress.com/
>>>
>>
>>
>
>
> --
> Niranda
> @n1r44 <https://twitter.com/N1R44>
> +94-71-554-8430
> https://pythagoreanscript.wordpress.com/
>
