spark-dev mailing list archives

From Niranda Perera <niranda.per...@gmail.com>
Subject Re: spark job scheduling
Date Thu, 28 Jan 2016 03:27:22 GMT
Sorry, I made some typos. Let me rephrase:

1. As I understand it, the smallest unit of work an executor can perform is a
'task'. In the 'FAIR' scheduler mode, let's say a job submitted to the Spark
context has a considerable amount of work to do in a single task. While such
a 'big' task is running, can we still submit another, smaller job (from a
separate thread) and get it done? Or does that smaller job have to wait until
the bigger task finishes and the resources are freed from the executor?
(Essentially, what I'm asking is: in the FAIR scheduler mode, jobs are
scheduled fairly, but at the task granularity are they still FIFO?)

2. When a job is submitted without setting a scheduler pool, the 'default'
scheduler pool is assigned to it, which employs FIFO scheduling. But what
happens when we have spark.scheduler.mode set to FAIR and I submit jobs
without specifying a scheduler pool? Would those jobs still run in FIFO mode
within the default pool?
Essentially, to really get FAIR scheduling, do we also have to assign a FAIR
scheduler pool to the job?
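For reference, this is the kind of setup I mean -- a minimal
fairscheduler.xml (the pool name and values here are just examples), pointed
to via spark.scheduler.allocation.file:

```xml
<!-- fairscheduler.xml: a minimal pool definition; the pool name and
     the weight/minShare values are illustrative, not recommendations. -->
<allocations>
  <pool name="production">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>2</minShare>
  </pool>
</allocations>
```

and then, per submitting thread, something like
sc.setLocalProperty("spark.scheduler.pool", "production") before the job is
submitted. My question is what happens when that property is never set.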

On Thu, Jan 28, 2016 at 8:47 AM, Chayapan Khannabha <chayapan@gmail.com>
wrote:

> I think the smallest unit of work is a "Task", and an "Executor" is
> responsible for getting the work done? I would like to understand more about
> the scheduling system too. Scheduling strategies like FAIR or FIFO do have a
> significant impact on Spark cluster architecture design decisions.
>
> Best,
>
> Chayapan (A)
>
> On Thu, Jan 28, 2016 at 10:07 AM, Niranda Perera <niranda.perera@gmail.com
> > wrote:
>
>> hi all,
>>
>> I have a few questions on spark job scheduling.
>>
>> 1. As I understand, the smallest unit of work an executor can perform. In
>> the 'fair' scheduler mode, let's say  a job is submitted to the spark ctx
>> which has a considerable amount of work to do in a task. While such a 'big'
>> task is running, can we still submit another smaller job (from a separate
>> thread) and get it done? or does that smaller job has to wait till the
>> bigger task finishes and the resources are freed from the executor?
>>
>> 2. When a job is submitted without setting a scheduler pool, the default
>> scheduler pool is assigned to it, which employs FIFO scheduling. but what
>> happens when we have the spark.scheduler.mode as FAIR, and if I submit jobs
>> without specifying a scheduler pool (which has FAIR scheduling)? would the
>> jobs still run in FIFO mode with the default pool?
>> essentially, for us to really set FAIR scheduling, do we have to assign a
>> FAIR scheduler pool?
>>
>> best
>>
>> --
>> Niranda
>> @n1r44 <https://twitter.com/N1R44>
>> +94-71-554-8430
>> https://pythagoreanscript.wordpress.com/
>>
>
>


-- 
Niranda
@n1r44 <https://twitter.com/N1R44>
+94-71-554-8430
https://pythagoreanscript.wordpress.com/
