hadoop-yarn-issues mailing list archives

From "Allen Wittenauer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-972) Allow requests and scheduling for fractional virtual cores
Date Wed, 31 Jul 2013 18:13:48 GMT

    [ https://issues.apache.org/jira/browse/YARN-972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725528#comment-13725528 ]

Allen Wittenauer commented on YARN-972:
---------------------------------------

bq.  Nodes in probably the majority of clusters are configured with more slots than cores.
This is sensible because many types of task do a lot of IO and do not even saturate half of
a single core. 

I disagree. Configuring more slots than cores isn't sensible at all unless the scheduler *also*
takes IO characteristics into account in addition to processor needs.  Otherwise the system
eventually ends up in a death spiral:

P1: "We need more processes on this machine because the load isn't high!"

P2: "OK!  I've put more of our IO intensive processes on this machine!"

P1: "Weird!  The CPUs are now spending more time in IO wait!  Let's add more processes since
we have more CPU to get it higher!"

...

I posit that the reason why (at least in Hadoop 1.x systems) there are more tasks than cores
is simple: the jobs are crap.  They are spending more time launching JVMs and getting scheduled
than they are actually executing code, which gives the illusion that Hadoop isn't scheduling
efficiently.  Unless one recognizes that there is a tipping point in parallelism, most users
are going to keep increasing it in blind faith that "more tasks = faster, always". 


Also, yes, I want YARN-796, but I don't think it's an orthogonal discussion.  My opinion
is that they are different facets of the same question: how do we properly schedule in a
mixed-load environment?  It's very hard to get it 100% efficient for all cases, so some folks
are going to have to suffer.  If I had to pick, let it be the folks with workloads that are
either terribly written or that sleep a lot and don't require much processor when they do wake
up.
                
> Allow requests and scheduling for fractional virtual cores
> ----------------------------------------------------------
>
>                 Key: YARN-972
>                 URL: https://issues.apache.org/jira/browse/YARN-972
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: api, scheduler
>    Affects Versions: 2.0.5-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>
> As this idea sparked a fair amount of discussion on YARN-2, I'd like to go deeper into
the reasoning.
> Currently the virtual core abstraction hides two orthogonal goals.  The first is that
a cluster might have heterogeneous hardware and that the processing power of different makes
of cores can vary wildly.  The second is that different (combinations of) workloads can
require different levels of granularity.  E.g. one admin might want every task on their cluster
to use at least a core, while another might want applications to be able to request quarters
of cores.  The former would configure a single vcore per core.  The latter would configure
four vcores per core.
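
As a purely illustrative aside (not part of the original description), here is a minimal Java sketch of the vcores-per-core arithmetic described above; the class and the vcoresPerPhysicalCore knob are hypothetical stand-ins, not existing YARN configuration.

{code:java}
// Minimal sketch of the granularity arithmetic: how many vcores an
// application must request for a given amount of physical CPU, under a
// given (assumed, admin-chosen) vcores-per-physical-core ratio.
public final class VcoreGranularity {
  private final int vcoresPerPhysicalCore;

  public VcoreGranularity(int vcoresPerPhysicalCore) {
    this.vcoresPerPhysicalCore = vcoresPerPhysicalCore;
  }

  /** Vcores needed to cover the given fraction of a physical core. */
  public int vcoresFor(double physicalCores) {
    // Round up so a request never gets less CPU than it asked for.
    return (int) Math.ceil(physicalCores * vcoresPerPhysicalCore);
  }

  public static void main(String[] args) {
    VcoreGranularity coarse = new VcoreGranularity(1); // one admin: 1 vcore == 1 core
    VcoreGranularity fine = new VcoreGranularity(4);   // another admin: 4 vcores == 1 core
    System.out.println(coarse.vcoresFor(0.25)); // 1 -- quarter-core request rounds up to a whole core
    System.out.println(fine.vcoresFor(0.25));   // 1 -- one vcore is exactly a quarter core
    System.out.println(fine.vcoresFor(1.0));    // 4 -- a full physical core costs four vcores
  }
}
{code}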
> I don't think that the abstraction is a good way of handling the second goal.  Having
a virtual core refer to different magnitudes of processing power on different clusters will
make the difficult problem of deciding how many cores to request for a job even more confusing.
> Can we not handle this with dynamic oversubscription?
> Dynamic oversubscription, i.e. adjusting the number of cores offered by a machine based
on measured CPU-consumption, should work as a complement to fine-granularity scheduling. 
Dynamic oversubscription is never going to be perfect, as the amount of CPU a process consumes
can vary widely over its lifetime.  A task that first loads a bunch of data over the network
and then performs complex computations on it will suffer if additional CPU-heavy tasks are
scheduled on the same node because its initial CPU-utilization was low.  To guard against
this, we will need to be conservative with how we dynamically oversubscribe.  If a user wants
to explicitly hint to the scheduler that their task will not use much CPU, the scheduler should
be able to take this into account.
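
A hedged sketch of what "being conservative" with dynamic oversubscription could look like; none of these names are existing YARN APIs, and the 25% headroom cap is an assumed policy chosen only for illustration.

{code:java}
// Illustrative heuristic: only oversubscribe a node that is clearly idle,
// and cap the extra advertised capacity so a CPU burst from tasks that were
// temporarily IO-bound cannot overload the machine.
public final class ConservativeOversubscription {
  private static final double MAX_EXTRA_FRACTION = 0.25; // assumed cap: never offer more than +25%

  /**
   * @param configuredVcores vcores the node is configured to offer
   * @param measuredCpuUsage recent CPU usage of the node, in [0.0, 1.0]
   * @return vcores to advertise to the scheduler
   */
  public static int vcoresToAdvertise(int configuredVcores, double measuredCpuUsage) {
    double idleFraction = Math.max(0.0, 1.0 - measuredCpuUsage);
    int extra = (int) Math.floor(configuredVcores * Math.min(idleFraction, MAX_EXTRA_FRACTION));
    return configuredVcores + extra;
  }

  public static void main(String[] args) {
    System.out.println(vcoresToAdvertise(8, 0.9)); // 8  -- busy node: no oversubscription
    System.out.println(vcoresToAdvertise(8, 0.1)); // 10 -- idle node: extra capped at +25%
  }
}
{code}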
> On YARN-2, there are concerns that including floating point arithmetic in the scheduler
will slow it down.  I question this assumption, and it is perhaps worth debating, but I think
we can sidestep the issue by multiplying CPU-quantities inside the scheduler by a decently
sized number like 1000 and keeping the computations on integers.
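
The integer-scaling trick mentioned above can be sketched as follows; the MilliVcores class and its helpers are illustrative assumptions, not a proposed API, though the scale factor of 1000 comes straight from the description.

{code:java}
// Sketch: represent fractional vcores as "milli-vcores" so all
// scheduler-side comparisons and sums stay in integer arithmetic.
public final class MilliVcores {
  public static final int SCALE = 1000; // 1.0 vcore == 1000 milli-vcores

  /** Convert a user-facing fractional request (e.g. 0.25 cores) to integer units. */
  public static int fromFractional(double vcores) {
    return (int) Math.round(vcores * SCALE);
  }

  /** Scheduler-side check, done entirely on ints: does the node have room? */
  public static boolean fits(int requestedMilliVcores, int availableMilliVcores) {
    return requestedMilliVcores <= availableMilliVcores;
  }

  public static void main(String[] args) {
    int request = fromFractional(0.25);   // 250
    int available = 4 * SCALE - 3 * 250;  // 4 cores minus three quarter-core tasks
    System.out.println(fits(request, available)); // true, with no floating point involved
  }
}
{code}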
> The relevant APIs are marked as evolving, so there's no need for the change to delay
2.1.0-beta.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
