From: "Mridul Muralidharan (JIRA)"
To: issues@spark.apache.org
Date: Sun, 22 Jun 2014 09:40:24 +0000 (UTC)
Subject: [jira] [Commented] (SPARK-2089) With YARN, preferredNodeLocalityData isn't honored

    [ https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040085#comment-14040085 ]

Mridul Muralidharan commented on SPARK-2089:
--------------------------------------------

A few things to keep in mind (to be seen in the context of current Spark capabilities):

1) SplitInfo is not used to specify the locality of tasks, blocks, etc. - it is used to express a preference for which nodes to acquire in a cluster. This in turn has an effect on task and block locality, based on how Spark schedules tasks - it is a second-order effect. So it is relevant in a shared (and usually large) cluster, particularly with distributed storage (DFS, HBase, etc.) which exposes locality information. In these cases, the benefit of using SplitInfo is tremendous when working with non-trivial amounts of data. For anything else, not using it should be fine; such workloads probably would not benefit much from it anyway.

2) This is not for consumption within Spark itself - it is for the cluster scheduler SPI implementation, to infer which nodes to prefer when allocating. Which means adequate information required by the cluster scheduler should be available from this Map (or whatever we change it to). The use of the information in the Map is actually at the bare minimum currently. Ideally we should leverage split sizes and the input format - weighting which splits are more 'expensive' to transfer based on these (from DFS, from HBase, whether a DFS block is cached, ...) - to formulate the requests to the RM. I left these as future enhancements due to time constraints. Probably [~sandyr] or [~tgraves] might be interested in these in the future. Currently, we (as in, my team) also have user code which uses this; but that could potentially be rewritten, at some expense, in case Spark shies away from exposing this.

3) The fields in Spark's SplitInfo are not Hadoop specific: it does expose information required by YARN, but not in a Hadoop- or YARN-specific way. hostLocation: String, path: String, length: Long, inputFormatClazz: Class - these are of generic use and would most likely be required for any other, non-YARN use of SplitInfo too.
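To make the shape concrete, here is a rough sketch - hypothetical names only, not the actual org.apache.spark.scheduler.SplitInfo class or any existing allocator code - of such a split descriptor, and of the kind of per-host weighting (along the lines of point 2) a cluster scheduler SPI could derive from it:

{code:scala}
// Hypothetical sketch only - mirrors the generic fields named above,
// not the real Spark class.
case class SplitInfoSketch(
  hostLocation: String,       // node holding (a replica of) the data
  path: String,               // identifier of the underlying input, e.g. a file path
  length: Long,               // split size in bytes
  inputFormatClazz: Class[_], // how the input is read; could weight transfer cost
  underlyingSplit: Any = null // opaque; only for schedulers that understand it
)

// One way a scheduler could turn split descriptors into a node preference:
// total bytes per host, so "heavier" hosts are requested first.
def preferredHostWeights(splits: Seq[SplitInfoSketch]): Map[String, Long] =
  splits
    .groupBy(_.hostLocation)
    .map { case (host, ss) => host -> ss.map(_.length).sum }
{code}

Nothing in that shape is tied to Hadoop or YARN - any allocator that can consume host names and byte counts could use the same map.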
I can't comment authoritatively on those other uses, though, since I don't have information on what they could be. underlyingSplit: Any is meant for more advanced use - to be used only when the cluster scheduler knows what the underlying split might be. Of course, the SplitInfo object does do Hadoop-specific things - we can think about refactoring that out if required! Not sure if that was the concern?


> With YARN, preferredNodeLocalityData isn't honored
> ---------------------------------------------------
>
>                 Key: SPARK-2089
>                 URL: https://issues.apache.org/jira/browse/SPARK-2089
>             Project: Spark
>          Issue Type: Bug
>          Components: YARN
>    Affects Versions: 1.0.0
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>            Priority: Critical
>
> When running in YARN cluster mode, apps can pass preferred locality data when constructing a Spark context that will dictate where to request executor containers.
> This is currently broken because of a race condition. The Spark-YARN code runs the user class and waits for it to start up a SparkContext. During its initialization, the SparkContext will create a YarnClusterScheduler, which notifies a monitor in the Spark-YARN code that it's ready. The Spark-YARN code then immediately fetches the preferredNodeLocationData from the SparkContext and uses it to start requesting containers.
> But in the SparkContext constructor that takes the preferredNodeLocationData, setting preferredNodeLocationData comes after the rest of the initialization, so, if the Spark-YARN code comes around quickly enough after being notified, the data that's fetched is the empty unset version. This occurred during all of my runs.
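For the race described above, a small self-contained sketch (hypothetical names; not the actual SparkContext or Spark-YARN ApplicationMaster code) of the problematic ordering - readiness is signalled before the preference field is assigned, so a reader woken by that signal can observe the empty default:

{code:scala}
// Hypothetical illustration of the reported ordering, not Spark code.
object PreferredNodeRaceSketch {
  // Stand-ins for SparkContext state.
  @volatile private var contextReady = false
  @volatile private var preferredNodeLocationData: Map[String, Int] = Map.empty

  def main(args: Array[String]): Unit = {
    // "Spark-YARN side": waits for the readiness signal, then immediately
    // reads the preference data to start requesting containers.
    val yarnSide = new Thread(() => {
      while (!contextReady) {}          // the notified monitor in the report
      println(s"requesting containers with: $preferredNodeLocationData")
    })
    yarnSide.start()

    // "SparkContext constructor": signals readiness before the preference
    // field is assigned - the ordering the report identifies as the bug.
    contextReady = true
    Thread.sleep(10)                    // widen the race window for the demo
    preferredNodeLocationData = Map("node1" -> 2, "node2" -> 1)
    yarnSide.join()
  }
}
{code}

With the deliberate sleep widening the window, the reader thread essentially always prints an empty map - the "empty unset version" from the report.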