From: qqsun8819
To: dev@spark.apache.org
Reply-To: dev@spark.apache.org
Subject: [GitHub] spark pull request: SPARK-1099:Spark's local mode should probably ...
Date: Sun, 9 Mar 2014 07:53:26 +0000 (UTC)

GitHub user qqsun8819 opened a pull request:

https://github.com/apache/spark/pull/110

SPARK-1099: Spark's local mode should probably respect spark.cores.max by default

This is for JIRA: https://spark-project.atlassian.net/browse/SPARK-1099

And this is what I do in this patch (also commented in the JIRA). @aarondav This is really a behavioral change, so I do it with great caution, and I welcome any review advice:

1. I changed the "MASTER=local" pattern of creating LocalBackEnd. In the past, we passed 1 core to it; now it uses a default core count. The reason is that when someone uses spark-shell to start local mode, the REPL uses this "MASTER=local" pattern by default.
So if the user also specifies a core count on the spark-shell command line, it goes through this path as well, and hardcoding 1 core there is no longer suitable given this change.

2. In LocalBackEnd, the "totalCores" variable is now computed by a different rule (in the past it just took the user-passed cores: 1 for the "MASTER=local" pattern, 2 for the "MASTER=local[2]" pattern, and so on):

a. The second argument of LocalBackEnd's constructor, indicating cores, has a default value of Int.MaxValue. If the user didn't pass it, it stays at Int.MaxValue.
b. In getMaxCores, we first compare that value to Int.MaxValue. If it differs, the user has passed their desired value, so we just use it.
c. If b is not satisfied, we read cores from spark.cores.max, and get the real logical core count from Runtime. If the cores specified by spark.cores.max exceed the logical cores, we use the logical cores; otherwise we use spark.cores.max.

3. In SparkContextSchedulerCreationSuite's test("local") case, the assertion is changed from 1 to the logical core count, because the "MASTER=local" pattern now uses default values.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/qqsun8819/spark local-cores

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/110.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #110

----

commit 6ae1ee82f49e10166c29c538f452503236d06531
Author: qqsun8819
Date: 2014-03-09T06:19:10Z

    Add a static function in LocalBackEnd to let it use the spark.cores.max-specified cores when no cores are passed to it

commit 78b9c60ce8279189e486479fbb211410c1a1b73c
Author: qqsun8819
Date: 2014-03-09T07:28:23Z

    1. SparkContext MASTER=local pattern uses default cores instead of 1 to construct LocalBackEnd, for use with spark-shell and cores specified on the cmd line
    2. Some test cases changed from local to local[1].
    3. SparkContextSchedulerCreationSuite tests the spark.cores.max config in the local pattern

----

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastructure@apache.org or file a JIRA ticket with INFRA.
---
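The "MASTER=local" change in item 1 can be sketched roughly as follows. This is an illustrative sketch only: MasterParser, LocalN, and coresFor are hypothetical names, not the actual SparkContext code. It mimics the described behavior: a bare "local" master no longer hardcodes 1 core, but returns an Int.MaxValue sentinel so the backend can fall back to spark.cores.max or the machine's logical core count.

```scala
object MasterParser {
  // Matches masters of the form "local[N]" with an explicit core count.
  private val LocalN = """local\[([0-9]+)\]""".r

  // Core count to pass to the local backend for a given master string.
  def coresFor(master: String): Int = master match {
    case "local"   => Int.MaxValue // sentinel: "not specified, use defaults"
    case LocalN(n) => n.toInt      // "local[N]": explicit user request
    case other     => throw new IllegalArgumentException(s"unsupported master: $other")
  }
}
```

With this shape, the spark-shell default of "MASTER=local" and an explicit "local[N]" take the same code path, which is the point of the change.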
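The a/b/c resolution rule in item 2 can be sketched as below. Again this is a hypothetical sketch: LocalCores and getMaxCores are illustrative stand-ins for the described LocalBackEnd logic, and the config is modeled as a plain Map rather than a SparkConf.

```scala
object LocalCores {
  // (a) the constructor's cores argument defaults to the Int.MaxValue sentinel
  def getMaxCores(passedCores: Int = Int.MaxValue,
                  conf: Map[String, String] = Map.empty): Int = {
    if (passedCores != Int.MaxValue) {
      // (b) the user passed an explicit value: honor it as-is
      passedCores
    } else {
      // (c) fall back to spark.cores.max, capped at the logical core count
      val logicalCores = Runtime.getRuntime.availableProcessors()
      conf.get("spark.cores.max").map(_.toInt) match {
        case Some(max) => math.min(max, logicalCores)
        case None      => logicalCores
      }
    }
  }
}
```

Note the capping in step (c): if spark.cores.max asks for more cores than the machine actually has, the sketch clamps to Runtime.getRuntime.availableProcessors(), matching the behavior described in the PR.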