Mailing-List: contact issues-help@spark.apache.org; run by ezmlm
Precedence: bulk
Date: Sat, 16 May 2015 07:24:59 +0000 (UTC)
From: "Tathagata Das (JIRA)" <jira@apache.org>
To: issues@spark.apache.org
Message-ID: <JIRA.12830086.1431671744000.134063.1431761099772@Atlassian.JIRA>
In-Reply-To: <JIRA.12830086.1431671744000@Atlassian.JIRA>
References: <JIRA.12830086.1431671744000@Atlassian.JIRA>
 <JIRA.12830086.1431671744513@arcas>
Subject: [jira] [Commented] (SPARK-7661) Support for dynamic allocation of
 executors in Kinesis Spark Streaming
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/SPARK-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546617#comment-14546617 ] 

Tathagata Das commented on SPARK-7661:
--------------------------------------

N+1 is used in the example, but it isn't really the recommended way. Here is how it works. You have to give X + Y cores, where X = number of Kinesis streams/receivers and Y = number of cores for processing the data. The X receivers will, in collaboration with each other, receive data from the N shards. If you expect N to vary from 10 to 20, then having X = 15 isn't a bad idea. At N = 20, the 15 receivers will distribute the work among themselves. And Y should be such that your system can process the data as fast as it is received.
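
To make the arithmetic above concrete, here is a rough sketch in plain Python (no Spark or Kinesis calls). The function names and the value Y = 4 are made up for illustration, and the even split of shards across receivers is only an assumption about how the receivers might balance the work among themselves; the actual assignment may differ.

```python
def total_cores(num_receivers, processing_cores):
    """Cores to request: X receiver cores plus Y processing cores."""
    if num_receivers < 1 or processing_cores < 1:
        raise ValueError("need at least one receiver and one processing core")
    return num_receivers + processing_cores


def shards_per_receiver(num_shards, num_receivers):
    """Hypothetical even split of N shards across X receivers.

    Assumption: shards are spread as evenly as possible. This only
    illustrates the idea; it is not the real balancing logic.
    """
    if num_receivers < 1:
        raise ValueError("need at least one receiver")
    base, extra = divmod(num_shards, num_receivers)
    # The first `extra` receivers take one shard more than the rest.
    return [base + 1 if i < extra else base for i in range(num_receivers)]


if __name__ == "__main__":
    x = 15  # receivers, chosen for N expected to vary between 10 and 20
    y = 4   # processing cores; an arbitrary placeholder, size it to your load
    print("cores to request:", total_cores(x, y))
    for n in (10, 20):
        print("N =", n, "->", shards_per_receiver(n, x))
```

Under that assumption, at N = 20 five of the 15 receivers would handle two shards each and the rest one each, while at N = 10 five receivers would sit idle. Y still has to be sized separately so that processing keeps up with the incoming data.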


> Support for dynamic allocation of executors in Kinesis Spark Streaming
> ----------------------------------------------------------------------
>
>                 Key: SPARK-7661
>                 URL: https://issues.apache.org/jira/browse/SPARK-7661
>             Project: Spark
>          Issue Type: New Feature
>          Components: Streaming
>    Affects Versions: 1.3.1
>         Environment: AWS-EMR
>            Reporter: Murtaza Kanchwala
>
> Currently the number of cores is (N + 1), where N is the number of shards in a Kinesis stream.
> My requirement is that if I use this resharding utility for Amazon Kinesis:
> Amazon Kinesis Resharding: https://github.com/awslabs/amazon-kinesis-scaling-utils
> then there should be some way to allocate executors directly on the basis of the number of shards (for Spark Streaming only).

