Date: Sat, 16 May 2015 07:24:59 +0000 (UTC)
From: "Tathagata Das (JIRA)"
To: issues@spark.apache.org
Subject: [jira] [Commented] (SPARK-7661) Support for dynamic allocation of executors in Kinesis Spark Streaming

    [ https://issues.apache.org/jira/browse/SPARK-7661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14546617#comment-14546617 ]

Tathagata Das commented on SPARK-7661:
--------------------------------------

N + 1 is used in the example, but it isn't really the recommended way. Here is how it works. You have to provide X + Y cores, where X = number of Kinesis streams/receivers and Y = number of cores for processing the data. The X receivers collaborate with each other to receive data from the N shards. If you expect N to vary from 10 to 20, then having X = 15 isn't a bad idea. At N = 20, the 15 receivers will distribute the work among themselves.
Y should be chosen so that your system can process the data as fast as it is received.

> Support for dynamic allocation of executors in Kinesis Spark Streaming
> ----------------------------------------------------------------------
>
>                 Key: SPARK-7661
>                 URL: https://issues.apache.org/jira/browse/SPARK-7661
>             Project: Spark
>          Issue Type: New Feature
>          Components: Streaming
>    Affects Versions: 1.3.1
>         Environment: AWS-EMR
>            Reporter: Murtaza Kanchwala
>
> Currently the number of cores is (N + 1), where N is the number of shards in a Kinesis stream.
> My requirement is that if I use this resharding utility for Amazon Kinesis:
> Amazon Kinesis Resharding: https://github.com/awslabs/amazon-kinesis-scaling-utils
> then there should be some way to allocate executors directly on the basis of the number of shards (for Spark Streaming only).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
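
The X + Y sizing rule from the comment can be sketched as plain arithmetic. This is only an illustration: the function names `total_cores` and `shards_per_receiver` are hypothetical, the value Y = 4 is made up for the example, and the even split of shards is an assumption, since the actual shard-to-receiver assignment is handled by the Kinesis client library and need not be perfectly even.

```python
from typing import List


def total_cores(num_receivers: int, processing_cores: int) -> int:
    """Cores to request: X (one per receiver) plus Y (for processing)."""
    if num_receivers < 1 or processing_cores < 1:
        raise ValueError("need at least one receiver and one processing core")
    return num_receivers + processing_cores


def shards_per_receiver(num_shards: int, num_receivers: int) -> List[int]:
    """Illustrative even split of N shards across X receivers.

    Assumption: the real assignment is done by the Kinesis client library;
    this only shows how the load could spread when N != X.
    """
    if num_shards < 0 or num_receivers < 1:
        raise ValueError("invalid shard or receiver count")
    base, extra = divmod(num_shards, num_receivers)
    # The first `extra` receivers take one additional shard each.
    return [base + 1 if i < extra else base for i in range(num_receivers)]


if __name__ == "__main__":
    x, y = 15, 4  # X = 15 as in the comment; Y = 4 is a made-up value
    print("cores to request:", total_cores(x, y))
    print("N = 20:", shards_per_receiver(20, x))
    print("N = 10:", shards_per_receiver(10, x))
```

With X = 15, at N = 20 some receivers read two shards, and at N = 10 some receivers sit idle while still occupying a core, which is the trade-off behind picking X somewhere in the middle of the expected shard range.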