Date: Thu, 22 Nov 2012 09:08:58 +0000 (UTC)
From: Piotr Kołaczkowski (JIRA)
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Message-ID: <209583365.16307.1353575338401.JavaMail.jiratomcat@arcas>
Subject: [jira] [Created] (CASSANDRA-4983) Improve range wrap-around in CFIF: CFIF shouldn't produce input splits of very tiny size

Piotr Kołaczkowski created CASSANDRA-4983:
---------------------------------------------

             Summary: Improve range wrap-around in CFIF: CFIF shouldn't produce input splits of very tiny size
                 Key: CASSANDRA-4983
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4983
             Project: Cassandra
          Issue Type: Improvement
    Affects Versions: 1.1.6
            Reporter: Piotr Kołaczkowski
            Assignee: Piotr Kołaczkowski
            Priority: Minor

Currently CFIF splits the wrap-around split into two non-wrap-around splits.
While this simplifies the CFRR implementation, the approach has several minor downsides:

* One of the splits can be extremely small. One of our (picky) customers suspected a bug because one of his map tasks executed in 1 second, while all the rest executed in minutes. A very small task also wastes resources: more resources go to launching the task than to doing any real work.
* The number of map tasks is always one more than (expected rows / cassandra.input.split.size), so it is always >= 2. This confuses customers.
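
To make the two downsides concrete, here is a toy sketch of cutting a wrap-around range at the ring boundary. It is not the actual CFIF code: the 1000-token ring, the half-open [start, end) arcs, and the helper names cut_wrap_around and make_splits are all assumptions made for illustration only.

```python
# Toy model only; not Cassandra's ColumnFamilyInputFormat implementation.
# Assumption: tokens live on a ring of RING_SIZE positions, and a split is
# a half-open arc [start, end) that wraps around when start >= end.
RING_SIZE = 1000  # hypothetical tiny token space, chosen for readability


def cut_wrap_around(start, end):
    """Return the arc as one or two non-wrapping pieces."""
    if start < end:
        return [(start, end)]
    # The arc crosses the ring boundary: cut it there.
    pieces = [(start, RING_SIZE)]
    if end > 0:  # skip an empty second piece
        pieces.append((0, end))
    return pieces


def make_splits(first_token, n_splits):
    """Divide the whole ring into n_splits equal arcs beginning at
    first_token, then cut any arc that wraps around.

    Assumes n_splits divides RING_SIZE evenly.
    """
    step = RING_SIZE // n_splits
    pieces = []
    for i in range(n_splits):
        start = (first_token + i * step) % RING_SIZE
        end = (start + step) % RING_SIZE
        pieces.extend(cut_wrap_around(start, end))
    return pieces


if __name__ == "__main__":
    for start, end in make_splits(first_token=997, n_splits=2):
        print(f"[{start}, {end}) width={end - start}")
    # [997, 1000) width=3
    # [0, 497) width=497
    # [497, 997) width=500
```

In this toy model, two requested splits become three pieces, one of them only 3 tokens wide, and a single requested split (n_splits=1) still yields two pieces. That mirrors the "always one more" and "always >= 2" behaviour described above, under the stated assumptions.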