Date: Thu, 9 Oct 2014 14:15:34 +0000 (UTC)
From: "Darla Baker (JIRA)"
To: commits@cassandra.apache.org
Reply-To: dev@cassandra.apache.org
Subject: [jira] [Commented] (CASSANDRA-7280) Hadoop support not respecting cassandra.input.split.size

    [ https://issues.apache.org/jira/browse/CASSANDRA-7280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14165161#comment-14165161 ]

Darla Baker commented on CASSANDRA-7280:
----------------------------------------

Any chance this will be assigned soon?

> Hadoop support not respecting cassandra.input.split.size
> --------------------------------------------------------
>
>                 Key: CASSANDRA-7280
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7280
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Jeremy Hanna
>
> Long ago (0.7), I tried to set the cassandra.input.split.size property and never really got it to respect that property. However, the batch size was useful for what I needed, which was to affect the timeouts.
> Now, with the CQL record reader and native paging, users can specify queries, potentially using ALLOW FILTERING clauses. The input split size is more important here because the server may have to scan through a great many records to find the matching ones. If the user can effectively set the input split size, that gives a hard limit on how many records it will traverse.
> Currently the property appears to be overridden, perhaps in the client.describe_splits_ex method on the server side.
> It can be argued that users shouldn't be using ALLOW FILTERING clauses in their CQL in the first place. However, it is still a bug that the input split size is not honored.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)