Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Sun, 12 Oct 2014 21:48:34 +0000 (UTC)
From: "mck (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: <JIRA.12670357.1380056792000.250418.1413150514221@Atlassian.JIRA>
In-Reply-To: <JIRA.12670357.1380056792000@Atlassian.JIRA>
References: <JIRA.12670357.1380056792000@Atlassian.JIRA>
 <JIRA.12670357.1380056792768@arcas>
Subject: [jira] [Comment Edited] (CASSANDRA-6091) Better Vnode support in
 hadoop/pig
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-6091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14168561#comment-14168561 ] 

mck edited comment on CASSANDRA-6091 at 10/12/14 9:48 PM:
----------------------------------------------------------

I guess this^ approach falls apart as the number of nodes in a cluster increases (the chance of adjacent splits sharing the same dataNodes drops quickly), and it comes back to needing splits with multiple token ranges and CRR supporting that. 
But I still don't get why you *need* any thrift/CQL server-side change (at least to begin with)? 

For example, see this [patch|https://github.com/michaelsembwever/cassandra/pull/2/files] 
    (note: I'm intentionally waiting on feedback before tackling CFIF+CFRR).
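
To make the adjacent-split idea concrete, here is a rough sketch of merging neighbouring splits that share the same dataNodes. It is not taken from the patch; the `Split` class, its fields and `merge_adjacent_splits` are hypothetical names used only for illustration, and token ranges are simplified to integer (start, end] pairs.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Split:
    """Hypothetical stand-in for an input split covering one token range."""
    start: int                  # exclusive start token
    end: int                    # inclusive end token
    data_nodes: FrozenSet[str]  # replicas holding this range


def merge_adjacent_splits(splits: List[Split]) -> List[Split]:
    """Merge splits that are contiguous in ring order and have identical replicas.

    Each merged split still covers a single token range, so a record reader
    that understands only one range per split would keep working.
    """
    merged: List[Split] = []
    for split in splits:
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev.end == split.start
                and prev.data_nodes == split.data_nodes):
            # Extend the previous split instead of emitting a new one.
            merged[-1] = Split(prev.start, split.end, prev.data_nodes)
        else:
            merged.append(split)
    return merged


# Assumed example ring: only the first two splits share replicas AND are adjacent.
ring = [
    Split(0, 10, frozenset({"a", "b"})),
    Split(10, 20, frozenset({"a", "b"})),
    Split(20, 30, frozenset({"b", "c"})),
    Split(30, 40, frozenset({"a", "b"})),
]
combined = merge_adjacent_splits(ring)
```

In this example the first and last splits live on the same dataNodes but are not adjacent, so they stay separate. With more nodes that case becomes the norm, which is why splits carrying multiple token ranges end up being needed.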


was (Author: michaelsembwever):
I guess this^ approach falls apart as the number of nodes in a cluster increases (the chance of adjacent splits sharing the same dataNodes drops quickly), and it comes back to splits with multiple token ranges and CRR supporting that. 
But I still don't get why you *need* any thrift/CQL server-side change (at least to begin with)? 

> Better Vnode support in hadoop/pig
> ----------------------------------
>
>                 Key: CASSANDRA-6091
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-6091
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>            Reporter: Alex Liu
>            Assignee: Alex Liu
>
> CASSANDRA-6084 shows there are some issues when running hadoop/pig jobs if vnodes are enabled. Also, hadoop performance on vnode-enabled nodes is bad because there are so many splits.
> The idea is to combine vnode splits into big pseudo-splits so that it works as if vnodes were disabled for the hadoop/pig job.

