Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@cassandra.apache.org
Date: Tue, 28 Jun 2011 14:15:17 +0000 (UTC)
From: "Mck SembWever (JIRA)" <jira@apache.org>
To: commits@cassandra.apache.org
Message-ID: 
 <895135465.1792.1309270517825.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: 
 <1440851143.12223.1301085665835.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Issue Comment Edited] (CASSANDRA-2388)
 ColumnFamilyRecordReader fails for a given split because a host is down,
 even if records could reasonably be read from other replica.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit


    [ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056524#comment-13056524 ] 

Mck SembWever edited comment on CASSANDRA-2388 at 6/28/11 2:14 PM:
-------------------------------------------------------------------

bq. It looks like there's a ton of effort put in to avoiding making sortByProximity work w/ non-local nodes
Because it's only when that local node is down that we actually need to sort...
When/if DynamicEndpointSnitch's limitation is fixed (so that it can sort by non-local nodes), CassandraServer.java need not bypass it. But this won't simplify the code in CFRR. Now that CFIF supports multiple initialAddresses, the method CFRR.sortEndpointsByProximity(..) can be rewritten (i.e. a connection to any initialAddress is all we need; no need to mess around with trying to connect through replicas to find information about replicas...)
bq. Wait, why do we even care? "local node" IS the right host to sort against
No. "initialAddress" is the right node to sort against. And it should be "local node". And then we don't care about the replica.
But when "initialAddress" is down, we randomly connect to another c* node to find out which of the replicas we know about are 1) up, 2) closest, and 3) in the same dc. That random c* node then becomes the "local node", so the call needs to be {{snitch.sortByProximity(initialAddress, addresses)}}.
But yes... the CFRR code is contorted. In many ways I prefer the simplicity of the first patch (both in api and in implementation) despite it not being "as correct". I thought of this "fallback to replica" as a last resort to keep the m/r job running, rather than an actively used feature where DynamicEndpointSnitch's scores will maximise performance. But then I'm only thinking in terms of a small c* cluster, and I am certainly naive about what performance gains these scores can give...
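
To make the intended ordering concrete, here is a rough sketch of the fallback described above. It is not the code from any of the attached patches: the class name, the method {{orderReplicas}}, and the {{isUp}} / {{distanceFromInitial}} callbacks are hypothetical stand-ins for the liveness check and for whatever the snitch would report relative to initialAddress.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.ToIntFunction;

// Hypothetical sketch only; none of these names exist in Cassandra.
class ProximityFallbackSketch {
    /**
     * Returns the hosts a reader could try, in order.
     * If initialAddress is up, it is used directly and nothing is sorted.
     * Otherwise the live replicas are sorted by their distance from
     * initialAddress (the distance function is supplied by the caller).
     */
    public static List<String> orderReplicas(String initialAddress,
                                             List<String> replicas,
                                             Predicate<String> isUp,
                                             ToIntFunction<String> distanceFromInitial) {
        if (isUp.test(initialAddress)) {
            return Collections.singletonList(initialAddress);
        }
        List<String> live = new ArrayList<>();
        for (String replica : replicas) {
            if (isUp.test(replica)) {
                live.add(replica); // drop replicas that are down
            }
        }
        // Closest replica (relative to initialAddress) first.
        live.sort(Comparator.comparingInt(distanceFromInitial));
        return live;
    }
}
```

The "same dc" preference is assumed to be folded into the distance function here; a real snitch would decide that.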

> ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.7.6, 0.8.0
>            Reporter: Eldon Stegall
>            Assignee: Jeremy Hanna
>              Labels: hadoop, inputformat
>             Fix For: 0.7.7, 0.8.2
>
>         Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split.
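
As an illustration of "try multiple locations for a given split", a minimal sketch might look like the following. It is an assumption-laden example, not the attached patch: the class, the method {{readFromFirstAvailable}}, and the {{connectAndRead}} callback are hypothetical, and RuntimeException stands in for whatever connection error the real reader would catch.

```java
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch only; not the ColumnFamilyRecordReader implementation.
class SplitLocationRetrySketch {
    /**
     * Tries each location of a split in order and returns the result of the
     * first one that can be read. Fails only if every location fails.
     */
    public static <T> T readFromFirstAvailable(List<String> locations,
                                               Function<String, T> connectAndRead) {
        RuntimeException last = null;
        for (String location : locations) {
            try {
                return connectAndRead.apply(location);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next replica
            }
        }
        throw new IllegalStateException("no reachable location in " + locations, last);
    }
}
```

With locations {{[down1, up1]}} and a callback that throws for {{down1}}, the call returns whatever the callback produces for {{up1}} instead of failing the whole split.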
