Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 4618B6C3A for ; Tue, 28 Jun 2011 14:15:42 +0000 (UTC)
Received: (qmail 71404 invoked by uid 500); 28 Jun 2011 14:15:42 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 71278 invoked by uid 500); 28 Jun 2011 14:15:41 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 71149 invoked by uid 99); 28 Jun 2011 14:15:41 -0000
Received: from nike.apache.org (HELO nike.apache.org) (192.87.106.230) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 28 Jun 2011 14:15:41 +0000
X-ASF-Spam-Status: No, hits=-1996.4 required=5.0 tests=ALL_TRUSTED,FS_REPLICA,T_RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 28 Jun 2011 14:15:38 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116]) by hel.zones.apache.org (Postfix) with ESMTP id CA45D4366DE for ; Tue, 28 Jun 2011 14:15:17 +0000 (UTC)
Date: Tue, 28 Jun 2011 14:15:17 +0000 (UTC)
From: "Mck SembWever (JIRA)"
To: commits@cassandra.apache.org
Message-ID: <895135465.1792.1309270517825.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1440851143.12223.1301085665835.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Issue Comment Edited] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394
X-Virus-Checked: Checked by ClamAV on apache.org

[ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056524#comment-13056524 ]

Mck SembWever edited comment on CASSANDRA-2388 at 6/28/11 2:14 PM:
-------------------------------------------------------------------

bq. It looks like there's a ton of effort put in to avoiding making sortByProximity work w/ non-local nodes

Because it's only when that local node is down that we actually need to sort...
When/if DynamicEndpointSnitch's limitation is fixed (and it can sort by non-local nodes) then CassandraServer.java need not bypass it. But this won't simplify the code in CFRR. Now that CFIF supports multiple initialAddresses, the method CFRR.sortEndpointsByProximity(..) can be rewritten (i.e. any connection to any initialAddress is all we need, no need to mess around with trying to connect through replicas to find information about replicas...)

bq. Wait, why do we even care? "local node" IS the right host to sort against

No. "initialAddress" is the right node to sort against. And it should be "local node". And then we don't care about the replicas. But when "initialAddress" is down, we randomly connect to another c* node to find out which of the replicas we know about are 1) up, 2) closest, and 3) in the same dc. Then it is a random c* node that becomes the "local node" and the call needs to be {{snitch.sortByProximity(initialAddress, addresses)}}.

But yes... the CFRR code is contorted. In many ways I prefer the simplicity of the first patch (both in api and in implementation) despite it not being "as correct".
I thought of this "fallback to replica" as a last resort to keep the m/r job running, rather than an actively used feature where DynamicEndpointSnitch's scores will maximise performance. But then I'm only thinking in terms of a small c* cluster and I am certainly naive about what performance gains these scores can give...

was (Author: michaelsembwever):

bq. It looks like there's a ton of effort put in to avoiding making sortByProximity work w/ non-local nodes

Because it's only when that local node is down that we actually need to sort...
When/if DynamicEndpointSnitch's limitation is fixed (and it can sort by non-local nodes) then CassandraServer.java need not bypass it. But this won't simplify the code in CFRR. Now that CFIF supports multiple initialAddresses, the method sortEndpointsByProximity(..) in CFIF can be rewritten (i.e. any connection to any initialAddress is all we need, no need to mess around with trying to connect through replicas to find information about replicas...)

bq. Wait, why do we even care? "local node" IS the right host to sort against

No. "initialAddress" is the right node to sort against. And it should be "local node". And then we don't care about the replicas. But when "initialAddress" is down, we randomly connect to another c* node to find out which of the replicas we know about are 1) up, 2) closest, and 3) in the same dc. Then it is a random c* node that becomes the "local node" and the call needs to be {{snitch.sortByProximity(initialAddress, addresses)}}.

But yes... the CFRR code is contorted. In many ways I prefer the simplicity of the first patch (both in api and in implementation) despite it not being "as correct".
I thought of this "fallback to replica" as a last resort to keep the m/r job running, rather than an actively used feature where DynamicEndpointSnitch's scores will maximise performance.
But then I'm only thinking in terms of a small c* cluster and I am certainly naive about what performance gains these scores can give...

> ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CASSANDRA-2388
> URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
> Project: Cassandra
> Issue Type: Bug
> Components: Hadoop
> Affects Versions: 0.7.6, 0.8.0
> Reporter: Eldon Stegall
> Assignee: Jeremy Hanna
> Labels: hadoop, inputformat
> Fix For: 0.7.7, 0.8.2
>
> Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We should try multiple locations for a given split.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
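
The behaviour asked for in the issue description (try more than one location for a split, preferring the local node) can be sketched roughly as below. This is only an illustration under stated assumptions, not the attached patch: `SplitLocationFallback`, `Connector`, `preferLocal` and `connectToFirstAvailable` are hypothetical names, hosts are plain strings rather than real client connections, and the incoming list of locations is assumed to be already ordered by proximity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of Cassandra's Hadoop classes.
final class SplitLocationFallback {

    // Hypothetical stand-in for opening a client connection to one host.
    interface Connector<C> {
        C connect(String host) throws Exception;
    }

    // Move the local host to the front if it holds a replica; the remaining
    // locations keep their given order (assumed to be proximity order).
    static List<String> preferLocal(String localHost, List<String> locations) {
        List<String> ordered = new ArrayList<>(locations);
        if (ordered.remove(localHost)) {
            ordered.add(0, localHost);
        }
        return ordered;
    }

    // Try each location in turn and return the first connection that succeeds,
    // instead of failing as soon as the first host turns out to be down.
    static <C> C connectToFirstAvailable(List<String> locations, Connector<C> connector) {
        List<String> failures = new ArrayList<>();
        for (String host : locations) {
            try {
                return connector.connect(host);
            } catch (Exception e) {
                // Remember why this host failed, then fall through to the next one.
                failures.add(host + ": " + e.getMessage());
            }
        }
        throw new IllegalStateException("no replica reachable for split, tried " + failures);
    }
}
```

A caller would order the split's locations with `preferLocal` and pass the result to `connectToFirstAvailable`; the split fails only when every listed replica is unreachable. How the non-local replicas get sorted when the local node is down is the open question in the comment above, and this sketch does not settle it.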