Return-Path:
X-Original-To: apmail-cassandra-commits-archive@www.apache.org
Delivered-To: apmail-cassandra-commits-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 23B39DD16 for ; Thu, 14 Mar 2013 16:56:16 +0000 (UTC)
Received: (qmail 85381 invoked by uid 500); 14 Mar 2013 16:56:13 -0000
Delivered-To: apmail-cassandra-commits-archive@cassandra.apache.org
Received: (qmail 85353 invoked by uid 500); 14 Mar 2013 16:56:13 -0000
Mailing-List: contact commits-help@cassandra.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: dev@cassandra.apache.org
Delivered-To: mailing list commits@cassandra.apache.org
Received: (qmail 85169 invoked by uid 99); 14 Mar 2013 16:56:13 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 14 Mar 2013 16:56:13 +0000
Date: Thu, 14 Mar 2013 16:56:13 +0000 (UTC)
From: "Jonathan Ellis (JIRA)"
To: commits@cassandra.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Updated] (CASSANDRA-4886) Remote ColumnFamilyInputFormat
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

     [ https://issues.apache.org/jira/browse/CASSANDRA-4886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-4886:
--------------------------------------

    Affects Version/s:     (was: 1.1.6)
        Fix Version/s:     (was: 1.1.6)
                       2.0

> Remote ColumnFamilyInputFormat
> ------------------------------
>
>                 Key: CASSANDRA-4886
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4886
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>            Reporter: Scott Fines
>             Fix For: 2.0
>
>         Attachments: CASSANDRA-4886.patch
>
>
> As written, the ColumnFamilyInputFormat does not have a great deal of fault tolerance.
> It only attempts to perform a read from a single replica, with an infinite timeout. If that replica is not available, then the Task fails and must be retried on a different node.
> This is fine if the TaskTrackers are colocated with Cassandra nodes, but is very fragile when this is not possible. When the TaskTrackers are remote from Cassandra, the same rules that apply to clients should apply: there should be a strict (configurable) timeout, and the ability to retry requests on a different replica if a single request fails.
> It seems obvious that we'd want to support both types of architecture; to do that, we should probably have a configuration option that allows the user to specify his architecture choice explicitly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
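
For illustration only, the behaviour requested in the description (a strict, configurable timeout plus a retry on a different replica when a single request fails) might be sketched roughly as below. This is not the attached CASSANDRA-4886.patch and not an existing Cassandra or Hadoop API: the class ReplicaRetryReader, the timeoutMillis parameter, and the readFrom function are all hypothetical names, and a real input format would presumably read the timeout from the job configuration.

```java
import java.io.IOException;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Hypothetical helper, assumed for illustration; not part of Cassandra. */
final class ReplicaRetryReader {
    // Assumed to come from a user-supplied configuration value.
    private final long timeoutMillis;

    ReplicaRetryReader(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Tries each replica in order. A replica that throws or exceeds the
     * timeout is abandoned and the next one is tried. Fails only when
     * every replica has failed.
     */
    <T> T read(List<String> replicas, Function<String, T> readFrom)
            throws IOException, InterruptedException {
        ExecutorService executor = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (String replica : replicas) {
                Callable<T> task = () -> readFrom.apply(replica);
                Future<T> future = executor.submit(task);
                try {
                    // Strict timeout instead of waiting forever.
                    return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    future.cancel(true); // give up on this replica
                    last = e;
                }
            }
            throw new IOException("read failed on all replicas", last);
        } finally {
            executor.shutdownNow();
        }
    }
}
```

A colocated deployment could pass a single-element replica list, which reduces to one attempt per task, while a remote deployment would pass every replica for the split.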