cassandra-commits mailing list archives

From "Scott Fines (JIRA)" <>
Subject [jira] [Updated] (CASSANDRA-4886) Remote ColumnFamilyInputFormat
Date Wed, 31 Oct 2012 18:32:12 GMT


Scott Fines updated CASSANDRA-4886:

    Attachment:     (was: CASSANDRA-4886.path)
> Remote ColumnFamilyInputFormat
> ------------------------------
>                 Key: CASSANDRA-4886
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>    Affects Versions: 1.1.6
>            Reporter: Scott Fines
>             Fix For: 1.1.6
>         Attachments: CASSANDRA-4886.patch
> As written, the ColumnFamilyInputFormat does not have a great deal of fault tolerance.
> It only attempts to perform a read from a single replica, with an infinite timeout. If
> that replica is not available, then the Task fails, and must be retried on a different node.
> This is fine if the TaskTrackers are colocated with Cassandra nodes, but is very fragile
> when this is not possible. When the TaskTrackers are remote to Cassandra, the same rules about
> clients should apply--there should be a strict (configurable) timeout, and the ability to
> retry requests on a different replica if a single request fails.
> It seems obvious that we'd want to support both types of architecture; to do that, we
> should probably have a configuration which allows the user to specify his architecture choices.
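
For illustration only, here is a minimal Java sketch of the behavior requested above: each read gets a strict timeout, and a failed or timed-out read is retried on the next replica. It is not the attached patch and does not use any Cassandra or Hadoop API. The class name ReplicaFailover, the method readWithFailover, and its parameters are hypothetical names chosen for this sketch.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

// Hypothetical helper; not part of Cassandra or of the attached patch.
public class ReplicaFailover {

    /**
     * Tries each replica in order, allowing each attempt at most timeoutMillis.
     * Returns the first successful result, or throws IllegalStateException
     * if every replica fails or times out.
     */
    public static <T> T readWithFailover(List<String> replicas,
                                         Function<String, T> readFrom,
                                         long timeoutMillis) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            Exception last = null;
            for (String replica : replicas) {
                Callable<T> task = () -> readFrom.apply(replica);
                Future<T> attempt = pool.submit(task);
                try {
                    // Strict per-request timeout instead of waiting forever.
                    return attempt.get(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException | ExecutionException e) {
                    // Abandon this replica and fall through to the next one.
                    attempt.cancel(true);
                    last = e;
                } catch (InterruptedException e) {
                    // Preserve the interrupt flag and stop trying replicas.
                    attempt.cancel(true);
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while reading", e);
                }
            }
            throw new IllegalStateException("all replicas failed", last);
        } finally {
            // Interrupt any read that is still running after a timeout.
            pool.shutdownNow();
        }
    }
}
```

In a colocated deployment the replica list would hold only the local node, which reproduces the current fail-the-task behavior. A remote deployment would pass every replica for the split, together with a timeout taken from the job configuration.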

