cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Mck SembWever (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-2388) ColumnFamilyRecordReader fails for a given split because a host is down, even if records could reasonably be read from other replica.
Date Sat, 18 May 2013 21:51:17 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-2388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13661436#comment-13661436
] 

Mck SembWever commented on CASSANDRA-2388:
------------------------------------------

Jonathan,
 I can't say i'm in favour of enforcing data locality.
Because  data locality in hadoop doesn't work this way… when a tasktracker through the next
heartbeat announces that it has a task slot free the jobtracker will do its best to assign
a task with data locality to it but failing this will assign it a random task. the number
of these random tasks can be quite high, just like i mentioned above
{quote} Tasks are still being evenly distributed around the ring regardless of what the ColumnFamilySplit.locations
is. {quote}

This can be almost solved by upgrading to hadoop-0.21+, using the fair scheduler and setting
the property {code}<property>
        <name>mapred.fairscheduler.locality.delay</name>
        <value>360000000</value>
<property>{code}.

At the end of the day while hadoop encourages data locality it does not enforce it.
The ideal approach would be to sort all locations by proximity.
The feasible approach hopefully is still [~tjake]'s above. In addition i'd be in favour of
a setting in the job's configuration as to whether a location from another datacenter can
be used.

references:
 - http://www.infoq.com/articles/HadoopInputFormat
 - http://www.mentby.com/matei-zaharia/running-only-node-local-jobs.html
 - https://groups.google.com/a/cloudera.org/forum/?fromgroups#!topic/cdh-user/3ggnE5hV0PY
 - http://www.cs.berkeley.edu/~matei/papers/2010/eurosys_delay_scheduling.pdf
                
> ColumnFamilyRecordReader fails for a given split because a host is down, even if records
could reasonably be read from other replica.
> -------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-2388
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2388
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 0.6
>            Reporter: Eldon Stegall
>            Assignee: Mck SembWever
>            Priority: Minor
>              Labels: hadoop, inputformat
>             Fix For: 1.2.5
>
>         Attachments: 0002_On_TException_try_next_split.patch, CASSANDRA-2388-addition1.patch,
CASSANDRA-2388-extended.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch, CASSANDRA-2388.patch,
CASSANDRA-2388.patch
>
>
> ColumnFamilyRecordReader only tries the first location for a given split. We should try
multiple locations for a given split.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message