hadoop-hdfs-issues mailing list archives

From "James Clampffer (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9103) Retry reads on DN failure
Date Wed, 04 Nov 2015 20:49:27 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14990381#comment-14990381 ]

James Clampffer commented on HDFS-9103:

"Every node in the set requires a separate heap allocation. They might scatter around the
address space. vector is on the heap as well but it guarantees continuous amount of memory.
Some implementation of string has inlined buffer which has much better cache locality results."
I agree with both of those statements.  Do datanode IDs have a max length, or are they always
just a UUID?  SSO is usually limited to 23 chars or fewer on 64-bit machines, depending on the
standard library implementation.  Intel processors have 8-way associative caches for now, so I'm
not terribly worried about address space fragmentation.  The processor has to try a little
harder because it's no longer a simple linear prefetch to scoop up the vector, but superscalar
pipelines have multiple load units :)

I think I might try out a more architectural fix to side-step this whole problem (which is why
I haven't put up a patch yet).  How about passing a function "IsDead(const std::string& dn)"
through the InputStream down to the block reader?  My current approach of generating a new set
or vector of bad nodes on every call is terribly inefficient.  Even if SSO kicked in and it
boiled down to a memcpy, there's still a smallish heap allocation for every GetNodesToExclude
call.  Passing down a function avoids keeping the redundant copies in cache to begin with.
I'd change BadDataNodeTracker::bad_datanodes_ to a map (this is what it always should have
been; not sure why I thought a set of pairs keyed by pair::first was a good idea...).  The
IsDead function would grab the update lock, which is usually implemented as a CAS in userspace,
and do an O(log(n)) map lookup.  In my experience, the log2(smallish number) indirections of a
std::map lookup shouldn't come close to bottlenecking anything.  Do you see any obvious
issues with this approach?
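
To make the proposal concrete, here is a rough sketch of what I have in mind, not the actual patch.  Only BadDataNodeTracker, bad_datanodes_ and IsDead come from the discussion above; AddBadNode, update_lock_, IsDeadFn, PickDataNode and the timestamp stored as the map value are placeholder names and assumptions made up for illustration.

```cpp
#include <chrono>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <vector>

// Sketch of the tracker with bad_datanodes_ stored as a map.
// Assumption: the mapped value is the time the node was marked bad.
class BadDataNodeTracker {
 public:
  using Clock = std::chrono::steady_clock;

  // Hypothetical mutator: record (or refresh) a failed datanode.
  void AddBadNode(const std::string& dn) {
    std::lock_guard<std::mutex> guard(update_lock_);
    bad_datanodes_[dn] = Clock::now();
  }

  // Takes the update lock and does one O(log(n)) lookup.
  // No set or vector of excluded nodes is built or copied.
  bool IsDead(const std::string& dn) const {
    std::lock_guard<std::mutex> guard(update_lock_);
    return bad_datanodes_.find(dn) != bad_datanodes_.end();
  }

 private:
  mutable std::mutex update_lock_;
  std::map<std::string, Clock::time_point> bad_datanodes_;
};

// The callable that would be handed through the InputStream
// down to the block reader (type alias is an assumption).
using IsDeadFn = std::function<bool(const std::string&)>;

// Hypothetical stand-in for replica selection in the reader:
// returns the first replica not reported dead, or an empty
// string if every replica is excluded.
std::string PickDataNode(const std::vector<std::string>& replicas,
                         const IsDeadFn& is_dead) {
  for (const auto& dn : replicas) {
    if (!is_dead(dn)) {
      return dn;
    }
  }
  return std::string();
}
```

The caller would bind the tracker once, for example with a lambda such as [&tracker](const std::string& dn) { return tracker.IsDead(dn); }, and pass that down.  The lambda captures the tracker by reference, so the tracker has to outlive any read that still holds the callable.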

"Checking with the code kResourceUnavailable is only for the NN cannot find any DNs to serve
this data. I don't think we'll need to handle this case when excluding the DNs."
Thanks for the info.  I was hoping this was the case but wasn't sure if I was missing something
that would be added soon.

> Retry reads on DN failure
> -------------------------
>                 Key: HDFS-9103
>                 URL: https://issues.apache.org/jira/browse/HDFS-9103
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: Bob Hansen
>            Assignee: James Clampffer
>             Fix For: HDFS-8707
>         Attachments: HDFS-9103.1.patch, HDFS-9103.2.patch, HDFS-9103.HDFS-8707.006.patch, HDFS-9103.HDFS-8707.3.patch, HDFS-9103.HDFS-8707.4.patch, HDFS-9103.HDFS-8707.5.patch
> When AsyncPreadSome fails, add the failed DataNode to the excluded list and try again.

This message was sent by Atlassian JIRA
