hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5746) add ShortCircuitSharedMemorySegment
Date Tue, 28 Jan 2014 23:31:20 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884911#comment-13884911
] 

Colin Patrick McCabe commented on HDFS-5746:
--------------------------------------------

bq. Can we verify the fix for racing sendCallback and toRemove? I think we need to check that
the fd being removed is in entries before doing sendCallback. firstEntry also doesn't remove
the entry from toRemove, so it looks like this inf loops. pollFirstEntry instead?

It doesn't infinitely loop, because sendCallback always removes the fd from toRemove.

I can't think of any practical way to test the scenario you outlined, with an event happening
on {{sendCallback}} racing with the same fd added to {{toRemove}} .  Maybe a stress test would
hit it.

bq. Maybe remove() should also return a boolean "success" value too, rather than just swallowing
an unknown socket.

It's not needed because if we try to remove something that doesn't exist, we hit a precondition
check.

bq. Should doc that we only support one Handler per fd, it overwrites on add.

added this comment

bq. Can add a Precondition check to make sure the lock is held in checkNotClosed

added

bq. Flag constants would be more readable as "1<<63" and "1<<62" rather than 15
zeroes (I did verify though )

ok

bq. Comment in Slot constructor talks about incrementing a refcount, but that's no longer
happening there.  No need to throw IOException in Slot constructor

fixed

bq. Terminology: it seems like the "anchorable" flag means "is mlocked by DN and can increment
the refcount" and "anchor" is a refcount for "using mlocked data"

I like the current terminology.  "lockable" just sounds vague-- especially because we already
have an operation which is (m)locking the block on the datanode, so it gets confusing to use
the same term for what the client is doing.

bq. How do we communicate the slot index between the DN and client? I see we keep the slot
address, but what we need to pass to the client is an index. Maybe this is coming.

the DN will have to pass the slot index as part of the response to the REQUEST_SHORT_CIRCUIT_FDS
operation.  It will also pass the shared memory segment itself as part of that operation :)
 actually, it's a bit more complex than that... if there is an outstanding shm segment, the
DN will try to reuse it-- otherwise it will create a new one.  But since all the slots are
the same size and interchangeable, the allocation is not that complex.

> add ShortCircuitSharedMemorySegment
> -----------------------------------
>
>                 Key: HDFS-5746
>                 URL: https://issues.apache.org/jira/browse/HDFS-5746
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode, hdfs-client
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
>             Fix For: 3.0.0
>
>         Attachments: HDFS-5746.001.patch, HDFS-5746.002.patch, HDFS-5746.003.patch
>
>
> Add ShortCircuitSharedMemorySegment, which will be used to communicate information between
the datanode and the client about whether a replica is mlocked.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)

Mime
View raw message