hadoop-hdfs-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5182) BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid
Date Tue, 03 Dec 2013 21:32:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13838215#comment-13838215 ]

Todd Lipcon commented on HDFS-5182:

The other way (let's call this choice #2) is for the client to keep open the Domain socket
it used to request the two file descriptors. If we can listen for messages sent on this socket,
we can have a truly edge-triggered notification method. The messages can be as short as a
single byte, since we have very simple message needs. This requires adding an epoll loop to
handle these notifications without consuming a whole thread per socket.

Can you explain further what you mean by "edge-triggered notification" here? Do you mean that
the DN can detect when the client closes the block, because it'll get an EOF on the socket?
Or do you mean that, when the client performs a read, it'll write a byte to the socket indicating
"hey, I'm still here using this block"? Or that the DN will send a notification to the client
saying "hey, I'd like to revoke this block from the cache, please stop zero-copying it"?
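
To make the first and third readings concrete, here is a minimal sketch of a client-side loop that watches several sockets from one thread. It is not HDFS code: `socket.socketpair()` stands in for the domain socket to the DN, and the `REVOKE` byte, the block names, and the `poll_once` helper are all assumptions made up for illustration.

```python
import selectors
import socket

# Hypothetical one-byte message: the DN asks the client to stop zero-copying.
REVOKE = b"R"


def poll_once(selector, timeout=1.0):
    """Service every ready socket once; return a list of (block_id, event)."""
    events = []
    for key, _mask in selector.select(timeout):
        data = key.fileobj.recv(1)
        if data == b"":
            # EOF: the peer closed its end of the socket.
            events.append((key.data, "closed"))
            selector.unregister(key.fileobj)
            key.fileobj.close()
        elif data == REVOKE:
            events.append((key.data, "revoke"))
        else:
            events.append((key.data, "unknown"))
    return events


def demo():
    selector = selectors.DefaultSelector()
    # One socket pair per open block; a single selector watches all of them.
    dn_a, client_a = socket.socketpair()
    dn_b, client_b = socket.socketpair()
    selector.register(client_a, selectors.EVENT_READ, data="block-A")
    selector.register(client_b, selectors.EVENT_READ, data="block-B")

    dn_a.sendall(REVOKE)  # the "DN" revokes block A
    dn_b.close()          # the "DN" side of block B goes away

    seen = []
    for _ in range(5):  # bounded retries instead of looping forever
        seen.extend(poll_once(selector))
        if len(seen) >= 2:
            break

    selector.close()
    dn_a.close()
    client_a.close()
    return sorted(seen)
```

The same loop would work in the other direction (the DN watching for EOF from clients). Which side sends which byte is exactly the open question above.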

It's also worth considering that some clients (e.g. HBase) tend to open all of their blocks
at startup, and never close/reopen files except on errors. It would still be nice if our caching
mechanism could transition between zero-copy and one-copy if we want to migrate those files
in and out of cache while the client keeps them open. (e.g. we know that we're about to run
an intensive job on some HBase table, so we ask HDFS to cache its blocks for an hour or two
in the middle of the night, and then drop them back out of cache after the batch job is done.)
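
A rough sketch of what such a transition could look like from the client's side, assuming a per-block flag that some notification flips. This is not how BlockReaderLocal is implemented: the `BlockReader` class and its `zero_copy_allowed` flag are hypothetical, and a plain `mmap` stands in for the cached, mlocked region.

```python
import mmap
import os
import tempfile


class BlockReader:
    """Hypothetical reader that keeps one file open across cache transitions."""

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)
        # Assumed to be flipped when the DN grants or revokes zero-copy.
        self.zero_copy_allowed = False

    def read(self, offset, length):
        if self.zero_copy_allowed:
            # Zero-copy path: a view onto the mapped region, no bytes copied.
            return memoryview(self._map)[offset:offset + length]
        # One-copy path: an ordinary read into a fresh bytes object.
        self._file.seek(offset)
        return self._file.read(length)

    def close(self):
        self._map.close()
        self._file.close()


def demo():
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"hello block data")
        path = f.name
    reader = BlockReader(path)
    try:
        first = reader.read(0, 5)          # block not cached: one-copy
        reader.zero_copy_allowed = True    # block moved into cache
        view = reader.read(6, 5)
        is_view = isinstance(view, memoryview)
        second = bytes(view)
        view.release()                     # drop the view before revocation
        reader.zero_copy_allowed = False   # block dropped from cache
        third = reader.read(12, 4)
        return first, second, is_view, third
    finally:
        reader.close()
        os.unlink(path)
```

The file stays open the whole time; only the read path changes. The awkward part, which the sketch dodges with `view.release()`, is what to do about views the client is still holding when the revocation arrives.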

> BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid
> ---------------------------------------------------------------------------------
>                 Key: HDFS-5182
>                 URL: https://issues.apache.org/jira/browse/HDFS-5182
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Colin Patrick McCabe
>            Assignee: Colin Patrick McCabe
> BlockReaderLocal must allow zero-copy reads only when the DN believes it's valid.  This
> implies adding a new field to the response to REQUEST_SHORT_CIRCUIT_FDS.  We also need some
> kind of heartbeat from the client to the DN, so that the DN can inform the client when the
> mapped region is no longer locked into memory.

This message was sent by Atlassian JIRA
