hadoop-hdfs-issues mailing list archives

From "Bob Hansen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-9643) libhdfs++: Support async cancellation of read operations
Date Fri, 15 Jan 2016 18:28:39 GMT

    [ https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102252#comment-15102252 ]

Bob Hansen commented on HDFS-9643:

Thanks for jumping into that [~James Clampffer].  It looks like a very good start.

Rather than having a lock in the cancellation handle, we should be able to get away with a

bq. -Right now the cancel logic is added directly to each continuation in the remote block
reader. On one hand this is simple and works, on the other it's boilerplate code. Is this
worth pushing into the continuation pipeline code at the moment? I think it's worth keeping
it simple until NN operations become cancelable.

The pipeline class already has a concept of annihilating error handling: if !status.ok(),
it skips over the rest of the pipeline and delivers an error.  That would seem to be an
opportune place to put in a cancellation check.
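
A rough sketch of what that check could look like. This is not the libhdfs++ continuation pipeline: Status, Pipeline, Push and Run here are hypothetical stand-ins, the stages run synchronously rather than as async continuations, and the shared atomic flag is an assumed way of signalling cancellation. It only shows the cancel test living next to the existing !status.ok() short-circuit instead of in every stage.

```cpp
#include <atomic>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical status type, not the libhdfs++ Status class.
struct Status {
  enum Code { kOk, kError, kCanceled };
  Code code;
  std::string message;
  Status() : code(kOk) {}
  Status(Code c, std::string m) : code(c), message(std::move(m)) {}
  bool ok() const { return code == kOk; }
};

// Hypothetical, synchronous stand-in for a continuation pipeline.
class Pipeline {
 public:
  using Stage = std::function<Status()>;

  // The cancel flag is shared with whoever may request cancellation.
  explicit Pipeline(std::shared_ptr<std::atomic<bool>> canceled)
      : canceled_(std::move(canceled)) {}

  Pipeline& Push(Stage stage) {
    stages_.push_back(std::move(stage));
    return *this;
  }

  // Runs the stages in order. The loop stops at the first failing stage
  // (the existing error short-circuit) or as soon as the cancel flag is
  // seen before a stage starts (the added cancellation check).
  Status Run() {
    for (auto& stage : stages_) {
      if (canceled_->load()) {
        return Status(Status::kCanceled, "operation canceled");
      }
      Status s = stage();
      if (!s.ok()) {
        return s;
      }
    }
    return Status();
  }

 private:
  std::shared_ptr<std::atomic<bool>> canceled_;
  std::vector<Stage> stages_;
};
```

With this shape the individual stages in the block reader would not need their own cancel boilerplate; a stage that is already running still finishes before the flag is looked at again.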

-In this implementation FileHandle::CancelOperations is irreversible and prevents it from
being used again. Can anyone think of a reason not to have it also close the file or at least
clear vector<LocatedBlockProto>?

I think we need to have it close the file so that any outstanding async requests get their
callback, and can respond to the cancellation.  Having well-defined behavior after cancellation
for stateful objects can be tricky; what should the position be?  Are there cases where continuing
streaming reads on a file with an undefined position would be useful?  I suppose they could
reset the position to a known good one and continue reading.
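
To make the "outstanding requests get their callback" part concrete, here is a minimal sketch. SketchFileHandle, AsyncRead and ReadResult are hypothetical names, not the libhdfs++ FileHandle API, and the pending list stands in for whatever in-flight state the real handle keeps. The callbacks are collected under the lock but invoked after it is released, so consumer code is never called with the lock held.

```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

enum class ReadResult { kOk, kCanceled };

// Hypothetical handle; only models pending callbacks and cancellation.
class SketchFileHandle {
 public:
  using Callback = std::function<void(ReadResult)>;

  // Queues a pending read. If the handle was already canceled, the
  // callback fires right away with kCanceled instead of being queued.
  void AsyncRead(Callback cb) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      if (!canceled_) {
        pending_.push_back(std::move(cb));
        return;
      }
    }
    cb(ReadResult::kCanceled);
  }

  // Irreversibly cancels the handle and notifies every pending request.
  void CancelOperations() {
    std::vector<Callback> to_notify;
    {
      std::lock_guard<std::mutex> lock(mu_);
      canceled_ = true;
      to_notify.swap(pending_);
    }
    // Lock released: it is now safe to call into consumer code.
    for (auto& cb : to_notify) {
      cb(ReadResult::kCanceled);
    }
  }

 private:
  std::mutex mu_;
  bool canceled_ = false;
  std::vector<Callback> pending_;
};
```

This leaves the position question open on purpose; the sketch only guarantees that every caller hears about the cancellation exactly once.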

Being able to cancel the file handle while still doing preads efficiently is a nice feature.  Cancelling
in-flight preads is another aspect to be handled (although that could be a different JIRA).

-Should the FileHandle have a callback when it knows that there are no pending operations?
Should be possible to just check the reference count on the CancelHandle to verify.

It would need to be holding a lock preventing additional operations for that to make sense,
and calling into consumer code while holding a lock is always a dangerous proposition.  What's
the use case for the callback?  Is it compelling?
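
For reference, the reference-count check could look roughly like the sketch below. CancelHandle is modelled as a plain struct held by std::shared_ptr, and SketchFile, BeginOperation and NoPendingOperations are hypothetical names. std::shared_ptr::use_count() only returns a snapshot, so unless something prevents new operations from starting, the answer can be stale by the time the caller acts on it, which is why the lock question matters.

```cpp
#include <atomic>
#include <memory>

// Hypothetical cancel handle shared between a file and its operations.
struct CancelHandle {
  std::atomic<bool> canceled{false};
};

// Hypothetical file object; each in-flight operation holds a reference
// to the cancel handle for as long as it is running.
class SketchFile {
 public:
  SketchFile() : handle_(std::make_shared<CancelHandle>()) {}

  // An operation keeps the returned pointer alive until it completes.
  std::shared_ptr<CancelHandle> BeginOperation() { return handle_; }

  // True if only the file itself references the handle. This is a
  // snapshot: another thread may call BeginOperation() immediately
  // afterwards unless the caller excludes that by other means.
  bool NoPendingOperations() const { return handle_.use_count() == 1; }

 private:
  std::shared_ptr<CancelHandle> handle_;
};
```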

> libhdfs++: Support async cancellation of read operations
> --------------------------------------------------------
>                 Key: HDFS-9643
>                 URL: https://issues.apache.org/jira/browse/HDFS-9643
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-9643.HDFS-8707.000.patch
> It should be possible for any thread to cancel operations in progress on a FileHandle.
> Any ephemeral objects created by the FileHandle should free resources as quickly as possible.

This message was sent by Atlassian JIRA
