hadoop-hdfs-issues mailing list archives

From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6803) Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent context
Date Tue, 12 Aug 2014 18:03:12 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6803?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094410#comment-14094410 ]

Colin Patrick McCabe commented on HDFS-6803:

S3 might have a seek+read+seek implementation of pread right now, but that doesn't mean that
it has to be that way forever.  If it had something like HDFS's BlockReader abstraction, it
could easily support preads that didn't affect the output of getPos, and preads that ran
concurrently.  The S3 protocol itself lets you start reading an object at any offset you want.
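
To make the difference concrete, here is a minimal local-file sketch. It is only an analogy: it uses java.nio's FileChannel rather than any Hadoop or S3 client code, and the names PreadSketch, seekReadSeek, and positional are made up for illustration. The channel position stands in for what getPos would report.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PreadSketch {
    /** Emulated pread: seek, read, seek back. The shared cursor moves while this runs. */
    static int seekReadSeek(FileChannel ch, long pos, ByteBuffer dst) throws IOException {
        long saved = ch.position();
        try {
            ch.position(pos);
            return ch.read(dst);
        } finally {
            ch.position(saved);
        }
    }

    /** Positional read: the offset is an argument, so the cursor is never touched. */
    static int positional(FileChannel ch, long pos, ByteBuffer dst) throws IOException {
        return ch.read(dst, pos);
    }

    /** Reads 3 bytes at offset 5 and reports them together with the untouched cursor. */
    static String demo() throws IOException {
        Path p = Files.createTempFile("pread-sketch", ".txt");
        try {
            Files.write(p, "0123456789".getBytes(StandardCharsets.US_ASCII));
            try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
                ch.position(2); // stand-in for the stream's getPos()
                ByteBuffer buf = ByteBuffer.allocate(3);
                positional(ch, 5, buf);
                String bytes = new String(buf.array(), StandardCharsets.US_ASCII);
                return bytes + " pos=" + ch.position();
            }
        } finally {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

With seekReadSeek, another thread calling a plain read in the window between the two position calls would read from the wrong offset, which is why that style needs external locking. The positional variant has no such window.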

It seems like we are all in agreement on Stack's points 2.1 (positional and non-positional
reads can run concurrently) and 2.2 (two or more positional reads can run concurrently)?
Perhaps we should just document those and move some of the other discussions to follow-up
JIRAs.  This is an important performance improvement for HBase, and it seems like we have
consensus on the concurrent pread issue at least.
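
Points 2.1 and 2.2 can be illustrated with a small local-file sketch. Again this is an analogy using java.nio's FileChannel, not DFSInputStream itself; ConcurrentPreadSketch, preadFully, and run are hypothetical names. Four threads issue positional reads while the calling thread does an ordinary sequential read on the same channel.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ConcurrentPreadSketch {
    /** Reads exactly len bytes at offset without touching the channel position. */
    static byte[] preadFully(FileChannel ch, long offset, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            int n = ch.read(buf, offset + buf.position());
            if (n < 0) {
                throw new EOFException("hit end of file before reading " + len + " bytes");
            }
        }
        return buf.array();
    }

    /** Returns true if every pread saw the right bytes and the cursor only moved sequentially. */
    static boolean run() throws Exception {
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Path p = Files.createTempFile("concurrent-pread", ".bin");
        Files.write(p, data);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
            // 2.2: several positional reads in flight at once.
            List<Future<Boolean>> results = new ArrayList<>();
            for (int t = 0; t < 4; t++) {
                final int offset = t * 1000;
                results.add(pool.submit(() -> Arrays.equals(
                        preadFully(ch, offset, 512),
                        Arrays.copyOfRange(data, offset, offset + 512))));
            }
            // 2.1: a sequential (cursor-based) read runs alongside them.
            ByteBuffer head = ByteBuffer.allocate(256);
            while (head.hasRemaining() && ch.read(head) >= 0) {
                // keep reading until the buffer is full or EOF
            }
            boolean ok = ch.position() == 256;
            for (Future<Boolean> f : results) {
                ok &= f.get();
            }
            return ok;
        } finally {
            pool.shutdown();
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run() ? "preads and sequential read did not interfere" : "interference detected");
    }
}
```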

bq. There's always the strategy of adding a marker interface, say ConcurrentPositionalReads,
which indicates the operations are concurrent. Would it help HBase if this were added and
looked for?

Interesting, that might help with running HBase on S3 or other alternate filesystems...
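
A rough sketch of how a client might look for such a marker follows. Only the name ConcurrentPositionalReads comes from the suggestion above; PositionedReader, ArrayReader, SeekingReader, and Client are hypothetical stand-ins, not Hadoop or HBase classes. The idea is that a caller skips its own locking only when the stream advertises the guarantee.

```java
import java.util.concurrent.locks.ReentrantLock;

class MarkerInterfaceSketch {
    /** Hypothetical stand-in for a positional-read API. */
    interface PositionedReader {
        int read(long position, byte[] buf, int off, int len);
    }

    /** Marker interface: no methods, only a promise that preads may run concurrently. */
    interface ConcurrentPositionalReads {}

    /** Copies up to len bytes from data starting at position; returns -1 at end of data. */
    static int copy(byte[] data, long position, byte[] buf, int off, int len) {
        if (position >= data.length) {
            return -1;
        }
        int n = Math.min(len, data.length - (int) position);
        System.arraycopy(data, (int) position, buf, off, n);
        return n;
    }

    /** Stateless reader: each call carries its own offset, so it can carry the marker. */
    static final class ArrayReader implements PositionedReader, ConcurrentPositionalReads {
        private final byte[] data;
        ArrayReader(byte[] data) { this.data = data; }
        @Override
        public int read(long position, byte[] buf, int off, int len) {
            return copy(data, position, buf, off, len);
        }
    }

    /** Seek+read+seek reader: shares one cursor, so it makes no concurrency promise. */
    static final class SeekingReader implements PositionedReader {
        private final byte[] data;
        private long cursor;
        SeekingReader(byte[] data) { this.data = data; }
        @Override
        public int read(long position, byte[] buf, int off, int len) {
            long saved = cursor;
            cursor = position;                          // seek
            int n = copy(data, cursor, buf, off, len);  // read
            cursor = saved;                             // seek back
            return n;
        }
    }

    /** Client wrapper: serializes preads only when the stream lacks the marker. */
    static final class Client {
        private final PositionedReader in;
        private final ReentrantLock lock; // null when the stream promises concurrency

        Client(PositionedReader in) {
            this.in = in;
            this.lock = (in instanceof ConcurrentPositionalReads) ? null : new ReentrantLock();
        }

        boolean serializes() { return lock != null; }

        int pread(long position, byte[] buf, int off, int len) {
            if (lock == null) {
                return in.read(position, buf, off, len);
            }
            lock.lock();
            try {
                return in.read(position, buf, off, len);
            } finally {
                lock.unlock();
            }
        }
    }
}
```

One trade-off of a marker interface is that wrapper streams hide it unless they implement it themselves, so a client check like the instanceof above can produce false negatives but never false positives.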

> Documenting DFSClient#DFSInputStream expectations reading and preading in concurrent
> --------------------------------------------------------------------------------------------
>                 Key: HDFS-6803
>                 URL: https://issues.apache.org/jira/browse/HDFS-6803
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>    Affects Versions: 2.4.1
>            Reporter: stack
>         Attachments: DocumentingDFSClientDFSInputStream (1).pdf
> Reviews of the patch posted to the parent task suggest that we be more explicit about how
DFSIS is expected to behave when being read by contending threads. It is also suggested that
presumptions made internally be made explicit by documenting expectations.
> Before we put up a patch we've made a document of assertions we'd like to make into tenets
of DFSInputStream.  If there is agreement, we'll attach to this issue a patch that weaves the
assumptions into DFSIS as javadoc and class comments.

This message was sent by Atlassian JIRA