hadoop-hdfs-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Colin Patrick McCabe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6735) A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
Date Fri, 21 Nov 2014 08:21:35 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6735?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220659#comment-14220659
] 

Colin Patrick McCabe commented on HDFS-6735:
--------------------------------------------

bq. Another locking option is not to synchronize on <this> at all, but to have two locks
("streamLock" and "pLock", or whatever are good names). That way the intend might be more
explicit.  Yet another option would be to disentangle to two apis by subclassing or delegation
(since the issue really is that we have state for two different modes of operation in the
same class), that'd be a bigger change though.

Yeah, I thought about that too.  It seems cleaner, intuitively, but it would also involve
a lot of re-indentation, since we could not have "synchronized" methods any more, but would
have to have synchronized blocks on the "positional lock".  It seems like such a trivial thing,
but changing indentation can be painful.

I'm fine with committing something like the current patch, if we can make the changes I suggested
above.  We can think about additional cleanups in a follow-on JIRA, I guess.  I know you guys
have spent a lot of time on this and it's important for HBase perf.

bq. Tested this with HBase and observed with a sampler that all delays internal to DFSInputStream
are gone, which is nice.

awesome!

> A minor optimization to avoid pread() be blocked by read() inside the same DFSInputStream
> -----------------------------------------------------------------------------------------
>
>                 Key: HDFS-6735
>                 URL: https://issues.apache.org/jira/browse/HDFS-6735
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>         Attachments: HDFS-6735-v2.txt, HDFS-6735-v3.txt, HDFS-6735.txt
>
>
> In current DFSInputStream impl, there're a couple of coarser-grained locks in read/pread
path, and it has became a HBase read latency pain point so far. In HDFS-6698, i made a minor
patch against the first encourtered lock, around getFileLength, in deed, after reading code
and testing, it shows still other locks we could improve.
> In this jira, i'll make a patch against other locks, and a simple test case to show the
issue and the improved result.
> This is important for HBase application, since in current HFile read path, we issue all
read()/pread() requests in the same DFSInputStream for one HFile. (Multi streams solution
is another story i had a plan to do, but probably will take more time than i expected)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message