hadoop-hdfs-issues mailing list archives

From "Zesheng Wu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-6596) Improve InputStream when read spans two blocks
Date Wed, 25 Jun 2014 02:46:26 GMT

    [ https://issues.apache.org/jira/browse/HDFS-6596?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042986#comment-14042986 ]

Zesheng Wu commented on HDFS-6596:

Thanks Colin.
bq.  What you are proposing is basically making every {{read}} into a {{readFully}}. I don't
think we want to increase the number of differences between how DFSInputStream works and how
"normal" Java input streams work. The "normal" java behavior also has a good reason behind
it... clients who can deal with partial reads will get a faster response time if the stream
just returns what it can rather than waiting for everything. In the case of HDFS, waiting
for everything might mean connecting to a remote DataNode. This could be quite a lot of latency.
I agree with you that we shouldn't make every {{read}} into a {{readFully}}, and the current
implementation of {{read}} has the advantage you described.

As for the solution, I think doing it in Hadoop would be better, because all users would
benefit.
The current {{readFully}} for DFSInputStream is implemented as a pread and is inherited from FSInputStream,
so I will add a new {{readFully(buffer, offset, length)}} to address this.  Any thoughts?
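
As a rough sketch of what a stateful {{readFully(buffer, offset, length)}} could look like, here is the loop written against plain java.io.InputStream rather than DFSInputStream. The names ReadFullySketch and shortReads are hypothetical and not Hadoop APIs; shortReads only exists to imitate a stream that returns partial reads.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of Hadoop: loops over read() until the request is satisfied. */
final class ReadFullySketch {

    /**
     * Reads exactly {@code length} bytes into {@code buffer} starting at {@code offset},
     * advancing the stream position as it goes (unlike a positional pread).
     */
    static void readFully(InputStream in, byte[] buffer, int offset, int length)
            throws IOException {
        if (offset < 0 || length < 0 || length > buffer.length - offset) {
            throw new IndexOutOfBoundsException("offset=" + offset + ", length=" + length);
        }
        int done = 0;
        while (done < length) {
            // read() may return fewer bytes than requested, e.g. at a block boundary.
            int n = in.read(buffer, offset + done, length - done);
            if (n < 0) {
                throw new EOFException("stream ended after " + done + " of " + length + " bytes");
            }
            done += n;
        }
    }

    /** Test double: a stream that returns at most {@code cap} bytes per read() call (cap >= 1). */
    static InputStream shortReads(byte[] data, final int cap) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, cap));
            }
        };
    }
}
```

With this shape, callers that can handle partial reads keep using {{read}} unchanged, and only callers that opt in to {{readFully}} pay for the extra waiting.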

> Improve InputStream when read spans two blocks
> ----------------------------------------------
>                 Key: HDFS-6596
>                 URL: https://issues.apache.org/jira/browse/HDFS-6596
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 2.4.0
>            Reporter: Zesheng Wu
>            Assignee: Zesheng Wu
> In the current implementation of DFSInputStream, read(buffer, offset, length) is implemented
as follows:
> {code}
> int realLen = (int) Math.min(len, (blockEnd - pos + 1L));
> if (locatedBlocks.isLastBlockComplete()) {
>   realLen = (int) Math.min(realLen, locatedBlocks.getFileLength());
> }
> int result = readBuffer(strategy, off, realLen, corruptedBlockMap);
> {code}
> From the above code, we can conclude that read() will return at most (blockEnd - pos
+ 1) bytes. As a result, when a read spans two blocks, the caller must call read() a second time
to complete the request, and must wait a second time to acquire the DFSInputStream lock (read()
is synchronized in DFSInputStream). For latency-sensitive applications such as HBase, this
becomes a latency pain point under heavy lock contention. So we propose
looping internally in read() to do a best-effort read.
> The current implementation of pread (read(position, buffer, offset, length)) already
loops internally to do a best-effort read, so we can refactor to support the same behavior in the normal read.
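
To make the (blockEnd - pos + 1) cap and the proposed internal loop concrete, here is a toy model. It is a sketch under stated assumptions, not Hadoop code: BlockBoundedStream, readWithinBlock, readBestEffort and seek are hypothetical names, the "file" is an in-memory byte array split into fixed-size blocks, and DataNode connections and read failures are ignored.

```java
/** Toy model (hypothetical, not Hadoop code) of a stream whose reads stop at block boundaries. */
final class BlockBoundedStream {
    private final byte[] file;
    private final int blockSize;
    private int pos;

    BlockBoundedStream(byte[] file, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.file = file;
        this.blockSize = blockSize;
    }

    synchronized void seek(int newPos) {
        if (newPos < 0 || newPos > file.length) {
            throw new IllegalArgumentException("position out of range: " + newPos);
        }
        pos = newPos;
    }

    /** Reads from the current block only: at most (blockEnd - pos + 1) bytes, or -1 at EOF. */
    private int readWithinBlock(byte[] buf, int off, int len) {
        if (pos >= file.length) {
            return -1;
        }
        // Last byte offset of the block containing pos, clamped to the end of the file.
        long blockEnd = Math.min(file.length, (pos / blockSize + 1L) * blockSize) - 1;
        int realLen = (int) Math.min(len, blockEnd - pos + 1L);
        System.arraycopy(file, pos, buf, off, realLen);
        pos += realLen;
        return realLen;
    }

    /** Models the current behavior: one lock acquisition, result capped at the block end. */
    synchronized int read(byte[] buf, int off, int len) {
        return len == 0 ? 0 : readWithinBlock(buf, off, len);
    }

    /** Models the proposal: keep reading across blocks while holding the lock once. */
    synchronized int readBestEffort(byte[] buf, int off, int len) {
        if (len == 0) {
            return 0;
        }
        int total = 0;
        while (total < len) {
            int n = readWithinBlock(buf, off + total, len - total);
            if (n < 0) {
                break; // EOF: return whatever was read so far
            }
            total += n;
        }
        return total == 0 ? -1 : total;
    }
}
```

For example, with 8-byte blocks and the position at offset 6, a 10-byte request returns 2 bytes from read() but all 10 from readBestEffort(), without releasing and re-acquiring the lock in between.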

This message was sent by Atlassian JIRA
