hadoop-hdfs-issues mailing list archives

From "Liang Xie (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-5664) try to relieve the BlockReaderLocal read() synchronized hotspot
Date Sat, 14 Dec 2013 12:12:07 GMT

    [ https://issues.apache.org/jira/browse/HDFS-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848340#comment-13848340

Liang Xie commented on HDFS-5664:

bq. since there would still be a big "synchronized" on all the DFSInputStream#read methods
which use the BlockReader
This can be fixed by HDFS-1605, e.g. by using a read lock for read().
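
As a rough illustration of the read-lock idea, here is a minimal sketch. It is not the HDFS-1605 patch and not the actual DFSInputStream code: the class SharedLockStream and its in-memory byte[] backing are hypothetical stand-ins. It assumes a positional read that touches no shared cursor, so concurrent readers can share the read lock while close() takes the write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a shared stream; not the actual DFSInputStream code.
class SharedLockStream {
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final byte[] data;      // in-memory stand-in for block data
    private boolean closed = false; // guarded by rwLock

    SharedLockStream(byte[] data) {
        this.data = data;
    }

    // Positional read: it does not move any shared cursor, so many threads
    // may hold the read lock and run this method at the same time.
    int read(long position, byte[] buf, int off, int len) {
        rwLock.readLock().lock();
        try {
            if (closed) {
                throw new IllegalStateException("stream closed");
            }
            if (position < 0 || position >= data.length) {
                return -1; // nothing to read at this position
            }
            int n = (int) Math.min(len, data.length - position);
            System.arraycopy(data, (int) position, buf, off, n);
            return n;
        } finally {
            rwLock.readLock().unlock();
        }
    }

    // State changes take the exclusive write lock and wait for readers to drain.
    void close() {
        rwLock.writeLock().lock();
        try {
            closed = true;
        } finally {
            rwLock.writeLock().unlock();
        }
    }
}
```

A stateful read() that advances a shared position would still need exclusive access to that position, so this sketch only covers the positional-read case.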

bq. If multiple threads want to read the same file at the same time, they can open multiple
distinct streams for it. At that point, they're not sharing the same BlockReader, so whether
or not BRL is synchronized doesn't matter.
Yes, this is a feasible idea.
But in the current HBase codebase, we use only one stream (or two streams in older versions,
depending on checksum handling) for each HFile, so this looks like a critical performance issue.
We should figure out whether it is possible to remove the synchronized keyword in BlockReader,
or whether we must consider a multi-threaded pattern. [~stack], are you familiar with why HBase
has historically always used one stream per HFile?
I'll try to understand some of the background here as well.

> try to relieve the BlockReaderLocal read() synchronized hotspot
> ---------------------------------------------------------------
>                 Key: HDFS-5664
>                 URL: https://issues.apache.org/jira/browse/HDFS-5664
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
> Currently BlockReaderLocal's read has a synchronized modifier:
> {code}
> public synchronized int read(byte[] buf, int off, int len) throws IOException {
> {code}
> In an HBase cluster with heavy physical reads, we observed some hotspots in the dfsclient path; the detailed stack trace can be found at: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13843241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13843241
> I haven't looked into the details yet, so here are some raw ideas first:
> 1) replace synchronized with a try-lock-with-timeout pattern, so reads can fail fast; 2) fall back to non-ssr mode if acquiring the local reader lock fails.
> There are at least two scenarios where removing this hotspot would help:
> 1) Heavy local physical reads, e.g. the HBase block cache miss ratio is high
> 2) A slow/bad disk.
> It would help achieve a lower 99th percentile HBase read latency.
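
The two raw ideas in the description (try lock with timeout, then fall back) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual BlockReaderLocal code: ByteSource, TryLockReader, and the timeout parameter are hypothetical names, and the sketch ignores the position bookkeeping a real fallback to the non-ssr path would need.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical names throughout; this is not the actual BlockReaderLocal code.
interface ByteSource {
    int read(byte[] buf, int off, int len) throws IOException;
}

class TryLockReader implements ByteSource {
    private final ReentrantLock lock = new ReentrantLock();
    private final ByteSource local;    // stands in for the short-circuit local reader
    private final ByteSource fallback; // stands in for the non-ssr read path
    private final long timeoutMillis;

    TryLockReader(ByteSource local, ByteSource fallback, long timeoutMillis) {
        this.local = local;
        this.fallback = fallback;
        this.timeoutMillis = timeoutMillis;
    }

    // Exposed only so the contended case can be exercised in a demo or test.
    ReentrantLock lock() {
        return lock;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        boolean acquired;
        try {
            // Wait a bounded time for the lock instead of blocking indefinitely.
            acquired = lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("interrupted while waiting for the local reader lock", e);
        }
        if (!acquired) {
            // Fail fast: the local reader is busy, so take the fallback path.
            return fallback.read(buf, off, len);
        }
        try {
            return local.read(buf, off, len);
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch the fallback source is called without holding the lock, so it has to be safe for concurrent use on its own.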

This message was sent by Atlassian JIRA