Mailing-List: contact hdfs-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: hdfs-issues@hadoop.apache.org
Date: Sat, 14 Dec 2013 12:12:07 +0000 (UTC)
From: "Liang Xie (JIRA)" <jira@apache.org>
To: hdfs-issues@hadoop.apache.org
Message-ID: <JIRA.12684412.1386908929142.37293.1387023127834@arcas>
In-Reply-To: <JIRA.12684412.1386908929142@arcas>
References: <JIRA.12684412.1386908929142@arcas>
Subject: [jira] [Commented] (HDFS-5664) try to relieve the BlockReaderLocal
 read() synchronized hotspot
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable


    [ https://issues.apache.org/jira/browse/HDFS-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13848340#comment-13848340 ]

Liang Xie commented on HDFS-5664:
---------------------------------

bq. since there would still be a big "synchronized" on all the DFSInputStream#read methods which use the BlockReader
This can be fixed by HDFS-1605, e.g. by using a read lock for read().
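
To make the read-lock idea concrete, here is a minimal sketch using only the JDK. It is not the HDFS-1605 patch: the class ReadLockedSource and its in-memory byte[] are hypothetical stand-ins, and it assumes the read path is positional and does not mutate shared stream state.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a stream whose positional reads touch no shared mutable state. */
class ReadLockedSource {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final byte[] data;   // stands in for the underlying block data
    private boolean closed;      // shared state guarded by the lock

    ReadLockedSource(byte[] data) {
        this.data = data.clone();
    }

    /** Positional read: any number of threads may hold the read lock at once. */
    int read(long position, byte[] buf, int off, int len) {
        if (position < 0) {
            throw new IllegalArgumentException("negative position");
        }
        lock.readLock().lock();
        try {
            if (closed) {
                throw new IllegalStateException("source is closed");
            }
            if (position >= data.length) {
                return -1;  // end of data
            }
            int n = (int) Math.min(len, data.length - position);
            System.arraycopy(data, (int) position, buf, off, n);
            return n;
        } finally {
            lock.readLock().unlock();
        }
    }

    /** State-changing operations still take the exclusive write lock. */
    void close() {
        lock.writeLock().lock();
        try {
            closed = true;
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Readers no longer serialize against each other here; only close() excludes them. A read that advances a shared stream position would still need exclusive access.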

bq. If multiple threads want to read the same file at the same time, they can open multiple distinct streams for it. At that point, they're not sharing the same BlockReader, so whether or not BRL is synchronized doesn't matter.
Yes, this is a feasible idea.
But in the current HBase codebase, we use only one stream per HFile (or two streams in older versions, depending on whether checksums are used). So this looks like a critical performance issue. We should figure out whether it is possible to remove the synchronized keyword in BlockReader, or whether we must consider a multiple-thread pattern instead. [~stack], are you familiar with this: why has HBase historically always used one stream per HFile?
I'll try to understand some of the background here as well.
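
For reference, the "multiple distinct streams" alternative quoted above could look roughly like the sketch below. It is plain JDK code, not HBase or HDFS code: PerThreadStreams is a hypothetical helper, and RandomAccessFile stands in for a DFS input stream. Each thread lazily opens its own stream, so no reader object is shared between threads.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;

/** Hypothetical helper: one independent stream per thread for the same file. */
class PerThreadStreams {
    private final ThreadLocal<RandomAccessFile> streams;

    PerThreadStreams(String path) {
        // Each thread opens its own stream on first use.
        this.streams = ThreadLocal.withInitial(() -> {
            try {
                return new RandomAccessFile(path, "r");
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    /** Seek-then-read on the calling thread's private stream; no lock is needed. */
    int read(long position, byte[] buf, int off, int len) throws IOException {
        RandomAccessFile in = streams.get();
        in.seek(position);
        return in.read(buf, off, len);
    }

    /** Closes only the calling thread's stream; other threads must close their own. */
    void closeForCurrentThread() throws IOException {
        streams.get().close();
        streams.remove();
    }
}
```

The trade-off is resource usage: every thread holds its own open stream per file, which is presumably part of why a single shared stream was attractive in the first place.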

> try to relieve the BlockReaderLocal read() synchronized hotspot
> ---------------------------------------------------------------
>
>                 Key: HDFS-5664
>                 URL: https://issues.apache.org/jira/browse/HDFS-5664
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs-client
>    Affects Versions: 3.0.0, 2.2.0
>            Reporter: Liang Xie
>            Assignee: Liang Xie
>
> Currently, BlockReaderLocal's read() has a synchronized modifier:
> {code}
> public synchronized int read(byte[] buf, int off, int len) throws IOException {
> {code}
> In an HBase cluster with heavy physical reads, we observed some hotspots in the dfsclient path; the detailed strace trace can be found at: https://issues.apache.org/jira/browse/HDFS-1605?focusedCommentId=13843241&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13843241
> I haven't looked into the details yet, so here are some raw ideas first:
> 1) replace synchronized with a try-lock-with-timeout pattern, so the read can fail fast; 2) fall back to non-SSR mode if acquiring the local reader lock fails.
> There are at least two scenarios where removing this hotspot would help:
> 1) Heavy local physical reads, e.g. the HBase block cache miss ratio is high
> 2) Slow/bad disks.
> It would help achieve a lower 99th percentile HBase read latency.
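
The two raw ideas in the quoted description (try-lock with a timeout, then fall back to the non-SSR path) can be sketched with the JDK alone. This is an illustration under assumptions, not a proposed patch: TryLockReader, localRead and fallbackRead are hypothetical names, and the fallback stands in for a read through the non-short-circuit path.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical wrapper: bounded wait for the local reader lock, else use a fallback read. */
class TryLockReader {
    private final ReentrantLock localLock = new ReentrantLock();
    private final long timeoutMillis;

    TryLockReader(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Runs localRead under the lock if it can be acquired within the timeout;
     * otherwise runs fallbackRead without ever holding the lock.
     */
    <T> T read(Supplier<T> localRead, Supplier<T> fallbackRead) throws InterruptedException {
        if (localLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            try {
                return localRead.get();
            } finally {
                localLock.unlock();
            }
        }
        // Lock is contended (e.g. another read is stuck on a slow disk): fail over.
        return fallbackRead.get();
    }
}
```

With this shape, a reader blocked behind a slow local read waits at most timeoutMillis before taking the other path, which is the fail-fast behaviour the description asks for. Choosing the timeout is left open.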


--
This message was sent by Atlassian JIRA
(v6.1.4#6159)