Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Date: Tue, 19 Jul 2016 03:09:20 +0000 (UTC)
From: "Zhihua Deng (JIRA)" <jira@apache.org>
To: issues@hbase.apache.org
Message-ID: <JIRA.12988520.1468293564000.68056.1468897760816@Atlassian.JIRA>
In-Reply-To: <JIRA.12988520.1468293564000@Atlassian.JIRA>
References: <JIRA.12988520.1468293564000@Atlassian.JIRA> <JIRA.12988520.1468293564442@arcas>
Subject: [jira] [Commented] (HBASE-16212) Many connections to datanode are
 created when doing a large scan
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
archived-at: Tue, 19 Jul 2016 03:09:22 -0000


    [ https://issues.apache.org/jira/browse/HBASE-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383514#comment-15383514 ]

Zhihua Deng commented on HBASE-16212:
-------------------------------------

Thanks [~stack]. The logging implies that different threads share the same DFSInputStream instance, for example 'defaultRpcServer.handler=7' (handler7) and 'defaultRpcServer.handler=4' (handler4). The original code prefetches the next block header and caches that header in the thread that read it. When handler4 comes in, it first checks whether the cached header offset equals the block's starting offset; unfortunately the two numbers are unequal (-1 != offset). handler4 knows nothing about the block header, even though handler7 has already prefetched it. handler4 therefore has to seek the input stream to the block's starting offset to obtain the header, but the input stream has already been read 33 bytes (the header size) past that offset. So a new connection to the datanode has to be created, and the old one is closed. When the datanode then writes to the closed channel, a socket exception is raised. When this happens frequently, the datanode is flooded with the log message described in the issue.
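
The sequence above can be sketched as a small, self-contained Java model. This is only an illustration under stated assumptions, not the HBase or HDFS code: SharedStreamSketch, ToyStream, readBlock and reconnectsForTwoBlocks are hypothetical names, the 64 KB block size is arbitrary, and the toy stream simply counts every backward seek as one new datanode connection. Only the 33-byte header size and the -1 "nothing cached" value come from the comment.

```java
// Toy model of the scenario in this comment. NOT the real HBase/HDFS code:
// all class and method names here are hypothetical.
final class SharedStreamSketch {
    static final int HEADER_SIZE = 33; // header size quoted in the comment

    // Stand-in for a shared DFSInputStream: tracks a position and counts reconnects.
    static final class ToyStream {
        private long pos = 0;
        private int reconnects = 0;

        synchronized void seek(long targetPos) {
            if (targetPos < pos) {
                reconnects++; // assumption: every backward seek needs a new connection
            }
            pos = targetPos;
        }

        synchronized void read(long len) { pos += len; }

        synchronized int reconnects() { return reconnects; }
    }

    // Per-thread offset of the prefetched next header; -1 means nothing is cached.
    private static final ThreadLocal<Long> cachedHeaderOffset =
            ThreadLocal.withInitial(() -> -1L);

    static void readBlock(ToyStream in, long blockOffset, long blockSize) {
        if (cachedHeaderOffset.get() == blockOffset) {
            // This thread prefetched the header: continue right after it (no-op seek).
            in.seek(blockOffset + HEADER_SIZE);
        } else {
            // Header unknown to this thread: go back to the block start and read it.
            in.seek(blockOffset);
            in.read(HEADER_SIZE);
        }
        in.read(blockSize - HEADER_SIZE);              // block body
        in.read(HEADER_SIZE);                          // prefetch the next block's header
        cachedHeaderOffset.set(blockOffset + blockSize);
    }

    // Reads two consecutive blocks, either on one handler thread or on two.
    static int reconnectsForTwoBlocks(boolean sameHandler) {
        ToyStream in = new ToyStream();
        long blockSize = 65536; // arbitrary size for the sketch
        Runnable first = () -> readBlock(in, 0, blockSize);
        Runnable second = () -> readBlock(in, blockSize, blockSize);
        try {
            if (sameHandler) {
                Thread t = new Thread(() -> { first.run(); second.run(); }, "handler7");
                t.start();
                t.join();
            } else {
                Thread t7 = new Thread(first, "handler7");
                t7.start();
                t7.join();
                Thread t4 = new Thread(second, "handler4");
                t4.start();
                t4.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return in.reconnects();
    }

    public static void main(String[] args) {
        System.out.println("same handler:       " + reconnectsForTwoBlocks(true));
        System.out.println("different handlers: " + reconnectsForTwoBlocks(false));
    }
}
```

In this model the second read causes no reconnect when the same handler continues, and one reconnect when another handler takes over, because that thread's cache still holds -1. The gap in the log line quoted in the issue matches the same pattern: pos 111506876 minus targetPos 111506843 is exactly 33 bytes.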

> Many connections to datanode are created when doing a large scan
> -----------------------------------------------------------------
>
>                 Key: HBASE-16212
>                 URL: https://issues.apache.org/jira/browse/HBASE-16212
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.1.2
>            Reporter: Zhihua Deng
>         Attachments: HBASE-16212.patch, HBASE-16212.v2.patch, regionserver-dfsinputstream.log
>
>
> As described in https://issues.apache.org/jira/browse/HDFS-8659, the datanode suffers from logging the same message repeatedly. After adding logging to DFSInputStream, it outputs the following:
> 2016-07-10 21:31:42,147 INFO  [B.defaultRpcServer.handler=22,queue=1,port=16020] hdfs.DFSClient: DFSClient_NONMAPREDUCE_1984924661_1 seek DatanodeInfoWithStorage[10.130.1.29:50010,DS-086bc494-d862-470c-86e8-9cb7929985c6,DISK] for BP-360285305-10.130.1.11-1444619256876:blk_1109360829_35627143. pos: 111506876, targetPos: 111506843
>  ...
> Because the pos of this input stream is larger than targetPos (the position being sought), a new connection to the datanode is created and the older one is closed as a consequence. When there are many such backward seeks, the datanode's block scanner info message spams the logs, and many connections to the same datanode are created.
> hadoop version: 2.7.1


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)