Date: Tue, 19 Jul 2016 03:09:20 +0000 (UTC)
From: "Zhihua Deng (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-16212) Many connections to datanode are created when doing a large scan
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
archived-at: Tue, 19 Jul 2016 03:09:22 -0000

[ https://issues.apache.org/jira/browse/HBASE-16212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15383514#comment-15383514 ]

Zhihua Deng commented on HBASE-16212:
-------------------------------------

Thanks [~stack].
From the logging, it appears that different threads share the same DFSInputStream instance, for example 'defaultRpcServer.handler=7' (handler7) and 'defaultRpcServer.handler=4' (handler4). The original code prefetches the next block header and caches it in the reading thread. When handler4 comes along, it first checks whether its cached header offset equals the block's starting offset; unfortunately the two differ (-1 != offset). handler4 knows nothing about the block header, even though handler7 has already prefetched it. handler4 therefore has to seek the input stream back to the block's starting offset to obtain the header, but the stream has already been read 33 bytes (the header size) past that point. So a new connection to the datanode has to be created, and the old one is closed. When the datanode then writes to the closed channel, a socket exception is raised. When this happens frequently, the datanode floods its log with the message described in the issue.

> Many connections to datanode are created when doing a large scan
> -----------------------------------------------------------------
>
>                 Key: HBASE-16212
>                 URL: https://issues.apache.org/jira/browse/HBASE-16212
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 1.1.2
>            Reporter: Zhihua Deng
>         Attachments: HBASE-16212.patch, HBASE-16212.v2.patch, regionserver-dfsinputstream.log
>
>
> As described in https://issues.apache.org/jira/browse/HDFS-8659, the datanode suffers from logging the same message repeatedly. After adding logging to DFSInputStream, it outputs the following:
> 2016-07-10 21:31:42,147 INFO [B.defaultRpcServer.handler=22,queue=1,port=16020] hdfs.DFSClient: DFSClient_NONMAPREDUCE_1984924661_1 seek DatanodeInfoWithStorage[10.130.1.29:50010,DS-086bc494-d862-470c-86e8-9cb7929985c6,DISK] for BP-360285305-10.130.1.11-1444619256876:blk_1109360829_35627143. pos: 111506876, targetPos: 111506843
> ...
> As the pos of this input stream is larger than targetPos (the position being sought), a new connection to the datanode will be created and the older one closed as a consequence. When such wrong seeks are frequent, the datanode's block scanner info message spams the logs, and many connections to the same datanode are created.
>
> hadoop version: 2.7.1

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
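
For readers following the thread, the sequence described in the comment above can be reproduced with a toy model. This is only a sketch under stated assumptions: `SharedStreamSketch`, `FakeStream`, `readBlock`, and `reconnectsFor` are hypothetical names invented for illustration, not HBase or HDFS classes, and a "reconnect" is just a counter bumped on every backward seek. The 33-byte header size comes from the comment; the 65536-byte block size is arbitrary.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model only: none of these names are real HBase or HDFS classes. */
class SharedStreamSketch {
    static final int HEADER_SIZE = 33; // header size quoted in the comment

    /** Stand-in for a shared input stream; counts simulated reconnects. */
    static final class FakeStream {
        long pos = 0;
        final AtomicInteger reconnects = new AtomicInteger();

        synchronized void seek(long targetPos) {
            if (targetPos < pos) {
                // Backward seek: modelled as dropping the old connection.
                reconnects.incrementAndGet();
            }
            pos = targetPos;
        }

        synchronized void read(int len) {
            pos += len;
        }
    }

    // Per-thread record of the prefetched header offset; -1 means nothing cached.
    static final ThreadLocal<Long> cachedHeaderOffset = ThreadLocal.withInitial(() -> -1L);

    /** Reads one block (header included in blockSize) and over-reads the next header. */
    static void readBlock(FakeStream in, long blockOffset, int blockSize) {
        if (cachedHeaderOffset.get() == blockOffset) {
            // This thread prefetched the header, so the stream is already past it.
            in.seek(blockOffset + HEADER_SIZE);
            in.read(blockSize);
        } else {
            // Cache miss: go back to the block start to fetch the header.
            in.seek(blockOffset);
            in.read(blockSize + HEADER_SIZE);
        }
        cachedHeaderOffset.set(blockOffset + blockSize);
    }

    static void runAndJoin(Runnable r) {
        Thread t = new Thread(r);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Reads two consecutive blocks on one thread or on two; returns the reconnect count. */
    static int reconnectsFor(boolean sameHandler) {
        FakeStream in = new FakeStream();
        int blockSize = 65536;
        Runnable first = () -> readBlock(in, 0, blockSize);
        Runnable second = () -> readBlock(in, blockSize, blockSize);
        if (sameHandler) {
            runAndJoin(() -> { first.run(); second.run(); });
        } else {
            runAndJoin(first);
            runAndJoin(second);
        }
        return in.reconnects.get();
    }

    public static void main(String[] args) {
        System.out.println("same handler: " + reconnectsFor(true));        // prints 0
        System.out.println("different handlers: " + reconnectsFor(false)); // prints 1
    }
}
```

In this model the second handler finds -1 in its own cache, seeks back by exactly the 33 bytes the first handler over-read, and triggers one reconnect. That matches the pos/targetPos gap in the quoted log line (111506876 - 111506843 = 33).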