Date: Thu, 9 Jun 2011 07:41:59 +0000 (UTC)
From: "stack (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <900179408.5793.1307605319149.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-941) Datanode xceiver protocol should allow reuse of a connection

[
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046374#comment-13046374 ]

stack commented on HDFS-941:
----------------------------

Dang. Did more testing (w/ Todd's help). I backported his patch to 0.22 so I could run my loadings. I see this every so often in DN logs: 'Got error for OP_READ_BLOCK' (perhaps once every ten minutes per server). The other side of the connection will print 'Client /10.4.9.34did not send a valid status code after reading. Will close connection.' (I see this latter message much more frequently than the former, but it seems fine -- we are just closing the connection and moving on w/ no repercussions client-side.)

Here is more context. In the datanode log (look for 'Client /10.4.9.34did not...'):

{code}
2011-06-08 23:39:45,759 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-1043418802690508828_7206 of size 16207176 from /10.4.9.34:57333
2011-06-08 23:39:45,759 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block blk_-1043418802690508828_7206 terminating
2011-06-08 23:39:45,960 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5716868613634466961_7207 src: /10.4.14.34:39560 dest: /10.4.9.34:10010
2011-06-08 23:39:46,301 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_5716868613634466961_7207 of size 29893370 from /10.4.14.34:39560
2011-06-08 23:39:46,301 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_5716868613634466961_7207 terminating
2011-06-08 23:39:46,326 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-7242346463849737969_7208 src: /10.4.14.34:39564 dest: /10.4.9.34:10010
2011-06-08 23:39:46,434 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:46,435 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:46,435 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:46,435 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:47,837 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-7242346463849737969_7208 of size 67108864 from /10.4.14.34:39564
2011-06-08 23:39:47,837 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_-7242346463849737969_7208 terminating
2011-06-08 23:39:47,855 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_7820819556875770048_7208 src: /10.4.14.34:39596 dest: /10.4.9.34:10010
2011-06-08 23:39:49,212 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_7820819556875770048_7208 of size 67108864 from /10.4.14.34:39596
2011-06-08 23:39:49,212 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_7820819556875770048_7208 terminating
{code}

In the regionserver log (the client):

{code}
2011-06-08 23:39:45,777 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 4 file(s) in values of usertable,user617882364,1307559813504.e4a9ed69f909762ddba8027cb6438575.; new storefile name=hdfs://sv4borg227:10000/hbase/usertable/e4a9ed69f909762ddba8027cb6438575/values/6552772398789018757, size=143.5m; total size for store is 488.4m
2011-06-08 23:39:45,777 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=usertable,user617882364,1307559813504.
e4a9ed69f909762ddba8027cb6438575., storeName=values, fileCount=4, fileSize=175.5m, priority=2, date=Wed Jun 08 23:39:41 PDT 2011; duration=3sec
2011-06-08 23:39:45,777 DEBUG org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: CompactSplitThread Status: compaction_queue=(0:0), split_queue=0
2011-06-08 23:39:46,436 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.4.9.34:10010 for file /hbase/usertable/e4a9ed69f909762ddba8027cb6438575/values/5422279471660943029 for block blk_1325488162553537841_6905:java.io.IOException: Got error for OP_READ_BLOCK, self=/10.4.9.34:57345, remote=/10.4.9.34:10010, for file /hbase/usertable/e4a9ed69f909762ddba8027cb6438575/values/5422279471660943029, for block 1325488162553537841_6905
        at org.apache.hadoop.hdfs.BlockReader.newBlockReader(BlockReader.java:437)
        at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:727)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:618)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:781)
        at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:51)
        at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:122)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1139)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:1081)
{code}

> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch,
> HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch, HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.22.patch, HDFS-941-6.patch, HDFS-941-6.patch, HDFS-941-6.patch, hdfs-941.txt, hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one operation.
> In the case that an operation leaves the stream in a well-defined state (e.g. a client reads to the end of a block successfully), the same connection could be reused for a second operation. This should improve random read performance significantly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
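The connection-reuse idea in the issue description can be illustrated with a minimal client-side cache sketch: when a read leaves the stream in a well-defined state, the client parks the connection in a small per-datanode pool instead of closing it, and later reads try the pool before dialing a fresh socket. This is a hypothetical sketch, not the actual patch's code; the class and method names (SocketCacheSketch, get, put) and the use of Strings in place of real sockets are all assumptions made for a self-contained example.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a per-datanode connection cache (not the real
// HDFS-941 classes). Real code would hold java.net.Socket objects and
// close evicted ones; Strings stand in here to keep the example runnable.
public class SocketCacheSketch {
    private final int capacity;                          // total cached conns allowed
    private final Map<String, ArrayDeque<String>> cache = new HashMap<>();
    private int size = 0;

    public SocketCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    // Borrow a cached connection for this datanode; null means the
    // caller should dial a fresh connection instead.
    public synchronized String get(String datanode) {
        ArrayDeque<String> q = cache.get(datanode);
        if (q == null || q.isEmpty()) {
            return null;
        }
        size--;
        return q.poll();
    }

    // Return a connection whose stream ended in a reusable state.
    // Returns false when the cache is full; the caller should then
    // simply close the connection, as before the patch.
    public synchronized boolean put(String datanode, String conn) {
        if (size >= capacity) {
            return false;
        }
        cache.computeIfAbsent(datanode, k -> new ArrayDeque<>()).add(conn);
        size++;
        return true;
    }

    public static void main(String[] args) {
        SocketCacheSketch c = new SocketCacheSketch(2);
        System.out.println(c.get("/10.4.9.34:10010")); // nothing cached yet: null
        c.put("/10.4.9.34:10010", "sock-1");           // park after a clean read
        System.out.println(c.get("/10.4.9.34:10010")); // reused: sock-1
    }
}
```

The "did not send a valid status code after reading" warnings in the logs above are the other half of this scheme: the datanode only keeps its end of the connection open when the client signals a clean end-of-read, and otherwise closes it, which is why the client-side cache must tolerate finding a dead socket and fall back to a fresh connection.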