Date: Thu, 9 Jun 2011 07:41:59 +0000 (UTC)
From: "stack (JIRA)"
To: hdfs-issues@hadoop.apache.org
Message-ID: <900179408.5793.1307605319149.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HDFS-941) Datanode xceiver protocol should allow reuse of a connection

[
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046374#comment-13046374 ]

stack commented on HDFS-941:
----------------------------

Dang. Did more testing (w/ Todd's help). I backported his patch to 0.22 so I could run my loadings. I see this every so often in DN logs: 'Got error for OP_READ_BLOCK' (perhaps once every ten minutes per server). The other side of the connection will print 'Client /10.4.9.34did not send a valid status code after reading. Will close connection.' (I see this latter message much more frequently than the former, but it seems fine -- we are just closing the connection and moving on w/ no repercussions client-side.)

Here is more context. In the datanode log (look for 'Client /10.4.9.34did not...'):

{code}
2011-06-08 23:39:45,759 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-1043418802690508828_7206 of size 16207176 from /10.4.9.34:57333
2011-06-08 23:39:45,759 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 2 for block blk_-1043418802690508828_7206 terminating
2011-06-08 23:39:45,960 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_5716868613634466961_7207 src: /10.4.14.34:39560 dest: /10.4.9.34:10010
2011-06-08 23:39:46,301 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_5716868613634466961_7207 of size 29893370 from /10.4.14.34:39560
2011-06-08 23:39:46,301 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_5716868613634466961_7207 terminating
2011-06-08 23:39:46,326 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_-7242346463849737969_7208 src: /10.4.14.34:39564 dest: /10.4.9.34:10010
2011-06-08 23:39:46,434 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:46,435 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:46,435 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:46,435 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Client /10.4.9.34did not send a valid status code after reading. Will close connection.
2011-06-08 23:39:47,837 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_-7242346463849737969_7208 of size 67108864 from /10.4.14.34:39564
2011-06-08 23:39:47,837 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_-7242346463849737969_7208 terminating
2011-06-08 23:39:47,855 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving block blk_7820819556875770048_7208 src: /10.4.14.34:39596 dest: /10.4.9.34:10010
2011-06-08 23:39:49,212 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Received block blk_7820819556875770048_7208 of size 67108864 from /10.4.14.34:39596
2011-06-08 23:39:49,212 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder 1 for block blk_7820819556875770048_7208 terminating
{code}

In the regionserver log (the client):

{code}
2011-06-08 23:39:45,777 INFO org.apache.hadoop.hbase.regionserver.Store: Completed compaction of 4 file(s) in values of usertable,user617882364,1307559813504.e4a9ed69f909762ddba8027cb6438575.; new storefile name=hdfs://sv4borg227:10000/hbase/usertable/e4a9ed69f909762ddba8027cb6438575/values/6552772398789018757, size=143.5m; total size for store is 488.4m
2011-06-08 23:39:45,777 INFO org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: completed compaction: regionName=usertable,user617882364,1307559813504.
e4a9ed69f909762ddba8027cb6438575., storeName=values, fileCount=4, fileSize=175.5m, priority=2, date=Wed Jun 08 23:39:41 PDT 2011; duration=3sec
2011-06-08 23:39:45,777 DEBUG org.apache.hadoop.hbase.regionserver.compactions.CompactionRequest: CompactSplitThread Status: compaction_queue=(0:0), split_queue=0
2011-06-08 23:39:46,436 WARN org.apache.hadoop.hdfs.DFSClient: Failed to connect to /10.4.9.34:10010 for file /hbase/usertable/e4a9ed69f909762ddba8027cb6438575/values/5422279471660943029 for block blk_1325488162553537841_6905:java.io.IOException: Got error for OP_READ_BLOCK, self=/10.4.9.34:57345, remote=/10.4.9.34:10010, for file /hbase/usertable/e4a9ed69f909762ddba8027cb6438575/values/5422279471660943029, for block 1325488162553537841_6905
        at org.apache.hadoop.hdfs.BlockReader.newBlockReader(BlockReader.java:437)
        at org.apache.hadoop.hdfs.DFSInputStream.getBlockReader(DFSInputStream.java:727)
        at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:618)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:781)
        at org.apache.hadoop.fs.FSDataInputStream.read(FSDataInputStream.java:51)
        at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:101)
        at java.io.BufferedInputStream.read1(BufferedInputStream.java:256)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:122)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1139)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:1081)
{code}

> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch,
> HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch, HDFS-941-4.patch, HDFS-941-5.patch, HDFS-941-6.22.patch, HDFS-941-6.patch, HDFS-941-6.patch, HDFS-941-6.patch, hdfs-941.txt, hdfs-941.txt, hdfs-941.txt, hdfs941-1.png
>
>
> Right now each connection into the datanode xceiver only processes one operation.
> In the case that an operation leaves the stream in a well-defined state (e.g. a client reads to the end of a block successfully), the same connection could be reused for a second operation. This should improve random read performance significantly.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
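The connection-reuse idea in the issue description can be illustrated with a minimal client-side cache sketch: when a read leaves the stream in a well-defined state, the client parks the connection in a small per-datanode pool instead of closing it, and later reads try the pool before dialing a fresh socket. This is a hypothetical sketch, not the actual patch's code; the class and method names (SocketCacheSketch, get, put) and the use of Strings in place of real sockets are all assumptions made for a self-contained example.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a per-datanode connection cache (not the real
// HDFS-941 classes). Real code would hold java.net.Socket objects and
// close evicted ones; Strings stand in here to keep the example runnable.
public class SocketCacheSketch {
    private final int capacity;                          // total cached conns allowed
    private final Map<String, ArrayDeque<String>> cache = new HashMap<>();
    private int size = 0;

    public SocketCacheSketch(int capacity) {
        this.capacity = capacity;
    }

    // Borrow a cached connection for this datanode; null means the
    // caller should dial a fresh connection instead.
    public synchronized String get(String datanode) {
        ArrayDeque<String> q = cache.get(datanode);
        if (q == null || q.isEmpty()) {
            return null;
        }
        size--;
        return q.poll();
    }

    // Return a connection whose stream ended in a reusable state.
    // Returns false when the cache is full; the caller should then
    // simply close the connection, as before the patch.
    public synchronized boolean put(String datanode, String conn) {
        if (size >= capacity) {
            return false;
        }
        cache.computeIfAbsent(datanode, k -> new ArrayDeque<>()).add(conn);
        size++;
        return true;
    }

    public static void main(String[] args) {
        SocketCacheSketch c = new SocketCacheSketch(2);
        System.out.println(c.get("/10.4.9.34:10010")); // nothing cached yet: null
        c.put("/10.4.9.34:10010", "sock-1");           // park after a clean read
        System.out.println(c.get("/10.4.9.34:10010")); // reused: sock-1
    }
}
```

The "did not send a valid status code after reading" warnings in the logs above are the other half of this scheme: the datanode only keeps its end of the connection open when the client signals a clean end-of-read, and otherwise closes it, which is why the client-side cache must tolerate finding a dead socket and fall back to a fresh connection.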