Message-ID: <13898549.20391272578403835.JavaMail.jira@thor>
Date: Thu, 29 Apr 2010 18:00:03 -0400 (EDT)
From: "Eli Collins (JIRA)"
To: hdfs-issues@hadoop.apache.org
Subject: [jira] Commented: (HDFS-941) Datanode xceiver protocol should allow reuse of a connection
In-Reply-To: <1502456391.35441265148139236.JavaMail.jira@brutus.apache.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [
https://issues.apache.org/jira/browse/HDFS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12862439#action_12862439 ]

Eli Collins commented on HDFS-941:
----------------------------------

Hey bc,

Nice change! Do you have any results from a non-random workload? Please collect:
# before/after TestDFSIO runs, so we can see whether sequential throughput is affected
# hadoop fs -put of a 1g file from n clients in parallel. I suspect this will improve, since socket reuse should limit slow start, but it's good to check.

How did you choose DEFAULT_CACHE_SIZE?

In the exception handler in sendReadResult, can we be more specific about when it's OK not to be able to send the result, and throw an exception in the cases where it's not OK, rather than swallowing all IOExceptions?

In DataXceiver#opReadBlock you throw an IOException in a try block that catches IOException. I think that should LOG.error and close the output stream instead. You can also chain the following if statements that check stat.

How about asserting sock != null in putCachedSocket? A null socket should never happen if the code is correct, and log messages are easy to ignore.

File a jira for ERROR_CHECKSUM?

Please add a comment to the head of ReaderSocketCache explaining why we cache BlockReader/socket pairs, as opposed to just caching sockets (because we don't multiplex BlockReaders over a single socket between hosts).

Nits:
* Nice comment in the BlockReader header; please define "packet" as well. Is the RPC specification in DataNode outdated? If so, fix it or file a jira instead of warning readers that it may be outdated.
* Maybe a better name for DN_KEEPALIVE_TIMEOUT, since there is no explicit keepalive? TRANSFER_TIMEOUT?
* Would rename workDone to something specific like opsProcessed, or make it a boolean
* Add an "a" in "with checksum"
* if statements need braces, e.g. in BlockReader#read

Thanks,
Eli

> Datanode xceiver protocol should allow reuse of a connection
> ------------------------------------------------------------
>
>                 Key: HDFS-941
>                 URL: https://issues.apache.org/jira/browse/HDFS-941
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: data-node, hdfs client
>    Affects Versions: 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: bc Wong
>         Attachments: HDFS-941-1.patch, HDFS-941-2.patch, HDFS-941-3.patch, HDFS-941-3.patch
>
> Right now each connection into the datanode xceiver only processes one operation.
> In the case that an operation leaves the stream in a well-defined state (e.g. a client reads to the end of a block successfully), the same connection could be reused for a second operation. This should improve random read performance significantly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
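
For reference, the sendReadResult suggestion above might look something like the following. This is a minimal sketch with hypothetical names (StatusWriter, clientClosedEarly), not the actual patch code: the idea is to swallow the IOException only when the client has legitimately hung up after reading its data, and rethrow otherwise.

```java
import java.io.IOException;

public class SendReadResultSketch {
    // Stands in for the stream the datanode writes the read status to.
    interface StatusWriter {
        void writeStatus(int status) throws IOException;
    }

    // Assumed heuristic: a reset or broken pipe means the client went away,
    // which is an acceptable reason to miss delivering the read result.
    static boolean clientClosedEarly(IOException e) {
        String m = e.getMessage();
        return m != null
            && (m.contains("Connection reset") || m.contains("Broken pipe"));
    }

    static void sendReadResult(StatusWriter w, int status) throws IOException {
        try {
            w.writeStatus(status);
        } catch (IOException e) {
            if (clientClosedEarly(e)) {
                return; // OK: client disconnected after reading its data
            }
            throw e; // anything else is a real error, do not swallow it
        }
    }

    public static void main(String[] args) throws IOException {
        // Client hung up early: swallowed.
        sendReadResult(s -> { throw new IOException("Connection reset by peer"); }, 0);

        // Any other failure: rethrown.
        boolean rethrown = false;
        try {
            sendReadResult(s -> { throw new IOException("disk full"); }, 0);
        } catch (IOException e) {
            rethrown = true;
        }
        System.out.println("rethrown=" + rethrown);
    }
}
```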
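
Similarly, the putCachedSocket suggestion is sketched below with an assumed bounded-cache shape (the real ReaderSocketCache caches BlockReader/socket pairs and its internals may differ): assert on a null socket instead of logging, since a null here indicates a caller bug, and close the socket when the cache is full.

```java
import java.net.Socket;
import java.util.ArrayDeque;
import java.util.Deque;

public class SocketCacheSketch {
    private final Deque<Socket> cache = new ArrayDeque<>();
    private final int maxSize;

    SocketCacheSketch(int maxSize) {
        this.maxSize = maxSize;
    }

    void putCachedSocket(Socket sock) {
        // A null socket means a caller bug; fail fast rather than log.
        assert sock != null : "caller must not cache a null socket";
        if (cache.size() < maxSize) {
            cache.addLast(sock); // keep for reuse instead of reconnecting
        } else {
            try {
                sock.close(); // cache full: drop the extra connection
            } catch (Exception ignored) {
            }
        }
    }

    int size() {
        return cache.size();
    }

    public static void main(String[] args) {
        SocketCacheSketch c = new SocketCacheSketch(2);
        c.putCachedSocket(new Socket());
        c.putCachedSocket(new Socket());
        c.putCachedSocket(new Socket()); // third one is closed, not cached
        System.out.println("size=" + c.size());
    }
}
```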