Message-ID: <47D8430C.8060702@yahoo-inc.com>
Date: Wed, 12 Mar 2008 13:54:37 -0700
From: Sanjay Radia
Reply-To: core-dev@hadoop.apache.org
To: 
core-dev@hadoop.apache.org
CC: hadoop-dev@lucene.apache.org
Subject: Re: Multiplexing sockets in DFSClient/datanodes?

Hairong Kuang wrote:
>> streaming IO lets us pipe large amounts
>> of data without the request/response exchange.
>> The worry was that IO performance would degrade.
>>
>
> Since hadoop-2188 removes the ipc timeout, it is ok for a datanode to respond to the
> datanode up in the pipeline when it gets a response from the datanode down in
> the pipeline. If datanodes could have two threads, one pushing data down
> the pipeline and one writing it to the local disk, using RPC won't introduce
> any additional communication cost.
>
I believe that is what our pipeline code does. The client, however, will
block for the reply unless we change the client code to have multiple
buffers etc.
> Hairong
>
> On 3/12/08 11:35 AM, "Sanjay Radia" wrote:
>
>
>> Doug Cutting wrote:
>>
>>> Jim Kellerman wrote:
>>>
>>>> Yes, multiplexing a socket is more complicated than having one socket
>>>> per file, but saving system resources seems like a way to scale.
>>>>
>>>> Questions? Comments? Opinions? Flames?
>>>>
>>> Note that Hadoop RPC already multiplexes, sharing a single socket per
>>> pair of JVMs. It would be possible to multiplex datanode, and should
>>> not in theory significantly impact performance, but, as you indicate,
>>> it would be a significant change. One approach might be to implement
>>> HDFS data access using RPC rather than directly using stream i/o.
>>>
>>> RPC also tears down idle connections, which HDFS does not. I wonder
>>> how much doing that alone might help your case? That would probably
>>> be much simpler to implement.
>>> Both client and server must already
>>> handle connection failures, so it shouldn't be too great of a change
>>> to have one or both sides actively close things down if they're idle
>>> for more than a few seconds. This is related to adding write timeouts
>>> to the datanode (HADOOP-2346).
>>>
>> Doug,
>> Dhruba and I had discussed using RPC in the past. While RPC is a
>> cleaner interface and our RPC implementation has
>> features such as connection sharing, closing idle connections, etc.,
>> streaming IO lets us pipe large amounts
>> of data without the request/response exchange.
>> The worry was that IO performance would degrade.
>> BTW, NFS uses RPC (NFS does not have the write pipeline for replicas).
>>
>> sanjay
>>
>>> Doug
>>>
>
>
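
To make the two-thread idea from the thread concrete, here is a minimal sketch of a relay that hands each incoming packet to one thread writing to local disk and another forwarding it down the pipeline. It is only an illustration under stated assumptions: the class name PipelineRelaySketch, the relay method, the queue size of 16, and the in-memory streams standing in for the disk and the downstream socket are all hypothetical and are not taken from the Hadoop code base. It also says nothing about the client-side blocking mentioned above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; none of these names come from Hadoop.
final class PipelineRelaySketch {
    // Sentinel (compared by identity) that tells a writer thread the stream has ended.
    private static final byte[] END = new byte[0];

    // Starts a thread that drains `queue` into `out` until it sees END.
    private static Thread startWriter(BlockingQueue<byte[]> queue, OutputStream out, String name) {
        Thread t = new Thread(() -> {
            try {
                for (byte[] p = queue.take(); p != END; p = queue.take()) {
                    out.write(p);
                }
                out.flush();
            } catch (IOException e) {
                // Error recovery is out of scope for this sketch.
                throw new UncheckedIOException(e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.start();
        return t;
    }

    // Hands every incoming packet to both writers, then waits for both to finish.
    // The bounded queues (size 16 is an arbitrary assumption) let the disk write
    // and the downstream send overlap instead of running one after the other.
    static void relay(List<byte[]> packets, OutputStream localDisk, OutputStream downstream) {
        BlockingQueue<byte[]> diskQueue = new ArrayBlockingQueue<>(16);
        BlockingQueue<byte[]> netQueue = new ArrayBlockingQueue<>(16);
        Thread diskWriter = startWriter(diskQueue, localDisk, "disk-writer");
        Thread forwarder = startWriter(netQueue, downstream, "forwarder");
        try {
            for (byte[] p : packets) {
                diskQueue.put(p);
                netQueue.put(p);
            }
            diskQueue.put(END);
            netQueue.put(END);
            diskWriter.join();
            forwarder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("relay interrupted", e);
        }
    }

    public static void main(String[] args) {
        // In-memory streams stand in for the local block file and the next datanode.
        ByteArrayOutputStream disk = new ByteArrayOutputStream();
        ByteArrayOutputStream next = new ByteArrayOutputStream();
        List<byte[]> packets = Arrays.asList(
                "block-".getBytes(StandardCharsets.US_ASCII),
                "data".getBytes(StandardCharsets.US_ASCII));
        relay(packets, disk, next);
        System.out.println(new String(disk.toByteArray(), StandardCharsets.US_ASCII));
        System.out.println(new String(next.toByteArray(), StandardCharsets.US_ASCII));
    }
}
```

In this sketch the receiving side never waits for one write to complete before starting the other; it only blocks when a queue is full. A real datanode would also need acknowledgements flowing back up the pipeline and handling for a writer that fails mid-stream, both of which are left out here.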