Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-dev@hadoop.apache.org
Message-ID: <47D8430C.8060702@yahoo-inc.com>
Date: Wed, 12 Mar 2008 13:54:37 -0700
From: Sanjay Radia <sradia@yahoo-inc.com>
User-Agent: Thunderbird 2.0.0.4 (Macintosh/20070604)
MIME-Version: 1.0
To: core-dev@hadoop.apache.org
CC: hadoop-dev@lucene.apache.org
Subject: Re: Multiplexing sockets in DFSClient/datanodes?
References: <C3FD7A75.22A2%hairong@yahoo-inc.com>
In-Reply-To: <C3FD7A75.22A2%hairong@yahoo-inc.com>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit

Hairong Kuang wrote:
>> streaming IO lets us pipe large amounts
>> of data without the request/response exchange.
>> The worry was that IO performance would degrade.
>>     
>
> Since HADOOP-2188 removes the IPC timeout, it is OK for a datanode to respond
> to the datanode up in the pipeline when it gets a response from the datanode
> down in the pipeline. If datanodes could have two threads, one pushing data
> down the pipeline and one writing it to the local disk, using RPC won't
> introduce any additional communication cost.
>   

I believe that is what our pipeline code does.
The client, however, will block for the reply unless we change the client
code to have multiple buffers, etc.
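
To make the "multiple buffers" idea concrete, here is a rough sketch of a
writer that keeps several packets outstanding and blocks only when the
window is full, rather than waiting for a reply after every packet. This is
not the DFSClient code; LoopbackDatanode, WindowedWriter, and the window
parameter are hypothetical names made up for illustration, and the
"datanode" is just an in-process thread that acks whatever it receives.

```python
import queue
import threading


class LoopbackDatanode:
    """Hypothetical stand-in for a datanode: acks each packet it receives."""

    def __init__(self):
        self.inbox = queue.Queue()   # (seq, data) packets from the client
        self.acks = queue.Queue()    # sequence numbers acked back to the client
        self.received = []
        threading.Thread(target=self._serve, daemon=True).start()

    def _serve(self):
        while True:
            seq, data = self.inbox.get()
            if seq is None:          # sentinel: stop serving
                return
            self.received.append(data)
            self.acks.put(seq)


class WindowedWriter:
    """Sends packets without waiting for each ack; at most `window` in flight."""

    def __init__(self, node, window=4):
        self.node = node
        self.window = window
        self.slots = threading.BoundedSemaphore(window)
        self.acked = []
        self.next_seq = 0
        self.in_flight = 0
        self.max_in_flight = 0
        self._lock = threading.Lock()
        threading.Thread(target=self._read_acks, daemon=True).start()

    def _read_acks(self):
        # Separate thread drains acks so write() never waits on a single reply.
        while True:
            seq = self.node.acks.get()
            if seq is None:          # sentinel: stop reading
                return
            with self._lock:
                self.in_flight -= 1
            self.acked.append(seq)
            self.slots.release()     # free one buffer slot

    def write(self, data):
        self.slots.acquire()         # blocks only when the window is full
        with self._lock:
            self.in_flight += 1
            self.max_in_flight = max(self.max_in_flight, self.in_flight)
        self.node.inbox.put((self.next_seq, data))
        self.next_seq += 1

    def close(self):
        # Taking every slot means every outstanding packet has been acked.
        for _ in range(self.window):
            self.slots.acquire()
        self.node.inbox.put((None, None))
        self.node.acks.put(None)
```

With window=1 this degenerates to the blocking request/response behaviour;
a larger window trades client memory for fewer stalls.
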
> Hairong
>
> On 3/12/08 11:35 AM, "Sanjay Radia" <sradia@yahoo-inc.com> wrote:
>
>   
>> Doug Cutting wrote:
>>     
>>> Jim Kellerman wrote:
>>>       
>>>> Yes, multiplexing a socket is more complicated than having one socket
>>>> per file, but saving system resources seems like a way to scale.
>>>>
>>>> Questions? Comments? Opinions? Flames?
>>>>         
>>> Note that Hadoop RPC already multiplexes, sharing a single socket per
>>> pair of JVMs.  It would be possible to multiplex datanode, and should
>>> not in theory significantly impact performance, but, as you indicate,
>>> it would be a significant change.  One approach might be to implement
>>> HDFS data access using RPC rather than directly using stream i/o.
>>>
>>> RPC also tears down idle connections, which HDFS does not.  I wonder
>>> how much doing that alone might help your case?  That would probably
>>> be much simpler to implement.  Both client and server must already
>>> handle connection failures, so it shouldn't be too great of a change
>>> to have one or both sides actively close things down if they're idle
>>> for more than a few seconds.  This is related to adding write timeouts
>>> to the datanode (HADOOP-2346).
>>>       
>> Doug,
>>    Dhruba and I had discussed using RPC in the past. While RPC is a
>> cleaner interface and our RPC implementation has
>> features such as connection sharing, closing idle connections, etc.,
>> streaming IO lets us pipe large amounts
>> of data without the request/response exchange.
>> The worry was that IO performance would degrade.
>> BTW, NFS uses RPC (NFS does not have the write pipeline for replicas).
>>
>> sanjay
>>     
>>> Doug
>>>       
>
>
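
For reference, the idle-connection teardown suggested in the quoted text
could be sketched roughly as below. This is only an illustration under
assumptions, not how Hadoop RPC or the datanode implements it: IdleReaper,
touch, reap, and max_idle are hypothetical names, and a connection is
assumed to be any hashable object with a close() method.

```python
import time


class IdleReaper:
    """Tracks last-use times and closes connections idle longer than max_idle."""

    def __init__(self, max_idle=5.0, clock=time.monotonic):
        self.max_idle = max_idle     # seconds a connection may sit unused
        self.clock = clock           # injectable so the logic can be tested
        self.last_used = {}          # connection -> time of last activity

    def touch(self, conn):
        # Call on every read or write to mark the connection as active.
        self.last_used[conn] = self.clock()

    def reap(self):
        # Close and forget every connection idle for more than max_idle.
        now = self.clock()
        idle = [c for c, t in self.last_used.items() if now - t > self.max_idle]
        for conn in idle:
            del self.last_used[conn]
            conn.close()
        return idle
```

Either side could call reap() periodically; since both client and server
must already handle connection failures, the peer would treat the close like
any other dropped connection and reconnect on the next request.
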