hadoop-common-user mailing list archives

From "heyongqiang" <heyongqi...@software.ict.ac.cn>
Subject Re: Re: understanding of client connection code
Date Mon, 23 Jun 2008 06:38:08 GMT
hehe

    I noticed that in DFSClient's DataStreamer thread, the run method sends data
out while synchronized on dataQueue. Is this really needed?
I mean, the remove, wait, and getFirst calls on dataQueue should be synchronized on dataQueue, but
does the lock need to be held while one packet is being sent out?
I doubt it. Can any developer give me a reason for doing that?
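
To make the question concrete, here is a minimal sketch of the alternative I have in mind: take the packet off the queue under the lock, then send it after the lock is released. All names here (StreamerSketch, enqueue, runStreamer, send) are made up for illustration and are not the real DFSClient code; the "send" only records the packet instead of writing to a socket.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Hypothetical stand-in for a streamer thread; not the actual DFSClient.DataStreamer.
class StreamerSketch {
    private final LinkedList<byte[]> dataQueue = new LinkedList<>();
    private final List<byte[]> sent = new ArrayList<>(); // touched only by the streamer thread
    private boolean closed = false;                      // guarded by dataQueue

    // Producer side: enqueue under the lock and wake the streamer.
    void enqueue(byte[] packet) {
        synchronized (dataQueue) {
            dataQueue.addLast(packet);
            dataQueue.notifyAll();
        }
    }

    // Signal that no more packets will arrive.
    void close() {
        synchronized (dataQueue) {
            closed = true;
            dataQueue.notifyAll();
        }
    }

    // Consumer side: the lock is held only while the queue itself is touched.
    void runStreamer() throws InterruptedException {
        while (true) {
            byte[] packet;
            synchronized (dataQueue) {
                while (dataQueue.isEmpty() && !closed) {
                    dataQueue.wait();
                }
                if (dataQueue.isEmpty()) {
                    return; // closed and fully drained
                }
                packet = dataQueue.removeFirst();
            }
            send(packet); // lock already released, so producers are not blocked here
        }
    }

    // Assumption: stands in for the socket write.
    private void send(byte[] packet) {
        sent.add(packet);
    }

    // Safe to call after the streamer thread has been joined.
    int sentCount() {
        return sent.size();
    }

    public static void main(String[] args) throws Exception {
        StreamerSketch s = new StreamerSketch();
        Thread t = new Thread(() -> {
            try {
                s.runStreamer();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        for (int i = 0; i < 5; i++) {
            s.enqueue(new byte[] {(byte) i});
        }
        s.close();
        t.join();
        System.out.println("packets sent: " + s.sentCount());
    }
}
```

This sketch does not model anything else the real code may do under the same lock (for example, bookkeeping that must stay in step with the send), so it only shows the locking pattern I am asking about, not a proven fix.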




heyongqiang
2008-06-23



From: hong
Sent: 2008-06-21 10:10:59
To: core-user@hadoop.apache.org
Cc: 
Subject: Re: understanding of client connection code

Brother, are you from 余海燕's team?

On 2008-6-20, at 5:00 PM, heyongqiang wrote:

> An ipc.Client object is designed to be shared across threads, and
> each thread can only make synchronous RPC calls, which means each
> thread calls and then waits for a result or an error. This is implemented with a
> neat technique: each thread makes a distinct call (with its own call
> object), and the user thread then waits on its call object, which is
> later notified by the connection's receiver thread. The user thread
> makes a call by first adding its call object to the call list, which
> is later used by the response receiver, and then synchronizing on the
> connection's socket output stream while it waits to write its call out.
> The connection's thread runs to collect responses on
> behalf of all user threads.
> What I have not mentioned yet is that Client actually maintains a
> connection table.
> In every Client object, a connection culler runs in the background as a
> daemon, whose sole purpose is to remove idle connections from the
> connection table.
> It seems, however, that this culler thread does not close the socket
> associated with the connection; it only sets a mark and does a notify. All
> the cleanup is handled by the connection thread itself. This is
> really a wonderful design! Even though the culler thread can cull the
> connection from the table, the connection thread also includes
> removal code. That is because there is a chance that the connection
> thread will encounter an exception.
>
> The above is a brief summary of my understanding of Hadoop's IPC
> code.
> Below is the result of a test that measures the data
> throughput of Hadoop:
> +--------------+------------------+
> | threadCounts | avg(averageRate) |
> +--------------+------------------+
> |            1 |   53030539.48913 |
> |            2 |  35325499.583756 |
> |            3 |  24998284.969072 |
> |            4 |   19824934.28125 |
> |            5 |  15956391.489583 |
> |            6 |  15948640.175532 |
> |            7 |  14623977.375691 |
> |            8 |  16098080.160131 |
> |            9 |  8967970.3877005 |
> |           10 |  14569087.178947 |
> |           11 |  8962683.6662088 |
> |           12 |  20063735.297872 |
> |           13 |  13174481.053977 |
> |           14 |  10137907.034188 |
> |           15 |  6464513.2013889 |
> |           16 |   23064338.76087 |
> |           17 |   18688537.44385 |
> |           18 |  18270909.854317 |
> |           19 |  13086261.536538 |
> |           20 |  10784059.367347 |
> +--------------+------------------+
>
> The first column is the number of threads in my test
> application; the second column is the average download rate. It
> seems the download rate drops sharply as the thread count increases.
> This is a very simple test application. Can anyone tell me why? Where
> is the bottleneck when a user application uses multiple threads?
>
>
>
>
> heyongqiang
> 2008-06-20
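
The call-and-wait pattern described in the quoted summary can be sketched roughly as follows. This is only an illustration of the idea, not the real ipc.Client: the class and method names (CallSketch, Call, call, startReceiver) are hypothetical, an in-memory queue stands in for the socket, and the "server" simply echoes the parameter back.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of one shared client with per-call wait/notify.
class CallSketch {
    // One Call object per request; the calling thread waits on it.
    static final class Call {
        final int id;
        private String value;
        private boolean done;

        Call(int id) {
            this.id = id;
        }

        // Called by the receiver thread when the response arrives.
        synchronized void complete(String v) {
            value = v;
            done = true;
            notify();
        }

        // Called by the user thread; blocks until complete() has run.
        synchronized String await() throws InterruptedException {
            while (!done) {
                wait();
            }
            return value;
        }
    }

    private final Map<Integer, Call> calls = new ConcurrentHashMap<>();
    private final AtomicInteger counter = new AtomicInteger();
    private final Object out = new Object(); // stands in for the socket output stream
    // Assumption: fake network carrying {id, payload} pairs.
    private final BlockingQueue<String[]> wire = new LinkedBlockingQueue<>();

    // User threads: register the call, write it out one writer at a time, then wait.
    String call(String param) throws InterruptedException {
        Call call = new Call(counter.getAndIncrement());
        calls.put(call.id, call); // register before sending so the receiver can find it
        synchronized (out) {
            wire.put(new String[] {Integer.toString(call.id), param});
        }
        return call.await();
    }

    // A single receiver thread collects responses on behalf of all callers.
    void startReceiver() {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String[] msg = wire.take();
                    Call c = calls.remove(Integer.parseInt(msg[0]));
                    if (c != null) {
                        c.complete("echo:" + msg[1]);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.setDaemon(true);
        t.start();
    }

    public static void main(String[] args) throws Exception {
        CallSketch client = new CallSketch();
        client.startReceiver();
        System.out.println(client.call("ping")); // prints "echo:ping"
    }
}
```

In this sketch every caller serializes on the single `out` lock and every response goes through the single receiver thread. Whether that kind of shared path explains the throughput numbers in the table above is only a guess and would need profiling to confirm.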