From: lei liu <liulei412@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 29 Mar 2013 14:04:56 +0800
Subject: Re: DFSOutputStream.sync() method latency time

The sync method includes the code below:

    // Flush only if we haven't already flushed till this offset.
    if (lastFlushOffset != bytesCurBlock) {
      assert bytesCurBlock > lastFlushOffset;
      // record the valid offset of this flush
      lastFlushOffset = bytesCurBlock;
      enqueueCurrentPacket();
    }

When there is 64 KB of data in memory, the write method calls
enqueueCurrentPacket to send one packet to the pipeline. But when there
is less than 64 KB of data in memory, the write method does not call
enqueueCurrentPacket, so it sends nothing to the pipeline. When the
client then calls the sync method, sync calls enqueueCurrentPacket to
send the data to the pipeline and waits for the ack.

2013/3/29 Yanbo Liang <yanbohappy@gmail.com>

> "The write method writes data to the client's memory, the sync method
> sends the packet to the pipeline"
> I think you have misunderstood the write procedure of HDFS.
>
> It is right that the write method writes data to the client's memory.
> However, the data in client memory is sent to the DataNodes as it
> fills the client memory. This is done by another thread, so it is a
> concurrent operation.
>
> The sync method does the same thing, except that it applies to the
> last packet in the stream. It waits until it has received the ack
> from the DataNodes.
>
> The write method and the sync method are not concurrent with each
> other. Each of them is concurrent with the background thread that
> transfers data to the DataNodes.
>
> I guess you can read Chinese, so I recommend one of my blog posts
> (http://yanbohappy.sinaapp.com/?p=143), which explains the write
> workflow in detail.
>
> 2013/3/29 lei liu <liulei412@gmail.com>
>
>> Thanks Yanbo for your reply.
>>
>> My test code is:
>>
>> FSDataOutputStream outputStream = fs.create(path);
>> Random r = new Random();
>> long totalBytes = 0;
>> String str = new String(new byte[1024]);
>> while (totalBytes < 1024 * 1024 * 500) {
>>     byte[] bytes = ("start_" + r.nextLong() + "_" + str
>>             + r.nextLong() + "_end" + "\n").getBytes();
>>     outputStream.write(bytes);
>>     outputStream.sync();
>>     totalBytes = totalBytes + bytes.length;
>> }
>> outputStream.close();
>>
>> The write method and the sync method are synchronized, so the two
>> methods are not concurrent.
>>
>> The write method writes data to the client's memory, the sync method
>> sends the packet to the pipeline, and the client can execute the
>> write method again only after the sync method returns successfully.
>> So I think the sync method latency should equal the sum of the
>> individual DataNode operations.
>>
>> 2013/3/28 Yanbo Liang <yanbohappy@gmail.com>
>>
>>> First, when a client wants to write data to HDFS, it creates a
>>> DFSOutputStream. The client then writes data to this output stream,
>>> and the stream transfers the data to all DataNodes through the
>>> constructed pipeline in packets of 64 KB.
>>> These two operations are concurrent, so the write latency is not a
>>> simple sum.
>>>
>>> Second, the sync method only flushes the last packet (at most
>>> 64 KB) of data to the pipeline.
>>>
>>> Because all these operations are processed concurrently, the
>>> latency is smaller than the sum of the individual operations.
>>> In a sense it is parallel rather than serial computing.
>>>
>>> 2013/3/28 lei liu <liulei412@gmail.com>
>>>
>>>> When a client writes data with three replicas, the sync method
>>>> latency formula should be:
>>>>
>>>> sync method latency = first DataNode receive time
>>>>                     + second DataNode receive time
>>>>                     + third DataNode receive time
>>>>
>>>> If each of the three DataNodes takes 2 milliseconds to receive
>>>> the data, the sync method latency should be 6 milliseconds, but
>>>> according to our monitoring the sync method latency is
>>>> 2 milliseconds.
>>>>
>>>> How should the sync method latency be calculated?
>>>>
>>>> Thanks,
>>>>
>>>> LiuLei
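On the original question at the bottom of this thread, a toy calculation may help show why 2 ms rather than 6 ms is plausible. It is only a sketch under a strong assumption that nobody in the thread states explicitly: each DataNode forwards a packet downstream as soon as it starts receiving it, so the three receive times overlap almost completely, and network hop and ack propagation time are ignored. The class and method names are made up for illustration and are not part of Hadoop.

```java
import java.util.Arrays;

// Toy latency model for one packet in a replication pipeline.
// Assumption (hypothetical, for illustration only): every DataNode
// forwards the packet downstream immediately, so the per-node
// receive times overlap instead of adding up.
class SyncLatencySketch {

    // Serial model: each node finishes before the next one starts.
    static long serialLatencyMs(long[] receiveMs) {
        return Arrays.stream(receiveMs).sum();
    }

    // Overlapped model: all nodes work at the same time, so the ack
    // can come back once the slowest node is done.
    static long overlappedLatencyMs(long[] receiveMs) {
        return Arrays.stream(receiveMs).max().orElse(0L);
    }

    public static void main(String[] args) {
        long[] receiveMs = {2, 2, 2}; // three replicas, 2 ms each, as in the question
        System.out.println("serial:     " + serialLatencyMs(receiveMs) + " ms");
        System.out.println("overlapped: " + overlappedLatencyMs(receiveMs) + " ms");
    }
}
```

With the numbers from the question, the serial model gives 6 ms and the overlapped model gives 2 ms, which is the value the monitoring reported. A real pipeline would presumably land somewhat above the overlapped figure once network transfer and ack time are counted.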

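The write/sync behaviour described in the top reply of this message can be sketched as a small simulation. This is not the DFSOutputStream implementation: the class PacketBufferSketch, its in-memory queue, and the write(int) signature are hypothetical stand-ins. Only the names lastFlushOffset, bytesCurBlock and enqueueCurrentPacket and the 64 KB packet size come from the thread.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for the client-side buffering discussed in this thread.
// Hypothetical sketch: no network, no acks, no block boundaries.
class PacketBufferSketch {
    static final int PACKET_SIZE = 64 * 1024; // packet size named in the thread

    // Stand-in for the queue that feeds the DataNode pipeline.
    final List<Integer> queuedPacketSizes = new ArrayList<>();
    int currentPacketBytes = 0; // bytes buffered in the packet being filled
    long bytesCurBlock = 0;     // total bytes written so far
    long lastFlushOffset = 0;   // offset covered by the last sync()

    // write(): buffer bytes and enqueue a packet only when it is full.
    void write(int len) {
        while (len > 0) {
            int n = Math.min(len, PACKET_SIZE - currentPacketBytes);
            currentPacketBytes += n;
            bytesCurBlock += n;
            len -= n;
            if (currentPacketBytes == PACKET_SIZE) {
                enqueueCurrentPacket();
            }
        }
    }

    // sync(): enqueue the partial packet if anything is new since the last flush.
    void sync() {
        if (lastFlushOffset != bytesCurBlock) {
            lastFlushOffset = bytesCurBlock;
            enqueueCurrentPacket();
        }
        // The real method would now wait for acks from the DataNodes.
    }

    void enqueueCurrentPacket() {
        if (currentPacketBytes == 0) {
            return; // nothing buffered, nothing to queue in this sketch
        }
        queuedPacketSizes.add(currentPacketBytes);
        currentPacketBytes = 0;
    }

    public static void main(String[] args) {
        PacketBufferSketch out = new PacketBufferSketch();
        out.write(1000); // below 64 KB: nothing is queued yet
        System.out.println(out.queuedPacketSizes); // prints []
        out.sync();      // sync pushes the partial packet
        System.out.println(out.queuedPacketSizes); // prints [1000]
    }
}
```

In the test loop quoted earlier, each record is a little over 1 KB and is followed by sync(), so under this sketch every iteration would queue one small packet instead of waiting for 64 KB to accumulate.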