Date: Thu, 28 Mar 2013 23:40:05 +0800
Subject: Re: DFSOutputStream.sync() method latency time
From: Yanbo Liang
To: user@hadoop.apache.org

First, when a client wants to write data to HDFS, it creates a DFSOutputStream. The client then writes data to this output stream, and the stream transfers the data to all DataNodes through the constructed pipeline in packets of 64 KB. These two operations are concurrent, so the write latency is not a simple sum of the individual steps.

Second, the sync method only flushes the last packet (at most 64 KB) of data to the pipeline.

Because all of these operations are processed concurrently, the latency is smaller than the sum of the latencies of the individual operations. In a sense, it is parallel rather than serial processing.

2013/3/28 lei liu <liulei412@gmail.com>

> When a client writes data with three replicas, the sync method latency
> formula should be:
> sync method latency = first datanode receive time + second datanode
> receive time + third datanode receive time.
>
> If each of the three datanodes takes 2 milliseconds to receive the data,
> the sync method latency should be 6 milliseconds, but according to our
> monitoring, the sync method latency is 2 milliseconds.
>
> How is the sync method latency calculated?
>
> Thanks,
>
> LiuLei
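To make the arithmetic in this thread concrete, here is a minimal sketch that compares the serial model from the question with a fully overlapped model of the kind the reply describes. It is a toy calculation, not HDFS code: the class name SyncLatencyModel and both method names are hypothetical, and the overlapped model assumes an idealized case in which all three datanodes receive the packet at the same time, ignoring network hops and acknowledgement propagation.

```java
import java.util.Arrays;

// Toy latency model for the thread above; names are hypothetical, not HDFS APIs.
public class SyncLatencyModel {

    // Serial model from the question: each datanode starts only after the
    // previous one has finished, so the per-node times add up.
    static long serialLatencyMs(long[] perNodeMs) {
        return Arrays.stream(perNodeMs).sum();
    }

    // Idealized overlapped model (an assumption): all datanodes receive the
    // packet concurrently, so the slowest node bounds the total latency.
    static long overlappedLatencyMs(long[] perNodeMs) {
        return Arrays.stream(perNodeMs).max().orElse(0L);
    }

    public static void main(String[] args) {
        long[] receiveMs = {2, 2, 2}; // three replicas, 2 ms each, as in the question
        System.out.println("serial: " + serialLatencyMs(receiveMs) + " ms");
        System.out.println("overlapped: " + overlappedLatencyMs(receiveMs) + " ms");
    }
}
```

With the numbers from the question, the serial model gives 6 ms and the overlapped model gives 2 ms, which matches the monitored value. A real pipeline would likely sit somewhere between the two, since forwarding between datanodes and returning acknowledgements add some delay.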