Subject: Re: distributed cache
From: Lin Ma <linlma@gmail.com>
To: user@hadoop.apache.org, Harsh J
Date: Wed, 26 Dec 2012 18:11:15 +0800

Thanks Harsh,

Supposing the DistributedCache file is uploaded by the client: in the
Hadoop design, can each replica serve only one download session at a time
(a download from a mapper or reducer that needs the file), blocking until
that download completes, or can it serve multiple concurrent download
sessions in parallel (downloads from several mappers or reducers at once)?

regards,
Lin

On Wed, Dec 26, 2012 at 4:51 PM, Harsh J <harsh@cloudera.com> wrote:
> Hi Lin,
>
> DistributedCache files are stored onto HDFS by the client first. The
> TaskTrackers then download and localize them. Therefore, as with any
> other file on HDFS, "downloads" can proceed efficiently in parallel
> when there are more replicas.
>
> The point of having higher replication for these files is also tied to
> the concept of racks in a cluster - you would want more replicas spread
> across racks so that on task bootup the downloads happen with rack
> locality.
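
(To make the flow Harsh describes concrete, here is a minimal sketch
against the classic Hadoop 1.x org.apache.hadoop.filecache.DistributedCache
API; the HDFS path /user/lin/lookup.dat and the class names are
placeholders of mine, not anything from this thread.)

    import java.io.IOException;
    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.filecache.DistributedCache;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    public class CacheFlowSketch {

        public static class CacheAwareMapper
                extends Mapper<LongWritable, Text, Text, Text> {
            @Override
            protected void setup(Context context)
                    throws IOException, InterruptedException {
                // By the time setup() runs, this node's TaskTracker has
                // already downloaded the file from HDFS and localized it;
                // these are local-disk paths, not HDFS paths.
                Path[] local = DistributedCache
                        .getLocalCacheFiles(context.getConfiguration());
                if (local != null) {
                    for (Path p : local) {
                        System.err.println("localized cache file: " + p);
                    }
                }
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = new Job(conf, "cache-flow-sketch");
            job.setJarByClass(CacheFlowSketch.class);
            job.setMapperClass(CacheAwareMapper.class);
            // Client side: the file must already exist in HDFS. This call
            // only records the URI in the job configuration; each
            // TaskTracker then fetches its own copy once, before the
            // first task of this job runs on that node.
            DistributedCache.addCacheFile(
                    new URI("/user/lin/lookup.dat"), job.getConfiguration());
            // Input/output paths and formats omitted for brevity; a real
            // job would set them before submitting.
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

With higher replication on the cached file, the TaskTrackers spread their
downloads across more DataNodes, which is exactly the r vs. n/r trade-off
discussed below.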


> On Sat, Dec 22, 2012 at 6:54 PM, Lin Ma <linlma@gmail.com> wrote:
> > Hi Kai,
> >
> > Smart answer! :-)
> >
> > The assumption you have is that one distributed cache replica can only
> > serve one download session for a TaskTracker node at a time (this is
> > why you get concurrency n/r). The question is, why can't one
> > distributed cache replica serve multiple concurrent download sessions?
> > For example, supposing a TaskTracker takes elapsed time t to download
> > a file from a specific distributed cache replica, it may be possible
> > for 2 TaskTrackers to download from that replica in parallel in
> > elapsed time t as well, or 1.5t, which would be faster than the
> > sequential download time 2t you mentioned before. "In total, r+n/r
> > concurrent operations. If you optimize r depending on n, SQRT(n) is
> > the optimal replication level." -- how do you get SQRT(n) as the
> > minimizer of r+n/r? I'd appreciate it if you could point me to more
> > details.
> >
> > regards,
> > Lin
> >
> > On Sat, Dec 22, 2012 at 8:51 PM, Kai Voigt <k@123.org> wrote:
> >>
> >> Hi,
> >>
> >> Simple math: assume you have n TaskTrackers in your cluster that will
> >> need to access the files in the distributed cache, and that r is the
> >> replication level of those files.
> >>
> >> Copying the files into HDFS requires r copy operations over the
> >> network. The n TaskTrackers need to get their local copies from HDFS,
> >> so the n TaskTrackers copy from r DataNodes, giving n/r concurrent
> >> operations per replica. In total, r+n/r concurrent operations. If you
> >> optimize r depending on n, SQRT(n) is the optimal replication level.
> >> So 10 is a reasonable default setting for most clusters that are not
> >> 500+ nodes big.
> >>
> >> Kai
> >>
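
(To spell out the minimization step Lin asks about earlier in the thread,
a sketch that simply takes Kai's cost model of r + n/r concurrent
operations at face value:)

    f(r) = r + \frac{n}{r}
    f'(r) = 1 - \frac{n}{r^{2}} = 0 \quad\Rightarrow\quad r^{2} = n
           \quad\Rightarrow\quad r = \sqrt{n}
    f''(r) = \frac{2n}{r^{3}} > 0, \quad\text{so } r = \sqrt{n}
           \text{ is a minimum, with } f(\sqrt{n}) = 2\sqrt{n}.

For n = 100 TaskTrackers this gives r = 10, matching the default of 10
that Kai cites. Note the model assumes each replica serves its n/r
downloads sequentially, which is exactly the assumption Lin questions.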

> >> On 22.12.2012 at 13:46, Lin Ma <linlma@gmail.com> wrote:
> >>
> >> Thanks Kai - what is the purpose of using a higher replication count?
> >>
> >> regards,
> >> Lin
> >>
> >> On Sat, Dec 22, 2012 at 8:44 PM, Kai Voigt <k@123.org> wrote:
> >>>
> >>> Hi,
> >>>
> >>> On 22.12.2012 at 13:03, Lin Ma <linlma@gmail.com> wrote:
> >>>
> >>> > I want to confirm that when a mapper or reducer on a task node
> >>> > accesses a distributed cache file, the file resides on disk, not
> >>> > in memory. I just want to make sure the distributed cache file is
> >>> > not fully loaded into memory, where it would compete for memory
> >>> > with the mapper/reducer tasks. Is that correct?
> >>>
> >>>
> >>> Yes, you are correct. The JobTracker will put files for the
> >>> distributed cache into HDFS with a higher replication count (10 by
> >>> default). Whenever a TaskTracker needs those files for a task it is
> >>> launching locally, it will fetch a copy to its local disk. So it
> >>> won't need to do this again for future tasks on this node. After a
> >>> job is done, all local copies and the HDFS copies of files in the
> >>> distributed cache are cleaned up.
> >>>
> >>> Kai
> >>>
> >>> --
> >>> Kai Voigt
> >>> k@123.org
> >>>
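
(For reference: the "10 by default" Kai describes corresponds, as far as
I can tell, to the classic Hadoop 1.x property mapred.submit.replication,
whose documented guidance is roughly the square root of the node count.
A hypothetical per-job override might look like the sketch below; the
value 20 is an illustration for a cluster of around 400 nodes, not a
recommendation.)

    import org.apache.hadoop.conf.Configuration;

    public class SubmitReplicationSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Replication requested for the job files the client stages
            // into HDFS (default 10); sqrt(400) = 20 for ~400 nodes.
            conf.setInt("mapred.submit.replication", 20);
            System.out.println(conf.getInt("mapred.submit.replication", 10));
        }
    }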
> >>
> >> --
> >> Kai Voigt
> >> k@123.org
> >>
> >
>
> --
> Harsh J
