Subject: Re: DistributedCache
From: Shahab Yunus <shahab.yunus@gmail.com>
To: user@hadoop.apache.org
Date: Thu, 11 Dec 2014 23:25:01 -0500

Look at this thread; it covers the alternatives to DistributedCache:

http://stackoverflow.com/questions/21239722/hadoop-distributedcache-is-deprecated-what-is-the-preferred-api

Basically, you can use the new method job.addCacheFile() to pass files on to the individual tasks.
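For example, here is a minimal, untested sketch of the new-API usage; the class names, the lookup-file path, and the placeholder map logic are made up for illustration:

import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class CacheFileExample {

  public static class CacheAwareMapper
      extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context)
        throws IOException, InterruptedException {
      // New-API replacement for DistributedCache.getCacheFiles():
      // returns the URIs registered with job.addCacheFile() in the driver.
      URI[] cacheFiles = context.getCacheFiles();
      if (cacheFiles != null) {
        for (URI uri : cacheFiles) {
          System.out.println("cache file visible to this task: " + uri);
        }
      }
    }

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      context.write(new Text("line"), value); // placeholder map logic
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "cache-file-example");
    job.setJarByClass(CacheFileExample.class);
    job.setMapperClass(CacheAwareMapper.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);

    // New-API replacement for DistributedCache.addCacheFile():
    // the file must already exist on HDFS; this path is only a placeholder.
    job.addCacheFile(new URI("/tmp/lookup.txt"));

    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Note that the file handed to addCacheFile() has to be on HDFS already; each task then sees it via the URIs returned by getCacheFiles() (or its localized copy in the task's working directory).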
Regards,
Shahab

On Thu, Dec 11, 2014 at 9:07 PM, Srinivas Chamarthi <srinivas.chamarthi@gmail.com> wrote:

> Hi,
>
> I want to cache map/reduce temporary output files so that I can compare
> two map results coming from two different nodes to verify the integrity
> check.
>
> I am simulating this use case with speculative execution by rescheduling
> the first task as soon as it is started and running.
>
> Now I want to compare the output files coming from the speculative attempt
> and the prior attempt so that I can calculate the credit score of each node.
>
> I want to use DistributedCache to cache the local file system files in the
> CommitPending stage from TaskImpl. But DistributedCache is deprecated.
> Is there any other way I can do this?
>
> I think I could use HDFS to save the temporary output files so that other
> nodes can see them, but is there an in-memory solution I could use instead?
>
> Any pointers are greatly appreciated.
>
> thx & rgds,
> srinivas chamarthi