Subject: Re: Problem using distributed cache
From: Peter Cogan <peter.cogan@gmail.com>
To: user@hadoop.apache.org
Date: Fri, 7 Dec 2012 15:22:38 +0000

Hi Dhaval & Harsh,

thanks for coming back to the thread - you're both right, I was doing things
in the wrong order. I hadn't realised that the Job constructor clones the
configuration - that's very interesting!

thanks again
Peter

On Fri, Dec 7, 2012 at 2:25 PM, Harsh J wrote:
> Please try using job.getConfiguration() instead of the pre-job conf
> instance, because the constructor clones it.
>
> On Fri, Dec 7, 2012 at 7:36 PM, Peter Cogan wrote:
> > Hi,
> >
> > any thoughts on this would be much appreciated
> >
> > thanks
> > Peter
> >
> > On Thu, Dec 6, 2012 at 9:29 PM, Peter Cogan wrote:
> >> Hi,
> >>
> >> It's an instance created at the start of the program, like this:
> >>
> >> public static void main(String[] args) throws Exception {
> >>
> >>     Configuration conf = new Configuration();
> >>
> >>     Job job = new Job(conf, "wordcount");
> >>
> >>     DistributedCache.addCacheFile(
> >>         new URI("/user/peter/cacheFile/testCache1"), conf);
> >>
> >> On Thu, Dec 6, 2012 at 5:02 PM, Harsh J wrote:
> >>> What is your conf object there? Is it job.getConfiguration() or an
> >>> independent instance?
> >>>
> >>> On Thu, Dec 6, 2012 at 10:29 PM, Peter Cogan wrote:
> >>> > Hi,
> >>> >
> >>> > I want to use the distributed cache to allow my mappers to access
> >>> > data. In main, I'm using the command
> >>> >
> >>> > DistributedCache.addCacheFile(
> >>> >     new URI("/user/peter/cacheFile/testCache1"), conf);
> >>> >
> >>> > where /user/peter/cacheFile/testCache1 is a file that exists in HDFS.
> >>> >
> >>> > Then, my setup function looks like this:
> >>> >
> >>> > public void setup(Context context)
> >>> >         throws IOException, InterruptedException {
> >>> >     Configuration conf = context.getConfiguration();
> >>> >     Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
> >>> >     //etc
> >>> > }
> >>> >
> >>> > However, this localFiles array is always null.
> >>> >
> >>> > I was initially running on a single-host cluster for testing, but I
> >>> > read that this will prevent the distributed cache from working. I
> >>> > tried with a pseudo-distributed setup, but that didn't work either.
> >>> >
> >>> > I'm using hadoop 1.0.3
> >>> >
> >>> > thanks
> >>> > Peter
> >>>
> >>> --
> >>> Harsh J
>
> --
> Harsh J
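Putting Harsh's fix together with Peter's original snippets, the corrected driver can be sketched as below. This is a minimal, untested sketch against the old Hadoop 1.x API the thread uses (`org.apache.hadoop.filecache.DistributedCache`); the class names `WordCountDriver` and `CacheAwareMapper` are placeholders, not from the thread. The key point is that `new Job(conf, ...)` clones `conf`, so the cache file must be registered on `job.getConfiguration()` after the Job is constructed, not on the original `conf`:

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.filecache.DistributedCache;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class WordCountDriver {

    public static class CacheAwareMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        @Override
        protected void setup(Context context)
                throws IOException, InterruptedException {
            Configuration conf = context.getConfiguration();
            // Non-null now, because the file was registered on the job's
            // cloned configuration rather than the stale pre-job one.
            Path[] localFiles = DistributedCache.getLocalCacheFiles(conf);
            // ... open localFiles[0] and load the side data here ...
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "wordcount"); // the constructor clones conf

        // Correct order: add the cache file to the job's own configuration.
        // Adding it to the original conf after this point has no effect.
        DistributedCache.addCacheFile(
                new URI("/user/peter/cacheFile/testCache1"),
                job.getConfiguration());

        job.setJarByClass(WordCountDriver.class);
        job.setMapperClass(CacheAwareMapper.class);
        // ... remaining job setup (output types, input/output paths) ...
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

The same pitfall applies to any setting made after Job construction: the Job works on its private copy of the Configuration, so all post-construction changes should go through `job.getConfiguration()`.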