From: Siddharth Dawar
Date: Tue, 7 Jun 2016 14:35:43 +0530
Subject: Accessing files in Hadoop 2.7.2 Distributed Cache
To: user@hadoop.apache.org

Hi,

I want to use the distributed cache to allow my mappers to access data in Hadoop 2.7.2. In my main method, I'm using the following code:

String hdfs_path = "hdfs://localhost:9000/bloomfilter";
InputStream in = new BufferedInputStream(
        new FileInputStream("/home/siddharth/Desktop/data/bloom_filter"));
Configuration conf = new Configuration();
fs = FileSystem.get(java.net.URI.create(hdfs_path), conf);
OutputStream out = fs.create(new Path(hdfs_path));

// Copy file from local to HDFS
IOUtils.copyBytes(in, out, 4096, true);
System.out.println(hdfs_path + " copied to HDFS");

DistributedCache.addCacheFile(new Path(hdfs_path).toUri(), conf2);

The above code adds a file present on my local file system to HDFS and adds it to the distributed cache.
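
For reference, the same copy-and-cache step can also be written against the Hadoop 2.x Job API, where DistributedCache is deprecated. The sketch below is illustrative only; the class name, the job name, and the reuse of the paths above are assumptions, not part of the original program:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;

// Minimal sketch (assumptions: class name, job name, and paths are illustrative).
public class CacheSetupSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String hdfsPath = "hdfs://localhost:9000/bloomfilter";

        // Copy the local file into HDFS (same effect as the manual stream copy above).
        FileSystem fs = FileSystem.get(URI.create(hdfsPath), conf);
        fs.copyFromLocalFile(new Path("/home/siddharth/Desktop/data/bloom_filter"),
                new Path(hdfsPath));

        // Register the HDFS file with the distributed cache on the job that is submitted.
        Job job = Job.getInstance(conf, "bloom-filter-job");
        job.addCacheFile(new URI(hdfsPath));
    }
}

The point this sketch assumes is that the cache file has to be registered on the same configuration/Job object that is later submitted.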

However, in my mapper code, when I try to access the file stored in the distributed cache, the Path[] p variable is null.

public void configure(JobConf conf) {
    this.conf = conf;
    try {
        Path[] p = DistributedCache.getLocalCacheFiles(conf);
    } catch (IOException e) {
        // TODO Auto-generated catch block
        e.printStackTrace();
    }
}
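
For comparison, a minimal sketch of the corresponding lookup in the new (org.apache.hadoop.mapreduce) mapper API, where the registered cache files are read in setup() from the task context; the mapper class name and its key/value types are illustrative assumptions:

import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Minimal sketch (assumptions: class name and key/value types are illustrative).
public class BloomFilterMapper extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        // URIs registered via job.addCacheFile(...); null if nothing was added to the cache.
        URI[] cacheFiles = context.getCacheFiles();
        if (cacheFiles != null) {
            for (URI uri : cacheFiles) {
                System.out.println("cached file: " + uri);
            }
        }
    }
}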
Even when I try to access the distributed cache with the following code in my mapper, I get an error saying the bloomfilter file doesn't exist:

strm = new DataInputStream(new FileInputStream("bloomfilter"));
// Read into our Bloom filter.
filter.readFields(strm);
strm.close();
However, I read somewhere that if we add a file to the distributed cache, we can access it directly by its name.
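
As I understand it, that by-name access relies on a symlink the framework creates in the task's working directory, named after the URI fragment (or the file's base name). A minimal sketch under that assumption, with an illustrative "#bloomfilter" fragment:

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.mapreduce.Job;

// Minimal sketch (assumptions: the "#bloomfilter" fragment and class name are illustrative).
public class ByNameAccessSketch {

    // Driver side: the part after '#' becomes the symlink name in each task's working directory.
    static void addToCache(Job job) throws Exception {
        job.addCacheFile(new URI("hdfs://localhost:9000/bloomfilter#bloomfilter"));
    }

    // Task side: with the symlink in place, the cached file can be opened by its plain name.
    static DataInputStream openByName() throws IOException {
        return new DataInputStream(new FileInputStream("bloomfilter"));
    }
}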

Can you please help me out?
