From: David Hawthorne
Subject: map.input.file in 20.1
Date: Mon, 12 Jul 2010 14:56:16 -0700
To: common-user@hadoop.apache.org

I'm trying to get the name of the file that the map job is operating on out of the Context passed to the setup function. It's proving harder than seems proper.

I've found several links via google on this topic, but I've seen no responses to previous questions.

We have this from July 17, 2009:

http://www.mail-archive.com/common-user@hadoop.apache.org/msg00535.html

I attempted that solution and javac complained about using a deprecated API.

It's very clearly spelled out in this doc:

http://hadoop.apache.org/common/docs/r0.20.1/mapred_tutorial.html

and yet the example source code for 20.1 is still using the mapred.* (deprecated) API that the prior link used as well.

For the record, here's what I've tried, in the hopes that someone will just paste back a working solution:

    import java.io.IOException;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.util.GenericOptionsParser;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.RecordWriter;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.reduce.IntSumReducer;
    import org.apache.hadoop.mapred.FileSplit;

    public class Foo {
        public static class FooMapper extends Mapper {
            private org.apache.hadoop.io.Text input_file;

            public void setup (Context context) {
                Configuration conf = context.getConfiguration();

                //
                // fails to compile due to use of deprecated mapred API:
                //
                FileSplit fileSplit = (FileSplit)context.getInputSplit();
                String input_fname = fileSplit.getPath().toString();
                input_file.set(input_fname);

                //
                // results in null pointer exception because conf.get returns null:
                //
                // input_file.set(conf.get("map.input.file"));
            }
        }
    }
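
One possible way around this, sketched under assumptions that nothing in this message confirms: the new org.apache.hadoop.mapreduce API has its own FileSplit class in org.apache.hadoop.mapreduce.lib.input, so importing that one instead of org.apache.hadoop.mapred.FileSplit may let the cast in setup work when the job reads from a file-based input format. The key/value type parameters on the Mapper below are placeholders chosen for illustration, not taken from the original code. The Text field is also initialized here, because calling set() on an unassigned field would throw a NullPointerException regardless of what the lookup returns.

```java
import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputSplit;
import org.apache.hadoop.mapreduce.Mapper;
// New-API FileSplit; note this is NOT org.apache.hadoop.mapred.FileSplit.
import org.apache.hadoop.mapreduce.lib.input.FileSplit;

public class Foo {

    // The four type parameters are assumptions made for this sketch only.
    public static class FooMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {

        // Initialized up front so that set() has an object to act on.
        private final Text input_file = new Text();

        @Override
        protected void setup(Context context)
                throws IOException, InterruptedException {
            InputSplit split = context.getInputSplit();

            // Only file-based input formats hand out FileSplit instances,
            // so check the type before casting.
            if (split instanceof FileSplit) {
                String input_fname = ((FileSplit) split).getPath().toString();
                input_file.set(input_fname);
            }
        }
    }
}
```

This needs the Hadoop jars on the classpath to compile. If the job uses an input format that does not produce file splits, input_file simply stays empty rather than failing.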