Date: Sun, 17 May 2015 10:37:09 -0400
From: Shahab Yunus
To: "user@hadoop.apache.org"
Reply-To: user@hadoop.apache.org
Mailing-List: contact user-help@hadoop.apache.org; run by ezmlm
Subject: Re: How to set mapreduce.input.fileinputformat.split.maxsize for a specific job
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary=089e013cbeac1f4c5d0516480389

--089e013cbeac1f4c5d0516480389
Content-Type: text/plain; charset=UTF-8

What type do you think the property value you are trying to set has? Is it a string, or is it numeric? Have you checked the documentation of the Configuration class that I sent earlier?

There are multiple setXXX methods, depending on the type of the property value being set:
https://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html#setLong(java.lang.String, long)

For the other case below, why are you passing null as the job object (the first parameter)?

FileInputFormat.setMaxInputSplitSize(null, 102400);

Check out the documentation here:
http://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/mapreduce/lib/input/FileInputFormat.html#setMaxInputSplitSize(org.apache.hadoop.mapreduce.Job, long)

Lastly,

conf.set("mapreduce.input.fileinputformat.split.maxsize", "102400");

vs.

job.getConfiguration().set("mapreduce.input.fileinputformat.split.maxsize", "102400");

is just a matter of how you reference the configuration object: either through its own reference or through a chained call from the job object. That is a programming-style decision and has no bearing on the result, as long as conf is set before the job is created from it (Job.getInstance() copies the Configuration it is given).

Regards,
Shahab

On Sun, May 17, 2015 at 10:17 AM, Answer Agrawal wrote:

> Thanks,
> Is this the correct way to write it?
> conf.set("mapreduce.input.fileinputformat.split.maxsize", "102400");
> or
> job.getConfiguration().set("mapreduce.input.fileinputformat.split.maxsize",
> "102400");
>
> I think another way is
> FileInputFormat.setMaxInputSplitSize(null, 102400);
>
> Is this all right? Do both of these serve the same purpose, or something
> else?
>
> Thanks,
>
> On Sat, May 16, 2015 at 8:48 PM, Shahab Yunus
> wrote:
>
>> You can pass them as command-line arguments using the -D option,
>> assuming your job implements the standard Tool interface:
>>
>> https://hadoop.apache.org/docs/current/api/org/apache/hadoop/util/Tool.html
>>
>> Or you can set them in the code, using the various 'set' methods to set
>> key/value pairs in the configuration object.
>>
>> ...
>> Job job = Job.getInstance(getConf());
>> job.setJarByClass(MyJob.class);
>>
>> job.getConfiguration().set("<property-name>", <value>);
>> ....
>>
>> Docs for Configuration class:
>> https://hadoop.apache.org/docs/current/api/org/apache/hadoop/conf/Configuration.html
>>
>> This will work as long as the property is not marked final.
>>
>> Regards,
>> Shahab
>>
>>
>> On Sat, May 16, 2015 at 10:49 AM, Answer Agrawal
>> wrote:
>>
>>> Hi,
>>>
>>> In the XML configuration files of Hadoop 2.x,
>>> "mapreduce.input.fileinputformat.split.minsize" is listed and can be set,
>>> but how do I set "mapreduce.input.fileinputformat.split.maxsize" in the XML file?
>>> I need to set it in my MapReduce code.
>>>
>>> Thanks,
>>>
>>
>>
>

--089e013cbeac1f4c5d0516480389
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
--089e013cbeac1f4c5d0516480389--
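
To make it easier to see what a split.maxsize of 102400 actually does once it reaches the job, here is a minimal sketch with no Hadoop dependency. It assumes the split-size rule used by the Hadoop 2.x new-API FileInputFormat, max(minSize, min(maxSize, blockSize)), and the 10% slop on the last split, as I recall them from the Hadoop source. The class name SplitSizeSketch, the 128 MiB block size, and the 1 MiB file are made up for illustration and are not part of Hadoop or of this thread.

```java
// Illustrative sketch only: mirrors the split-size arithmetic, not the Hadoop classes themselves.
final class SplitSizeSketch {
    // Assumed to match Hadoop's slop factor: the last split may be up to 10% larger.
    static final double SPLIT_SLOP = 1.1;

    // Same formula as FileInputFormat.computeSplitSize(blockSize, minSize, maxSize).
    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Counts the splits produced for one splittable file of the given length.
    static int countSplits(long fileLength, long splitSize) {
        int splits = 0;
        long bytesRemaining = fileLength;
        while (((double) bytesRemaining) / splitSize > SPLIT_SLOP) {
            splits++;
            bytesRemaining -= splitSize;
        }
        if (bytesRemaining != 0) {
            splits++; // trailing partial split
        }
        return splits;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // hypothetical HDFS block size
        long fileLength = 1024L * 1024;      // hypothetical 1 MiB input file
        long minSize = 1L;                   // effective minimum when split.minsize is unset

        long capped = computeSplitSize(blockSize, minSize, 102400L);
        long uncapped = computeSplitSize(blockSize, minSize, Long.MAX_VALUE);

        System.out.println("splits with maxsize=102400: " + countSplits(fileLength, capped));
        System.out.println("splits with no maxsize:     " + countSplits(fileLength, uncapped));
    }
}
```

In a real driver the cap would be applied with one of the calls discussed in the thread, for example job.getConfiguration().setLong("mapreduce.input.fileinputformat.split.maxsize", 102400L) or FileInputFormat.setMaxInputSplitSize(job, 102400L) with a non-null job.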