hadoop-common-user mailing list archives

From Varad Meru <meru.va...@gmail.com>
Subject Re: Passing Command-line Parameters to the Job Submit Command
Date Wed, 26 Sep 2012 16:46:42 GMT
Thanks Hemanth,

Yes, I meant the Java properties passed as -Dkey=value. But for the arguments passed to the
main method (i.e. String[] args) I cannot find any way to pass them other than hadoop jar
CLASSNAME arguments. So if I have a job file, I will be forced to use the java variables,
and not the command-line arguments.
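[Archive note: the distinction discussed here may be clearer with a sketch. When a driver is run through ToolRunner, Hadoop's GenericOptionsParser strips -Dkey=value pairs out of args[] and folds them into the job Configuration, leaving only the remaining words for the program. The stdlib-only class below imitates that splitting behavior; the class and method names are illustrative, not Hadoop's own.]

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of how -Dkey=value arguments get separated from ordinary
// command-line arguments: the -D pairs go into a configuration map,
// everything else is handed back for the program's own use.
public class DashDParser {
    // Fills 'conf' with any -Dkey=value pairs and returns the leftover args.
    public static String[] parse(String[] args, Map<String, String> conf) {
        List<String> rest = new ArrayList<>();
        for (String a : args) {
            if (a.startsWith("-D") && a.contains("=")) {
                int eq = a.indexOf('=');
                conf.put(a.substring(2, eq), a.substring(eq + 1));
            } else {
                rest.add(a);
            }
        }
        return rest.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        String[] rest = parse(
                new String[]{"-Dmapred.reduce.tasks=2", "5", "10"}, conf);
        System.out.println("conf=" + conf + ", remaining args=" + rest.length);
    }
}
```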


On 25-Sep-2012, at 12:40 PM, Hemanth Yamijala wrote:

> By java environment variables, do you mean the ones passed as
> -Dkey=value ? That's one way of passing them. I suppose another way is
> to have a client-side site configuration (like mapred-site.xml) that
> is in the classpath of the client app.
> Thanks
> Hemanth
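[Archive note: the client-side site configuration Hemanth suggests would look something like the fragment below, placed on the client's classpath so it is picked up when the Configuration is built. The property name follows the old mapred.* namespace used elsewhere in this thread; the value is an example only.]

```xml
<?xml version="1.0"?>
<!-- Example mapred-site.xml on the client classpath; values are illustrative. -->
<configuration>
  <property>
    <name>mapred.reduce.tasks</name>
    <value>2</value>
  </property>
</configuration>
```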
> On Tue, Sep 25, 2012 at 12:20 AM, Varad Meru <meru.varad@gmail.com> wrote:
>> Thanks Hemanth,
>> But in general, if we want to pass arguments to any job (not only
>> PiEstimator from the examples jar) and submit it to the job queue
>> scheduler, it looks like we would always have to use the java
>> environment variables.
>> Is my assumption above correct?
>> Thanks,
>> Varad
>> On Mon, Sep 24, 2012 at 9:48 AM, Hemanth Yamijala <yhemanth@gmail.com>wrote:
>>> Varad,
>>> Looking at the code for the PiEstimator class which implements the
>>> 'pi' example, the two arguments are mandatory and are used *before*
>>> the job is submitted for execution - i.e. on the client side. In
>>> particular, one of them (nSamples) is used not by the MapReduce job,
>>> but by the client code (i.e. PiEstimator) to generate some input.
>>> Hence, I believe all of this additional work that is being done by the
>>> PiEstimator class will be bypassed if we directly use the job -submit
>>> command. In other words, I don't think these two ways of running the
>>> job:
>>> - using the "hadoop jar examples pi"
>>> - using hadoop job -submit
>>> are equivalent.
>>> As a general answer to your question though, if additional parameters
>>> are used by the Mappers or reducers, then they will generally be set
>>> as additional job specific configuration items. So, one way of using
>>> them with the job -submit command will be to find out the specific
>>> names of the configuration items (from code, or some other
>>> documentation), and include them in the job.xml used when submitting
>>> the job.
>>> Thanks
>>> Hemanth
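[Archive note: concretely, Hemanth's suggestion means that if a job's mapper or reducer reads, say, a sample count from the configuration, that key would be added to the job.xml being submitted. The property name below is hypothetical, purely for illustration; the real key must be found in the job's source code or documentation.]

```xml
<!-- Hypothetical job-specific setting added to job.xml.
     The key name "pi.estimator.samples" is an invented example. -->
<property>
  <name>pi.estimator.samples</name>
  <value>10</value>
</property>
```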
>>> On Sun, Sep 23, 2012 at 1:24 PM, Varad Meru <meru.varad@gmail.com> wrote:
>>>> Hi,
>>>> I want to run the PiEstimator example using the following command
>>>> $hadoop job -submit pieestimatorconf.xml
>>>> which contains all the info required by hadoop to run the job. E.g. the
>>>> input file location, the output file location and other details.
>>>> <property><name>mapred.jar</name><value>file:////Users/varadmeru/Work/Hadoop/hadoop-examples-1.0.3.jar</value></property>
>>>> <property><name>mapred.map.tasks</name><value>20</value></property>
>>>> <property><name>mapred.reduce.tasks</name><value>2</value></property>
>>>> ...
>>>> <property><name>mapred.job.name</name><value>PiEstimator</value></property>
>>>> <property><name>mapred.output.dir</name><value>file:////Users/varadmeru/Work/out</value></property>
>>>> Now, as we know, to run the PiEstimator, we can use the following command too
>>>> $hadoop jar hadoop-examples-1.0.3.jar pi 5 10
>>>> where 5 and 10 are the arguments to the main class of the PiEstimator. How
>>>> can I pass the same arguments (5 and 10) using the job -submit command
>>>> through the conf file or any other way, without changing the code of the
>>>> examples to reflect the use of environment variables.
>>>> Thanks in advance,
>>>> Varad
>>>> -----------------
>>>> Varad Meru
>>>> Software Engineer,
>>>> Business Intelligence and Analytics,
>>>> Persistent Systems and Solutions Ltd.,
>>>> Pune, India.
