hadoop-common-user mailing list archives

From Harsh J <ha...@cloudera.com>
Subject Re: execute hadoop job from remote web application
Date Tue, 18 Oct 2011 16:13:55 GMT
Oleg,

Steve already covered this.

The "hadoop jar" subcommand merely runs the jar program for you, as a
utility - it has nothing to do with submissions really.

Have you tried submitting your program by running your jar as a
regular Java program (java -jar <jar>), with the proper classpath?
(You may use "hadoop classpath" to get the classpath string.)

It would go through fine, and submit the job jar with classes
included, over to the JobTracker.
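
A minimal sketch of the suggestion above, in Java. It only assembles the
command line for running the job's driver class as a regular Java program
with the Hadoop classpath prepended. Note that the JVM ignores -cp/-classpath
when -jar is used, so the sketch uses "java -cp" with an explicit main class
rather than "java -jar". The class RemoteSubmitCommand, its build method, the
sample classpath string, and the choice of HadoopJobExecutor as the main class
are assumptions for illustration; in practice the classpath string would come
from the output of "hadoop classpath" on a machine with the Hadoop client
installed.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Hadoop: builds the command line for
// running a job driver as a plain Java program.
public class RemoteSubmitCommand {

    /**
     * Builds: java -cp <hadoopClasspath><sep><jobJar> <mainClass> <jobArgs...>
     * where <sep> is the platform path separator (":" on Unix).
     */
    public static List<String> build(String hadoopClasspath, String jobJar,
                                     String mainClass, List<String> jobArgs) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("java");
        cmd.add("-cp");
        // The job jar itself must be on the client classpath so that
        // job.setJarByClass(...) can locate it and ship it to the cluster.
        cmd.add(hadoopClasspath + File.pathSeparator + jobJar);
        cmd.add(mainClass);
        cmd.addAll(jobArgs);
        return cmd;
    }

    public static void main(String[] args) {
        // Assumption: a placeholder for whatever "hadoop classpath" prints.
        String hadoopClasspath = "/opt/hadoop/conf" + File.pathSeparator
                + "/opt/hadoop/lib/*";

        List<String> cmd = build(
                hadoopClasspath,
                "/opt/hadoop/hadoop-jobs/my_hadoop_job.jar",
                "HadoopJobExecutor", // assumed to be the driver (main) class
                Arrays.asList("-inputPath", "/opt/inputs/",
                              "-outputPath", "/data/output_jobs/output"));

        // Print the command instead of launching it.
        System.out.println(String.join(" ", cmd));
    }
}
```

The resulting list can be handed to java.lang.ProcessBuilder. A web
application that submits in-process would instead need the same jars and the
cluster's config files on its own classpath.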

On Tue, Oct 18, 2011 at 9:13 PM, Oleg Ruchovets <oruchovets@gmail.com> wrote:
> I'll try to be more specific. It is not a dependent jar. It is a jar which
> contains the map/reduce/combine classes and some business logic.
> When we execute our job from the command line, the class which parses the
> parameters and submits the job has this line of code:
>    job.setJarByClass(HadoopJobExecutor.class);
>
> We execute it locally on the Hadoop master machine using this command:
> /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar
> -inputPath /opt/inputs/  -outputPath /data/output_jobs/output
>
> and of course my_hadoop_job.jar is found because it is located on the same
> machine.
>
> Now, suppose I am going to submit the job remotely (from a web application),
> and I have the same line of code:
> job.setJarByClass(HadoopJobExecutor.class);
>
> If my_hadoop_job.jar is located on the remote Hadoop machine (in its class
> path), my JobClient will fail because there is no job jar in the local class
> path (it is located on the remote Hadoop machine). Am I right? I simply don't
> know how to submit a job remotely (in my case the job is not only
> map/combine/reduce classes; it is a jar which contains other classes too).
>
> Regarding remotely invoking the shell script that contains the hadoop jar
> command with any required input arguments:
> It is possible to do it with
>    Runtime.getRuntime().exec(submitCommand.toString().split(" "));
> But I prefer to use JobClient, because I can monitor my job (get counters
> and other useful information).
>
> Thanks in advance
> Oleg.
>
> On Tue, Oct 18, 2011 at 4:34 PM, Bejoy KS <bejoy.hadoop@gmail.com> wrote:
>
>> Hi Oleg
>>          I haven't tried out a scenario like the one you mentioned, but I
>> think there shouldn't be any issue in submitting a job that has some
>> dependent classes holding the business logic referred to from the mapper,
>> reducer, or combiner. You should be able to do the job submission remotely
>> the same way we were discussing in this thread. If you need to distribute
>> any dependent jars/files along with the application jar, you can use the
>> -libjars option in the CLI or use DistributedCache methods like
>> addArchiveToClassPath()/addFileToClassPath() in your Java code. If it is a
>> dependent jar, it is better to deploy it in the cluster environment itself
>> so that you don't have to transfer the jar over the network every time you
>> submit your job.
>>         Just a suggestion: if you can execute the job from within your
>> Hadoop cluster, you don't have to do a remote job submission. You just need
>> to remotely invoke the shell script that contains the hadoop jar command
>> with any required input arguments. Sorry if I'm not getting your
>> requirement exactly.
>>
>> Regards
>> Bejoy.K.S
>>
>> On Tue, Oct 18, 2011 at 6:29 PM, Oleg Ruchovets <oruchovets@gmail.com> wrote:
>>
>> > Thank you all for your answers, but I still have some questions.
>> > Currently we run our jobs using shell scripts located on the Hadoop
>> > master machine.
>> >
>> > Here is an example of command line:
>> > /opt/hadoop/bin/hadoop jar /opt/hadoop/hadoop-jobs/my_hadoop_job.jar
>> > -inputPath /opt/inputs/  -outputPath /data/output_jobs/output
>> >
>> > my_hadoop_job.jar has a class which parses the input parameters and
>> > submits a job. Our code is very similar to what you wrote:
>> >   ......
>> >
>> >        job.setJarByClass(HadoopJobExecutor.class);
>> >        job.setMapperClass(MultipleOutputMap.class);
>> >        job.setCombinerClass(BaseCombine.class);
>> >        job.setReducerClass(HBaseReducer.class);
>> >        job.setOutputKeyClass(Text.class);
>> >        job.setOutputValueClass(MapWritable.class);
>> >
>> >        FileOutputFormat.setOutputPath(job, new Path(finalOutPutPath));
>> >
>> >        jobCompleteStatus = job.waitForCompletion(true);
>> > ...............
>> >
>> > My questions are:
>> >
>> > 1) my_hadoop_job.jar contains other classes (business logic), not only the
>> > Map, Combine, and Reduce classes, and I still don't understand how I can
>> > submit a job which needs all the classes from my_hadoop_job.jar.
>> > 2) Do I need to submit my_hadoop_job.jar too? If yes, what is the way to
>> > do it?
>> >
>> > Thanks In Advance
>> > Oleg.
>> >
>> > On Tue, Oct 18, 2011 at 2:11 PM, Uma Maheswara Rao G 72686 <
>> > maheswara@huawei.com> wrote:
>> >
>> > > ----- Original Message -----
>> > > From: Bejoy KS <bejoy.hadoop@gmail.com>
>> > > Date: Tuesday, October 18, 2011 5:25 pm
>> > > Subject: Re: execute hadoop job from remote web application
>> > > To: common-user@hadoop.apache.org
>> > >
>> > > > Oleg
>> > > >      If you are looking at how to submit your jobs using JobClient,
>> > > > then the sample below can give you a start.
>> > > >
>> > > > //get the configuration parameters and assigns a job name
>> > > >        JobConf conf = new JobConf(getConf(), MyClass.class);
>> > > >        conf.setJobName("SMS Reports");
>> > > >
>> > > >        //setting key value types for mapper and reducer outputs
>> > > >        conf.setOutputKeyClass(Text.class);
>> > > >        conf.setOutputValueClass(Text.class);
>> > > >
>> > > >        //specifying the custom reducer class
>> > > >        conf.setReducerClass(SmsReducer.class);
>> > > >
>> > > >        //Specifying the input directories (at runtime) and Mappers
>> > > >        //independently for inputs from multiple sources
>> > > >        FileInputFormat.addInputPath(conf, new Path(args[0]));
>> > > >
>> > > >        //Specifying the output directory @ runtime
>> > > >        FileOutputFormat.setOutputPath(conf, new Path(args[1]));
>> > > >
>> > > >        JobClient.runJob(conf);
>> > > >
>> > > > Along with the Hadoop jars, you may need to have the config files
>> > > > on your client as well.
>> > > >
>> > > > The sample uses the old MapReduce API. You can use the new one as
>> > > > well; there we use Job instead of JobClient.
>> > > >
>> > > > Hope it helps!..
>> > > >
>> > > > Regards
>> > > > Bejoy.K.S
>> > > >
>> > > >
>> > > > On Tue, Oct 18, 2011 at 5:00 PM, Oleg Ruchovets
>> > > > <oruchovets@gmail.com> wrote:
>> > > > > Excellent. Can you give a small example of code.
>> > > > >
>> > > Good sample by Bejoy.
>> > > I hope you have access to this site.
>> > > Also, please go through these docs:
>> > >
>> > > http://hadoop.apache.org/common/docs/r0.20.2/mapred_tutorial.html#Example%3A+WordCount+v2.0
>> > > Here is the WordCount example.
>> > >
>> > > > >
>> > > > > On Tue, Oct 18, 2011 at 1:13 PM, Uma Maheswara Rao G 72686 <
>> > > > > maheswara@huawei.com> wrote:
>> > > > >
>> > > > > >
>> > > > > > ----- Original Message -----
>> > > > > > From: Oleg Ruchovets <oruchovets@gmail.com>
>> > > > > > Date: Tuesday, October 18, 2011 4:11 pm
>> > > > > > Subject: execute hadoop job from remote web application
>> > > > > > To: common-user@hadoop.apache.org
>> > > > > >
>> > > > > > > Hi, what is the way to execute a Hadoop job on a remote
>> > > > > > > cluster? I want to execute my Hadoop job from a remote web
>> > > > > > > application, but I didn't find any Hadoop client (remote API)
>> > > > > > > to do it.
>> > > > > > >
>> > > > > > > Please advise.
>> > > > > > > Oleg
>> > > > > > >
>> > > > > > You can put the Hadoop jars in your web application's classpath,
>> > > > > > find the JobClient class, and submit the jobs using it.
>> > > > > >
>> > > > > > Regards,
>> > > > > > Uma
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > Regards
>> > > Uma
>> > >
>> >
>>
>



-- 
Harsh J
