hadoop-general mailing list archives

From Alejandro Abdelnur <t...@cloudera.com>
Subject Re: running hadoop jobs from within a program
Date Fri, 12 Nov 2010 09:30:42 GMT

You should take a look at Oozie; it will allow you to do what you describe.

You can either build Oozie from https://github.com/yahoo/oozie or
download the CDH3b3 distribution from http://www.cloudera.com/downloads/
(Oozie is preconfigured to work with CDH3b3 Hadoop).

Hope this helps.


On Fri, Nov 12, 2010 at 12:55 AM, daniel sikar <dsikar@gmail.com> wrote:
> I suggest you write a loop in your bash script, grepping for finished,
> then take it from there.
> Also, you can submit the same job as many times as you like.
> On 12 November 2010 02:17, web service <wbsrvc@gmail.com> wrote:
>> Hi,
>>  Currently I run my sample hadoop job from a bash script using the
>> following command ...
>> [code]
>> tmp="$HADOOP_BIN jar $JAR_LOC $MAIN_CLASS /user/joe/input/input-$i/ /user/vadmin/output/output-$i/"
>> $tmp
>> [/code]
>> However, I would like to write a timer that does some cleanup after the
>> jobs are complete and restarts the jobs after x hours. What I am looking for
>> is the ability to invoke a job from within a program rather than through the
>> jar command.
>> -Mac
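
For reference, the run, clean up, and restart cycle asked about above could be sketched in plain shell along the following lines. This is only a rough sketch that stays with the jar command rather than submitting from inside a program. HADOOP_BIN, JAR_LOC, MAIN_CLASS and the two paths come from the original script; NUM_JOBS, RESTART_HOURS, MAX_ROUNDS, run_job, cleanup and run_rounds are hypothetical names introduced here. It also assumes that the driver class blocks until the job finishes and exits non-zero on failure, which depends on how the job's main class is written.

```shell
#!/bin/sh
# Sketch only. Expects HADOOP_BIN, JAR_LOC, MAIN_CLASS (from the original
# script) plus the hypothetical NUM_JOBS, RESTART_HOURS and MAX_ROUNDS
# to be set before run_rounds is called.

run_job() {
  # Submit job number $1. Assumes the command returns only once the job
  # has finished (this depends on the driver class).
  "$HADOOP_BIN" jar "$JAR_LOC" "$MAIN_CLASS" \
    "/user/joe/input/input-$1/" "/user/vadmin/output/output-$1/"
}

cleanup() {
  # Placeholder: the real cleanup for round $1 would go here.
  echo "cleanup after round $1"
}

run_rounds() {
  round=1
  while [ "$round" -le "$MAX_ROUNDS" ]; do
    i=1
    while [ "$i" -le "$NUM_JOBS" ]; do
      # Report a failed job but keep going with the remaining ones.
      run_job "$i" || echo "job $i failed in round $round" >&2
      i=$((i + 1))
    done
    cleanup "$round"
    # Wait RESTART_HOURS before the next round, but not after the last one.
    if [ "$round" -lt "$MAX_ROUNDS" ]; then
      sleep "$((RESTART_HOURS * 3600))"
    fi
    round=$((round + 1))
  done
}
```

Calling run_rounds at the end of the script, after setting the variables, starts the cycle. Because each submission is assumed to block, no grepping for job status is needed in this variant.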
