flink-user mailing list archives

From Kostas Tzoumas <ktzou...@apache.org>
Subject Re: Flink ProgramDriver
Date Sat, 22 Nov 2014 17:11:01 GMT
Are you looking for something like
https://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/util/ProgramDriver.html
?

You should be able to use the Hadoop ProgramDriver directly; for an
example, see:
https://github.com/ktzoumas/incubator-flink/blob/tez_support/flink-addons/flink-tez/src/main/java/org/apache/flink/tez/examples/ExampleDriver.java

If you don't want to introduce a Hadoop dependency in your project, you can
just copy ProgramDriver into it; the class does not depend on any other
Hadoop classes. It simply accumulates <String,Class> pairs (simplifying a
bit) and calls the main method of the class registered under the requested
name.
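
As a rough illustration of that idea (this is not Hadoop's actual code), a
driver of this kind could look like the sketch below. The class name
SimpleProgramDriver, the nested WordCountStub job, and the 0 / -1 return
codes are all made up for this example; only the general shape (register
name/class pairs, then call main on the selected class) follows the
description above.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a ProgramDriver-style registry; not Hadoop or Flink code. */
public class SimpleProgramDriver {

    // Registered programs and their descriptions, keyed by command-line name.
    private final Map<String, Class<?>> programs = new TreeMap<>();
    private final Map<String, String> descriptions = new TreeMap<>();

    /** Registers a class that has a public static main(String[]) under the given name. */
    public void addClass(String name, Class<?> mainClass, String description)
            throws NoSuchMethodException {
        mainClass.getMethod("main", String[].class); // fail early if main is missing
        programs.put(name, mainClass);
        descriptions.put(name, description);
    }

    /**
     * Treats args[0] as the program name and forwards the remaining arguments
     * to that program's main method. Returns 0 if a program was run and -1 if
     * the name was missing or unknown (return codes are an assumption of this sketch).
     */
    public int run(String[] args) throws Exception {
        if (args.length == 0 || !programs.containsKey(args[0])) {
            printUsage();
            return -1;
        }
        Method main = programs.get(args[0]).getMethod("main", String[].class);
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        try {
            // The cast keeps the array from being expanded into varargs.
            main.invoke(null, (Object) rest);
        } catch (InvocationTargetException e) {
            // Surface the exception thrown by the program itself, if possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
        return 0;
    }

    private void printUsage() {
        System.out.println("Valid program names are:");
        for (Map.Entry<String, String> entry : descriptions.entrySet()) {
            System.out.println("  " + entry.getKey() + ": " + entry.getValue());
        }
    }

    /** Placeholder job; a real setup would register classes that build and run Flink programs. */
    public static class WordCountStub {
        public static String[] lastArgs;

        public static void main(String[] args) {
            lastArgs = args;
            System.out.println("WordCountStub called with " + Arrays.toString(args));
        }
    }

    public static void main(String[] args) throws Exception {
        SimpleProgramDriver driver = new SimpleProgramDriver();
        driver.addClass("wordcount", WordCountStub.class, "Stub standing in for a real job");
        driver.run(args);
    }
}
```

Running the driver with "wordcount in out" would call WordCountStub.main with
the arguments "in" and "out"; running it with no arguments or an unknown name
prints the list of registered programs instead.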

On Sat, Nov 22, 2014 at 5:34 PM, Stephan Ewen <sewen@apache.org> wrote:

> Not sure I get exactly what this is, but packaging multiple examples in
> one program is certainly possible. You can have arbitrary control flow in
> the main() method.
>
> It should be quite possible to do something like the Hadoop examples setup.
>
> On Fri, Nov 21, 2014 at 7:02 PM, Flavio Pompermaier <pompermaier@okkam.it>
> wrote:
>
>> That was something I used to do with Hadoop, and it's convenient when
>> testing stuff (so it is not so important).
>> For an example, see what happens when you run the old "hadoop jar
>> hadoop-mapreduce-examples.jar" command: it "drives" you to the correct
>> invocation of that job.
>> However, the important thing is that I'd like to keep existing related
>> jobs somewhere (like a repository of jobs), deploy them and then be able to
>> start the one I need from an external program.
>>
>> Could this be done with RemoteExecutor? Or is there any WS to manage the
>> job execution? That would be very useful.
>> Is the Client interface the only one that allows something similar right
>> now?
>>
>> On Fri, Nov 21, 2014 at 6:19 PM, Stephan Ewen <sewen@apache.org> wrote:
>>
>>> I am not sure exactly what you need there. In Flink you can write more
>>> than one program in the same program ;-) You can define complex flows and
>>> trigger execution at arbitrary intermediate points:
>>>
>>> main() {
>>>   ExecutionEnvironment env = ...;
>>>
>>>   env.readSomething().map().join(...).and().so().on();
>>>   env.execute();
>>>
>>>   env.readTheNextThing().doSomething();
>>>   env.execute();
>>> }
>>>
>>>
>>> You can also just "save" a program and keep it for later execution:
>>>
>>> Plan plan = env.createProgramPlan();
>>>
>>> At a later point you can start that plan:
>>>
>>> new RemoteExecutor(master, 6123).execute(plan);
>>>
>>>
>>>
>>> Stephan
>>>
>>>
>>>
>>> On Fri, Nov 21, 2014 at 5:49 PM, Flavio Pompermaier <
>>> pompermaier@okkam.it> wrote:
>>>
>>>> Any help on this? :(
>>>>
>>>> On Fri, Nov 21, 2014 at 9:33 AM, Flavio Pompermaier <
>>>> pompermaier@okkam.it> wrote:
>>>>
>>>>> Hi guys,
>>>>> I forgot to ask you if there's a Flink utility that emulates the Hadoop
>>>>> ProgramDriver class, which acts somewhat like a registry of jobs. Is
>>>>> there something similar?
>>>>>
>>>>> Best,
>>>>> Flavio
>>>>>
>>>>
>>
>
