flink-dev mailing list archives

From "Matthias J. Sax" <mj...@informatik.hu-berlin.de>
Subject Re: Package multiple jobs in a single jar
Date Tue, 19 May 2015 08:10:37 GMT
Supporting an interface like this seems to be a nice idea. Any other
opinions on it?

Getting it right looks like a fair amount of additional work, so I don't
want to start on it before it is clear that it has a chance of being
included in Flink.

@Flavio: I moved the discussion to the dev mailing list (the user list is not
appropriate for this discussion). Are you subscribed to it, or should I
cc you on each mail?


-Matthias


On 05/19/2015 09:39 AM, Flavio Pompermaier wrote:
> Nice feature Matthias!
> My suggestion is to create a specific Flink interface that also exposes a
> description of the job and standardizes parameter passing.
> Then, somewhere (e.g. in the manifest), you could specify the list of packages (or
> the classes directly) to inspect via reflection to extract the list
> of available Flink jobs.
> Something like:
> 
> public interface FlinkJob {
>
>   /** The name to display in the job submission UI or shell, e.g. "My Flink HelloWorld". */
>   String getDisplayName();
>
>   /** A short description, e.g. "This program does this and that etc.." */
>   String getDescription();
>
>   /**
>    * Position, type, and description of each parameter, e.g.
>    * <0, Integer, "An integer representing my first param">,
>    * <1, String, "A string representing my second param">
>    */
>   List<Tuple3<Integer, TypeInfo, String>> getParamDescription();
>
>   /** Set up the Flink job in the passed ExecutionEnvironment. */
>   ExecutionEnvironment config(ExecutionEnvironment env);
> }
> 
> What do you think?
> 
> 
> 
> On Sun, May 17, 2015 at 10:38 PM, Matthias J. Sax <
> mjsax@informatik.hu-berlin.de> wrote:
> 
>> Hi,
>>
>> I like the idea that Flink's WebClient can show different plans for
>> different jobs within a single jar file.
>>
>> I prepared a prototype for this feature. You can find it here:
>> https://github.com/mjsax/flink/tree/multipleJobsWebUI
>>
>> To test the feature, you need to prepare a jar file that contains the
>> code of multiple programs and list each entry class in the manifest
>> file as comma-separated values on the "program-class" line.
>>
>> Feedback is welcome. :)
>>
>>
>> -Matthias
>>
>>
>> On 05/08/2015 03:08 PM, Flavio Pompermaier wrote:
>>> Thank you all for the support!
>>> It would be a really nice feature if the web client could show
>>> me the list of Flink jobs within my jar.
>>> It should be sufficient to mark them with a special annotation and
>>> inspect the classes within the jar.
>>>
>>> On Fri, May 8, 2015 at 3:03 PM, Malte Schwarzer <ms@mieo.de
>>> <mailto:ms@mieo.de>> wrote:
>>>
>>>     Hi Flavio,
>>>
>>>     you can also put each job in its own class and use the -c parameter
>>>     to execute the jobs separately:
>>>
>>>     /bin/flink run -c com.myflinkjobs.JobA /path/to/jar/multiplejobs.jar
>>>     /bin/flink run -c com.myflinkjobs.JobB /path/to/jar/multiplejobs.jar
>>>     …
>>>
>>>     Cheers
>>>     Malte
>>>
>>>     From: Robert Metzger <rmetzger@apache.org <mailto:rmetzger@apache.org>>
>>>     Reply-To: <user@flink.apache.org <mailto:user@flink.apache.org>>
>>>     Date: Friday, 8 May 2015 14:57
>>>     To: "user@flink.apache.org <mailto:user@flink.apache.org>"
>>>     <user@flink.apache.org <mailto:user@flink.apache.org>>
>>>     Subject: Re: Package multiple jobs in a single jar
>>>
>>>     Hi Flavio,
>>>
>>>     the pom from our quickstart is a good reference:
>>>     https://github.com/apache/flink/blob/master/flink-quickstart/flink-quickstart-java/src/main/resources/archetype-resources/pom.xml
>>>
>>>
>>>
>>>
>>>     On Fri, May 8, 2015 at 2:53 PM, Flavio Pompermaier
>>>     <pompermaier@okkam.it <mailto:pompermaier@okkam.it>> wrote:
>>>
>>>         OK, got it.
>>>         Is there a reference pom.xml for shading my application into
>>>         one fat jar? Which Flink dependencies can I exclude?
>>>
>>>         On Fri, May 8, 2015 at 1:05 PM, Fabian Hueske <fhueske@gmail.com
>>>         <mailto:fhueske@gmail.com>> wrote:
>>>
>>>             I didn't say that the main should return the
>>>             ExecutionEnvironment.
>>>             You can define and execute as many programs in a main
>>>             function as you like.
>>>             The program can be defined somewhere else, e.g., in a
>>>             function that receives an ExecutionEnvironment and attaches
>>>             a program such as
>>>
>>>             public void buildMyProgram(ExecutionEnvironment env) {
>>>               DataSet<String> lines = env.readTextFile(...);
>>>               // do something
>>>               lines.writeAsText(...);
>>>             }
>>>
>>>             That method could be invoked from main():
>>>
>>>             public static void main(String[] args) throws Exception {
>>>               ExecutionEnvironment env = ...
>>>
>>>               if(...) {
>>>                 buildMyProgram(env);
>>>               }
>>>               else {
>>>                 buildSomeOtherProg(env);
>>>               }
>>>
>>>               env.execute();
>>>
>>>               // run some more programs
>>>             }
>>>
>>>             2015-05-08 12:56 GMT+02:00 Flavio Pompermaier
>>>             <pompermaier@okkam.it <mailto:pompermaier@okkam.it>>:
>>>
>>>                 Hi Fabian,
>>>                 thanks for the response.
>>>                 So my main methods should be converted into methods returning
>>>                 the ExecutionEnvironment.
>>>                 However, I think it would be very nice to have a
>>>                 syntax like that of the Hadoop ProgramDriver to
>>>                 define jobs to invoke from a single root class.
>>>                 Do you think it could be useful?
>>>
>>>                 On Fri, May 8, 2015 at 12:42 PM, Fabian Hueske
>>>                 <fhueske@gmail.com <mailto:fhueske@gmail.com>> wrote:
>>>
>>>                     You can easily have multiple Flink programs in a single
>>>                     JAR file.
>>>                     A program is defined using an ExecutionEnvironment
>>>                     and executed when you call
>>>                     ExecutionEnvironment.execute().
>>>                     Where and how you do that does not matter.
>>>
>>>                     You can, for example, implement a main function such as:
>>>
>>>                     public static void main(String... args) {
>>>
>>>                       if (today == Monday) {
>>>                         ExecutionEnvironment env = ...
>>>                         // define Monday prog
>>>                         env.execute();
>>>                       }
>>>                       else {
>>>                         ExecutionEnvironment env = ...
>>>                         // define other prog
>>>                         env.execute();
>>>                       }
>>>                     }
>>>
>>>                     2015-05-08 11:41 GMT+02:00 Flavio Pompermaier
>>>                     <pompermaier@okkam.it <mailto:pompermaier@okkam.it>>:
>>>
>>>                         Hi to all,
>>>                         is there any way to keep multiple jobs in a jar
>>>                         and then choose at runtime the one to execute
>>>                         (like what ProgramDriver does in Hadoop)?
>>>
>>>                         Best,
>>>                         Flavio
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
> 
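
A minimal sketch of the ProgramDriver-style dispatch asked about at the start of the thread, combined with Fabian's suggestion of builder methods that receive the environment. The class name JobDriver, its methods, and the job names are hypothetical and not part of Flink. A StringBuilder stands in for ExecutionEnvironment so that the example runs without Flink on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/**
 * Hypothetical ProgramDriver-style dispatcher (not a Flink class).
 * E stands in for an environment type such as Flink's ExecutionEnvironment.
 */
final class JobDriver<E> {

    // Insertion-ordered so that the list of job names is stable.
    private final Map<String, Consumer<E>> jobs = new LinkedHashMap<>();

    /** Registers a job builder under a short name; returns this for chaining. */
    JobDriver<E> register(String name, Consumer<E> builder) {
        if (jobs.putIfAbsent(name, builder) != null) {
            throw new IllegalArgumentException("Duplicate job name: " + name);
        }
        return this;
    }

    /** Names of all registered jobs, in registration order. */
    Set<String> jobNames() {
        return jobs.keySet();
    }

    /** Lets the named job attach its program to env; the caller executes env afterwards. */
    void build(String name, E env) {
        Consumer<E> builder = jobs.get(name);
        if (builder == null) {
            throw new IllegalArgumentException(
                "Unknown job '" + name + "', available: " + jobs.keySet());
        }
        builder.accept(env);
    }

    public static void main(String[] args) {
        // Assumption: with Flink, E would be ExecutionEnvironment and each
        // lambda would define sources, transformations, and sinks on it.
        JobDriver<StringBuilder> driver = new JobDriver<StringBuilder>()
            .register("jobA", env -> env.append("plan of job A"))
            .register("jobB", env -> env.append("plan of job B"));

        StringBuilder env = new StringBuilder();
        driver.build(args.length > 0 ? args[0] : "jobA", env);
        System.out.println(env);
    }
}
```

With a real environment, main() would call execute() on it after build() returns, which matches the single env.execute() call in Fabian's second example.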

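The manifest convention from the prototype announcement (several entry classes as comma-separated values in the "program-class" attribute) could be read roughly as follows. This is only a sketch using java.util.jar.Manifest, not the prototype's code. The class ProgramClassList is an illustrative assumption, and the example class names are borrowed from Malte's mail.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.Manifest;

/** Hypothetical helper that lists the entry classes named in a jar manifest. */
final class ProgramClassList {

    /** Attribute name as mentioned in the thread. */
    static final String ATTRIBUTE = "program-class";

    /** Splits the attribute value on commas, trimming whitespace and skipping empty parts. */
    static List<String> entryClasses(Manifest manifest) {
        List<String> classes = new ArrayList<>();
        String value = manifest.getMainAttributes().getValue(ATTRIBUTE);
        if (value == null) {
            return classes; // attribute absent: no entry classes declared
        }
        for (String part : value.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                classes.add(name);
            }
        }
        return classes;
    }

    public static void main(String[] args) throws IOException {
        // An in-memory manifest keeps the sketch independent of any real jar file.
        String text = "Manifest-Version: 1.0\n"
                    + "program-class: com.myflinkjobs.JobA, com.myflinkjobs.JobB\n\n";
        Manifest manifest = new Manifest(
            new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8)));
        System.out.println(entryClasses(manifest));
    }
}
```

For a jar on disk, the same method could be fed from java.util.jar.JarFile#getManifest(), which returns null when the jar has no manifest.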
