From: Flavio Pompermaier
Date: Tue, 26 May 2015 17:10:47 +0200
Subject: Re: Package multiple jobs in a single jar
To: dev@flink.apache.org

I agree with Matthias. I didn't know about the ProgramDescription and Program
interfaces because they are not advertised anywhere.

On Tue, May 26, 2015 at 5:01 PM, Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote:

> I see your point.
>
> However, right now only few people are aware of the "ProgramDescription"
> interface.
> If we want to "advertise" it, it should be used (at least) in a few
> examples. Otherwise, people will never use it, and the changes I plan to
> apply are kind of useless. I would even claim that the interface should be
> removed completely in this case...
>
>
> On 05/26/2015 03:31 PM, Maximilian Michels wrote:
> > Sorry, my bad. Yes, it is helpful to have a separate program and parameter
> > description in ProgramDescription. I'm not sure if it adds much value to
> > implement ProgramDescription in the examples. It introduces verbosity and
> > might give the impression that you have to implement ProgramDescription in
> > your Flink job.
> >
> > On Tue, May 26, 2015 at 12:00 PM, Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote:
> >
> >> Hi Max,
> >>
> >> thanks for your feedback. I guess you confuse the interfaces "Program"
> >> and "ProgramDescription". Using "Program", the main method is
> >> replaced by "getPlan(...)". However, "ProgramDescription" only adds the
> >> method "getDescription()", which returns a string that explains the usage
> >> of the program (ie, short description, expected parameters).
> >>
> >> Thus, adding "ProgramDescription" to the examples does not change the
> >> examples -- the main method will still be used. It only adds the ability
> >> for a program to "explain" itself (ie, give meta info). Furthermore,
> >> "ProgramDescription" is also not related to the new "ParameterTool".
> >>
> >> -Matthias
> >>
> >> On 05/26/2015 11:46 AM, Maximilian Michels wrote:
> >>> I don't think `getDisplayName()` is necessary either. The class name and
> >>> the description string should be fine. Adding ProgramDescription to the
> >>> examples is not necessary; as already pointed out, using the main method is
> >>> more convenient for most users. As far as I know, the idea of the
> >>> ParameterTool was to use it only in the user code and not automatically
> >>> handle parameters.
> >>>
> >>> Changing the interface would be quite API breaking, but since most programs
> >>> use the main method, IMHO we could do it.
> >>>
> >>> On Fri, May 22, 2015 at 10:09 PM, Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote:
> >>>
> >>>> Makes sense to me. :)
> >>>>
> >>>> One more thing: What about extending the "ProgramDescription" interface
> >>>> to have multiple methods as Flavio suggested (with the config(...)
> >>>> method that should be handled by the ParameterTool)?
> >>>>
> >>>>> public interface FlinkJob {
> >>>>>
> >>>>>   /** The name to display in the job submission UI or shell */
> >>>>>   // e.g. "My Flink HelloWorld"
> >>>>>   String getDisplayName();
> >>>>>   // e.g. "This program does this and that etc.."
> >>>>>   String getDescription();
> >>>>>   // e.g. <0,Integer,"An integer representing my first param">,
> >>>>>   //      <1,String,"A string representing my second param">
> >>>>>   List> paramDescription;
> >>>>>   /** Set up the flink job in the passed ExecutionEnvironment */
> >>>>>   ExecutionEnvironment config(ExecutionEnvironment env);
> >>>>> }
> >>>>
> >>>> Right now, the interface is used only a couple of times in Flink's code
> >>>> base, so it would not be a problem to update those classes. However, it
> >>>> could break external code that uses the interface already (even if I
> >>>> doubt that the interface is well known and used often [or at all]).
> >>>>
> >>>> I personally don't think that "getDisplayName()" is too helpful.
> >>>> Splitting the program description and the parameter description seems to
> >>>> be useful. For example, if wrong parameters are provided, the parameter
> >>>> description can be included in the error message. If program+parameter
> >>>> description is given in a single string, this is not possible. But this
> >>>> is only a minor issue of course.
> >>>>
> >>>> Maybe we should also add the interface to the current Flink examples,
> >>>> to make people more aware of it.
> >>>> Is there any documentation on the web site?
> >>>>
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>>
> >>>> On 05/22/2015 09:43 PM, Robert Metzger wrote:
> >>>>> Thank you for working on this.
> >>>>> My responses are inline below:
> >>>>>
> >>>>> (Flavio)
> >>>>>
> >>>>>> My suggestion is to create a specific Flink interface to get also
> >>>>>> description of a job and standardize parameter passing.
> >>>>>
> >>>>>
> >>>>> I've recently merged the ParameterTool which is solving the "standardize
> >>>>> parameter passing" problem (at least it presents a best practice):
> >>>>> http://ci.apache.org/projects/flink/flink-docs-master/apis/best_practices.html#parsing-command-line-arguments-and-passing-them-around-in-your-flink-application
> >>>>>
> >>>>> Regarding the description: Maybe we can use the "ProgramDescription"
> >>>>> interface for getting a string describing the program in the web frontend.
> >>>>>
> >>>>> (Matthias)
> >>>>>
> >>>>>> I don't want to start working on it, before it's clear that it has a
> >>>>>> chance to be included in Flink.
> >>>>>
> >>>>>
> >>>>> I think the changes discussed here won't change the current behavior, but
> >>>>> they add new functionality which can make the life of our users easier,
> >>>>> so I'll vote to include your changes (given they meet our quality
> >>>>> standards).
> >>>>>
> >>>>>
> >>>>>> If multiple classes implement "Program" interface an exception should be
> >>>>>> thrown (I think that would make sense). However, I am not sure what
> >>>>>> "good" behavior is, if a single "Program"-class is found and an
> >>>>>> additional main-method class.
> >>>>>>   - should "Program"-class be executed (ie, "overwrite" main-method class)
> >>>>>>   - or, better to throw an exception ?
> >>>>>
> >>>>>
> >>>>> I would give a class implementing "Program" priority over a random main()
> >>>>> method in a random class.
> >>>>> Maybe printing a WARN log message informing the user that the "Program"
> >>>>> class has been chosen.
> >>>>>
> >>>>>
> >>>>>> If no "Program"-class is found, but a single main-method class, Flink
> >>>>>> could execute using main method. But I am not sure either, if this is
> >>>>>> "good" behavior. If multiple main-method classes are present, throwing
> >>>>>> an exception is the only way to go, I guess.
> >>>>>
> >>>>>
> >>>>> I think the best effort approach "one class with main() found" is good. In
> >>>>> case of multiple main methods, a helpful exception is the best approach in
> >>>>> my opinion.
> >>>>>
> >>>>>
> >>>>>> If the manifest contains "program-class" or "Main-Class" entry,
> >>>>>> should we check the jar file right away if the specified class is there?
> >>>>>> Right now, no check is performed and an error occurs if the user tries
> >>>>>> to execute the job.
> >>>>>
> >>>>>
> >>>>> I'd say the current approach is sufficient. There is no need to have a
> >>>>> special code path which is doing the check.
> >>>>> I think the error message will be pretty similar in both cases and I fear
> >>>>> that this additional code could also introduce new bugs ;)
> >>>>>
> >>>>>
> >>>>>
> >>>>>
> >>>>> On Fri, May 22, 2015 at 9:06 PM, Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote:
> >>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> two more thoughts to this discussion:
> >>>>>>
> >>>>>>  1) looking at the commit history of "CliFrontend", I found the
> >>>>>> following closed issue and the closing pull request
> >>>>>>   * https://issues.apache.org/jira/browse/FLINK-1095
> >>>>>>   * https://github.com/apache/flink/pull/238
> >>>>>> It stands in opposition to Flavio's request to have a job description. Any
> >>>>>> comment on this? Should a removed feature be re-introduced? If not, I
> >>>>>> would suggest to remove the "ProgramDescription" interface completely.
> >>>>>>
> >>>>>>  2) If the manifest contains "program-class" or "Main-Class" entry,
> >>>>>> should we check the jar file right away if the specified class is there?
> >>>>>> Right now, no check is performed and an error occurs if the user tries
> >>>>>> to execute the job.
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>>
> >>>>>> On 05/22/2015 12:06 PM, Matthias J. Sax wrote:
> >>>>>>> Thanks for your feedback.
> >>>>>>>
> >>>>>>> I agree on the main method "problem". For scanning and listing all stuff
> >>>>>>> that is found it's fine.
> >>>>>>>
> >>>>>>> The tricky question is the automatic invocation mechanism, if "-c" flag
> >>>>>>> is not used, and no manifest program-class or Main-Class entry is found.
> >>>>>>>
> >>>>>>> If multiple classes implement "Program" interface an exception should be
> >>>>>>> thrown (I think that would make sense). However, I am not sure what
> >>>>>>> "good" behavior is, if a single "Program"-class is found and an
> >>>>>>> additional main-method class.
> >>>>>>>   - should "Program"-class be executed (ie, "overwrite" main-method class)
> >>>>>>>   - or, better to throw an exception ?
> >>>>>>>
> >>>>>>> If no "Program"-class is found, but a single main-method class, Flink
> >>>>>>> could execute using main method. But I am not sure either, if this is
> >>>>>>> "good" behavior. If multiple main-method classes are present, throwing
> >>>>>>> an exception is the only way to go, I guess.
> >>>>>>>
> >>>>>>> To sum up: Should Flink consider main-method classes for automatic
> >>>>>>> invocation, or should it be required for main-method classes to either
> >>>>>>> list them in "program-class" or "Main-Class" manifest parameter (to
> >>>>>>> enable them for automatic invocation)?
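
The automatic-invocation question raised in the quoted message can be written down compactly. What follows is only a minimal sketch, assuming the jar has already been scanned and the two candidate lists are known. The class `AutoEntryPointPicker` and the method `pick` are hypothetical names for illustration and are not part of Flink. The sketch encodes one of the options discussed in this thread: a single "Program" class takes priority over main-method classes, and any ambiguity raises an exception.

```java
import java.util.List;

/** Hypothetical helper, not part of Flink: used only when neither "-c" nor a manifest entry is given. */
public class AutoEntryPointPicker {

    /**
     * @param programClasses names of classes found to implement the "Program" interface
     * @param mainClasses    names of classes found to declare a main method
     * @return the single class name to invoke
     */
    public static String pick(List<String> programClasses, List<String> mainClasses) {
        if (programClasses.size() > 1) {
            throw new IllegalStateException("Multiple Program classes found: " + programClasses);
        }
        if (programClasses.size() == 1) {
            // A single "Program" class wins, even if main-method classes exist as well.
            return programClasses.get(0);
        }
        if (mainClasses.size() == 1) {
            // Best effort: exactly one class with a main method.
            return mainClasses.get(0);
        }
        if (mainClasses.isEmpty()) {
            throw new IllegalStateException("No entry point class found");
        }
        throw new IllegalStateException("Multiple main-method classes found: " + mainClasses);
    }
}
```

With this variant, a jar holding one "Program" class and several test-only main methods still resolves to the "Program" class. Whether that priority is desirable is exactly the open point of the discussion.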
> >>>>>>>
> >>>>>>>
> >>>>>>> -Matthias
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 05/22/2015 09:56 AM, Maximilian Michels wrote:
> >>>>>>>> Hi Matthias,
> >>>>>>>>
> >>>>>>>> Thank you for taking the time to analyze Flink's invocation behavior. I
> >>>>>>>> like your proposal. I'm not sure whether it is a good idea to scan the
> >>>>>>>> entire JAR for main methods. Sometimes, main methods are added solely for
> >>>>>>>> testing purposes and don't really serve any practical use. However, if
> >>>>>>>> you're already going through the JAR to find the ProgramDescription
> >>>>>>>> interface, then you might look for main methods as well. As long as it is
> >>>>>>>> just a listing without execution, that should be fine.
> >>>>>>>>
> >>>>>>>> Best regards,
> >>>>>>>> Max
> >>>>>>>>
> >>>>>>>> On Thu, May 21, 2015 at 3:43 PM, Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote:
> >>>>>>>>
> >>>>>>>>> Hi,
> >>>>>>>>>
> >>>>>>>>> I had a look into the current workflow of Flink with regard to the
> >>>>>>>>> processing steps of a jar file.
> >>>>>>>>>
> >>>>>>>>> If I got it right it works as follows (not sure if this is documented
> >>>>>>>>> somewhere):
> >>>>>>>>>
> >>>>>>>>> 1) check, if "-c" flag is used to set program entry point
> >>>>>>>>>    if yes, goto 4
> >>>>>>>>> 2) try to extract "program-class" property from manifest
> >>>>>>>>>    (if found goto 4)
> >>>>>>>>> 3) try to extract "Main-Class" property from manifest
> >>>>>>>>>    -> if not found throw exception (this happens also, if no manifest
> >>>>>>>>>       file is found at all)
> >>>>>>>>>
> >>>>>>>>> 4) check if entry point class implements "Program" interface
> >>>>>>>>>    if yes, goto 6
> >>>>>>>>> 5) check if entry point class provides "public static void main(String[]
> >>>>>>>>>    args)" method
> >>>>>>>>>    -> if not, throw exception
> >>>>>>>>>
> >>>>>>>>> 6) execute program (ie, show plan/info or really run it)
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I also "discovered" the interface "ProgramDescription" with a single
> >>>>>>>>> method "String getDescription()". Even if some examples implement this
> >>>>>>>>> interface (and use it in the example itself), Flink basically ignores
> >>>>>>>>> it... From the CLI there is no way to get this info, and the WebUI does
> >>>>>>>>> actually get it if present, however, doesn't show it anywhere...
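
The lookup order in steps 1 to 3 and the main-method check in step 5 can be sketched with plain JDK classes. This is a hedged illustration of the order described in the quoted message, not Flink's actual implementation: the class `EntryPointResolver` and both method names are hypothetical, and step 4 (the check against Flink's "Program" interface) is left out so that the sketch compiles without Flink on the classpath. Only `java.util.jar.Manifest`, `java.util.jar.Attributes` and the reflection calls are real JDK APIs.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

/** Hypothetical sketch of the described lookup order; not Flink code. */
public class EntryPointResolver {

    /** Steps 1-3: the "-c" flag first, then "program-class", then "Main-Class". */
    public static String resolveEntryClassName(String cFlagValue, Manifest manifest) {
        if (cFlagValue != null && !cFlagValue.isEmpty()) {
            return cFlagValue; // step 1
        }
        if (manifest != null) {
            Attributes attrs = manifest.getMainAttributes();
            String programClass = attrs.getValue("program-class"); // step 2
            if (programClass != null) {
                return programClass.trim();
            }
            String mainClass = attrs.getValue("Main-Class"); // step 3
            if (mainClass != null) {
                return mainClass.trim();
            }
        }
        // Also reached when the jar has no manifest at all.
        throw new IllegalArgumentException(
                "No entry point: no -c flag and no program-class/Main-Class manifest entry");
    }

    /** Step 5: does the class declare "public static void main(String[] args)"? */
    public static boolean hasMainMethod(Class<?> candidate) {
        try {
            // getMethod only returns public methods.
            Method main = candidate.getMethod("main", String[].class);
            return Modifier.isStatic(main.getModifiers()) && main.getReturnType() == void.class;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }
}
```

Because the "-c" value is checked before the manifest is read, a jar with a "Main-Class" entry can still be started with a different class, which matches the precedence described in step 1.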
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I think it would be nice, if we would extend the following functions:
> >>>>>>>>>
> >>>>>>>>>  - extend the possibility to specify multiple entry classes in
> >>>>>>>>>    "program-class" or "Main-Class" -> in this case, the user needs to use
> >>>>>>>>>    "-c" flag to pick program to run every time
> >>>>>>>>>
> >>>>>>>>>  - add a CLI option that allows the user to see what entry point classes
> >>>>>>>>>    are available
> >>>>>>>>>    for this, consider
> >>>>>>>>>      a) "program-class" entry
> >>>>>>>>>      b) "Main-Class" entry
> >>>>>>>>>      c) if neither is found, scan jar-file for classes implementing
> >>>>>>>>>         "Program" interface
> >>>>>>>>>      d) if still not found, scan jar-file for classes with "main" method
> >>>>>>>>>
> >>>>>>>>>  - if user looks for entry point classes via CLI, check for
> >>>>>>>>>    "ProgramDescription" interface and show info
> >>>>>>>>>
> >>>>>>>>>  - extend WebUI to show all available entry-classes (pull request
> >>>>>>>>>    already there, for multiple entries in "program-class")
> >>>>>>>>>
> >>>>>>>>>  - extend WebUI to show "ProgramDescription" info
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> What do you think? I am not too sure about the "auto scan" of the jar
> >>>>>>>>> file if no manifest entry is provided. We might get some "fat jars" and
> >>>>>>>>> scanning might take some time.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> -Matthias
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On 05/19/2015 10:44 AM, Stephan Ewen wrote:
> >>>>>>>>>> We actually had an interface like that before ("Program"). It is still
> >>>>>>>>>> supported, but in all new programs we simply use the Java main method.
> >>>>>>>>>> The advantage is that most IDEs can create executable JARs
> >>>>>>>>>> automatically, setting the JAR manifest attributes, etc.
> >>>>>>>>>>
> >>>>>>>>>> The "Program" interface still works, though.
> >>>>>>>>>> Most tool classes (like "PackagedProgram") have a way to figure out
> >>>>>>>>>> whether the code uses "main()" or implements "Program" and call the
> >>>>>>>>>> right method.
> >>>>>>>>>>
> >>>>>>>>>> You can try and extend the program interface. If you want to consistently
> >>>>>>>>>> support multiple programs in one JAR file, you may need to adjust the
> >>>>>>>>>> util classes as well to deal with that.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, May 19, 2015 at 10:10 AM, Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Supporting an interface like this seems to be a nice idea. Any other
> >>>>>>>>>>> opinions on it?
> >>>>>>>>>>>
> >>>>>>>>>>> It seems to be some more work to get it done right. I don't want to
> >>>>>>>>>>> start working on it, before it's clear that it has a chance to be
> >>>>>>>>>>> included in Flink.
> >>>>>>>>>>>
> >>>>>>>>>>> @Flavio: I moved the discussion to dev mailing list (user list is not
> >>>>>>>>>>> appropriate for this discussion). Are you subscribed to it or should I
> >>>>>>>>>>> cc you in each mail?
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -Matthias
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On 05/19/2015 09:39 AM, Flavio Pompermaier wrote:
> >>>>>>>>>>>> Nice feature Matthias!
> >>>>>>>>>>>> My suggestion is to create a specific Flink interface to get also
> >>>>>>>>>>>> description of a job and standardize parameter passing.
> >>>>>>>>>>>> Then, somewhere (e.g. Manifest) you could specify the list of packages
> >>>>>>>>>>>> (or also directly the classes) to inspect with reflection to extract
> >>>>>>>>>>>> the list of available Flink jobs.
> >>>>>>>>>>>> Something like:
> >>>>>>>>>>>>
> >>>>>>>>>>>> public interface FlinkJob {
> >>>>>>>>>>>>
> >>>>>>>>>>>>   /** The name to display in the job submission UI or shell */
> >>>>>>>>>>>>   // e.g. "My Flink HelloWorld"
> >>>>>>>>>>>>   String getDisplayName();
> >>>>>>>>>>>>   // e.g. "This program does this and that etc.."
> >>>>>>>>>>>>   String getDescription();
> >>>>>>>>>>>>   // e.g. <0,Integer,"An integer representing my first param">,
> >>>>>>>>>>>>   //      <1,String,"A string representing my second param">
> >>>>>>>>>>>>   List> paramDescription;
> >>>>>>>>>>>>   /** Set up the flink job in the passed ExecutionEnvironment */
> >>>>>>>>>>>>   ExecutionEnvironment config(ExecutionEnvironment env);
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Sun, May 17, 2015 at 10:38 PM, Matthias J. Sax <mjsax@informatik.hu-berlin.de> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I like the idea that Flink's WebClient can show different plans for
> >>>>>>>>>>>>> different jobs within a single jar file.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> I prepared a prototype for this feature. You can find it here:
> >>>>>>>>>>>>> https://github.com/mjsax/flink/tree/multipleJobsWebUI
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> To test the feature, you need to prepare a jar file that contains the
> >>>>>>>>>>>>> code of multiple programs and specify each entry class in the manifest
> >>>>>>>>>>>>> file as comma separated values in "program-class" line.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Feedback is welcome. :)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> -Matthias
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On 05/08/2015 03:08 PM, Flavio Pompermaier wrote:
> >>>>>>>>>>>>>> Thank you all for the support!
> >>>>>>>>>>>>>> It will be a really nice feature if the web client could be able to
> >>>>>>>>>>>>>> show me the list of Flink jobs within my jar..
> >>>>>>>>>>>>>> it should be sufficient to mark them with a special annotation and
> >>>>>>>>>>>>>> inspect the classes within the jar..
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Fri, May 8, 2015 at 3:03 PM, Malte Schwarzer wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     Hi Flavio,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     you also can put each job in a single class and use the -c
> >>>>>>>>>>>>>>     parameter to execute jobs separately:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     /bin/flink run -c com.myflinkjobs.JobA /path/to/jar/multiplejobs.jar
> >>>>>>>>>>>>>>     /bin/flink run -c com.myflinkjobs.JobB /path/to/jar/multiplejobs.jar
> >>>>>>>>>>>>>>     ...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     Cheers
> >>>>>>>>>>>>>>     Malte
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     From: Robert Metzger <rmetzger@apache.org>
> >>>>>>>>>>>>>>     Reply-To: user@flink.apache.org
> >>>>>>>>>>>>>>     Date: Friday, 8 May 2015 14:57
> >>>>>>>>>>>>>>     To: user@flink.apache.org
> >>>>>>>>>>>>>>     Subject: Re: Package multiple jobs in a single jar
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     Hi Flavio,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     the pom from our quickstart is a good reference:
> >>>>>>>>>>>>>>     https://github.com/apache/flink/blob/master/flink-quickstart/flink-quickstart-java/src/main/resources/archetype-resources/pom.xml
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>     On Fri, May 8, 2015 at 2:53 PM, Flavio Pompermaier wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>         Ok, get it.
> >>>>>>>>>>>>>>         And is there a reference pom.xml for shading my application
> >>>>>>>>>>>>>>         into one fat-jar? Which flink dependencies can I exclude?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>         On Fri, May 8, 2015 at 1:05 PM, Fabian Hueske <fhueske@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>             I didn't say that the main should return the
> >>>>>>>>>>>>>>             ExecutionEnvironment.
> >>>>>>>>>>>>>>             You can define and execute as many programs in a main
> >>>>>>>>>>>>>>             function as you like.
> >>>>>>>>>>>>>>             The program can be defined somewhere else, e.g., in a
> >>>>>>>>>>>>>>             function that receives an ExecutionEnvironment and
> >>>>>>>>>>>>>>             attaches a program such as
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>             public void buildMyProgram(ExecutionEnvironment env) {
> >>>>>>>>>>>>>>               DataSet<String> lines = env.readTextFile(...);
> >>>>>>>>>>>>>>               // do something
> >>>>>>>>>>>>>>               lines.writeAsText(...);
> >>>>>>>>>>>>>>             }
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>             That method could be invoked from main():
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>             psv main() {
> >>>>>>>>>>>>>>               ExecutionEnv env = ...
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>               if(...) {
> >>>>>>>>>>>>>>                 buildMyProgram(env);
> >>>>>>>>>>>>>>               }
> >>>>>>>>>>>>>>               else {
> >>>>>>>>>>>>>>                 buildSomeOtherProg(env);
> >>>>>>>>>>>>>>               }
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>               env.execute();
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>               // run some more programs
> >>>>>>>>>>>>>>             }
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>             2015-05-08 12:56 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                 Hi Fabian,
> >>>>>>>>>>>>>>                 thanks for the response.
> >>>>>>>>>>>>>>                 So my mains should be converted in a method returning
> >>>>>>>>>>>>>>                 the ExecutionEnvironment.
> >>>>>>>>>>>>>>                 However I think that it will be very nice to have a
> >>>>>>>>>>>>>>                 syntax like the one of the Hadoop ProgramDriver to
> >>>>>>>>>>>>>>                 define jobs to invoke from a single root class.
> >>>>>>>>>>>>>>                 Do you think it could be useful?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                 On Fri, May 8, 2015 at 12:42 PM, Fabian Hueske <fhueske@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                     You can easily have multiple Flink programs in a
> >>>>>>>>>>>>>>                     single JAR file.
> >>>>>>>>>>>>>>                     A program is defined using an ExecutionEnvironment
> >>>>>>>>>>>>>>                     and executed when you call
> >>>>>>>>>>>>>>                     ExecutionEnvironment.execute().
> >>>>>>>>>>>>>>                     Where and how you do that does not matter.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                     You can for example implement a main function
> >>>>>>>>>>>>>>                     such as:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                     public static void main(String... args) {
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                       if (today == Monday) {
> >>>>>>>>>>>>>>                         ExecutionEnvironment env = ...
> >>>>>>>>>>>>>>                         // define Monday prog
> >>>>>>>>>>>>>>                         env.execute()
> >>>>>>>>>>>>>>                       }
> >>>>>>>>>>>>>>                       else {
> >>>>>>>>>>>>>>                         ExecutionEnvironment env = ...
> >>>>>>>>>>>>>>                         // define other prog
> >>>>>>>>>>>>>>                         env.execute()
> >>>>>>>>>>>>>>                       }
> >>>>>>>>>>>>>>                     }
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                     2015-05-08 11:41 GMT+02:00 Flavio Pompermaier <pompermaier@okkam.it>:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                         Hi to all,
> >>>>>>>>>>>>>>                         is there any way to keep multiple jobs in a jar
> >>>>>>>>>>>>>>                         and then choose at runtime the one to execute
> >>>>>>>>>>>>>>                         (like what ProgramDriver does in Hadoop)?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>                         Best,
> >>>>>>>>>>>>>>                         Flavio
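
The "single root class" idea from the start of this thread (one main() that chooses which program to build and execute) can be sketched as a small name-to-job registry. This is a minimal sketch under stated assumptions: `MultiJobDriver`, `register` and `run` are hypothetical names and belong to neither Flink nor Hadoop, and the jobs are plain `Consumer<String[]>` callbacks so that the example runs without Flink on the classpath. In a real job each callback would create its own ExecutionEnvironment, define the program and call execute().

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical dispatcher: the first argument selects the job, the rest is passed on. */
public class MultiJobDriver {

    // LinkedHashMap keeps registration order for the usage message.
    private final Map<String, Consumer<String[]>> jobs = new LinkedHashMap<>();

    public MultiJobDriver register(String name, Consumer<String[]> job) {
        jobs.put(name, job);
        return this;
    }

    public void run(String... args) {
        if (args.length == 0 || !jobs.containsKey(args[0])) {
            throw new IllegalArgumentException(
                    "Usage: <job> [args]; available jobs: " + jobs.keySet());
        }
        // Strip the job name and hand the remaining arguments to the selected job.
        jobs.get(args[0]).accept(Arrays.copyOfRange(args, 1, args.length));
    }
}
```

A root class would register each job once in its main() and then call `run(args)`, so the jar needs only one "Main-Class" entry and the job is chosen by the first command-line argument.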