flink-dev mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: Package multiple jobs in a single jar
Date Wed, 27 May 2015 09:18:16 GMT
Hi Matthias,

I understand your point about "advertising" the interfaces but there is so
much stuff to be advertised :). Honestly, I think ProgramDescription
doesn't add much value although it is kind of neat. Parameters can be
described in the code or by displaying a help message. However, I'm in
favor of making it easier to list all executable classes in a JAR.
Therefore, I like your proposed changes. I just don't see much of a use of
the Program or ProgramDescription interface in the examples. That's just my
personal opinion.

Best regards,
Max

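A minimal sketch of how listing the executable classes in a JAR could look, without executing anything. This is not Flink's actual implementation: the class name EntryPointLister, its methods, and the nested JobA/Helper classes are hypothetical and only serve as illustration. It uses plain JDK reflection and java.util.jar.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Flink: lists classes that could serve as entry points.
final class EntryPointLister {

    /** Returns true if the class offers "public static void main(String[] args)". */
    static boolean hasMainMethod(Class<?> clazz) {
        try {
            Method main = clazz.getMethod("main", String[].class);
            int mods = main.getModifiers();
            return Modifier.isPublic(mods)
                    && Modifier.isStatic(mods)
                    && main.getReturnType() == void.class;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Converts a JAR entry name such as "a/b/C.class" to "a.b.C"; returns null for non-class entries. */
    static String toClassName(String entryName) {
        if (!entryName.endsWith(".class")) {
            return null;
        }
        return entryName.substring(0, entryName.length() - ".class".length()).replace('/', '.');
    }

    /** Scans the JAR and returns the names of all classes with a main method (listing only, no execution). */
    static List<String> listEntryPoints(JarFile jar, ClassLoader loader) {
        List<String> result = new ArrayList<>();
        for (JarEntry entry : Collections.list(jar.entries())) {
            String name = toClassName(entry.getName());
            if (name == null) {
                continue;
            }
            try {
                // initialize=false: static initializers of scanned classes are not run
                Class<?> clazz = Class.forName(name, false, loader);
                if (hasMainMethod(clazz)) {
                    result.add(name);
                }
            } catch (ClassNotFoundException | LinkageError e) {
                // class cannot be loaded with this class loader; skip it in the listing
            }
        }
        return result;
    }

    // Two hypothetical classes used only to demonstrate hasMainMethod().
    static final class JobA {
        public static void main(String[] args) { }
    }

    static final class Helper { }
}
```

As noted earlier in the thread, scanning a fat jar this way may take some time, and main methods that exist only for testing will show up in the listing as well.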

On Tue, May 26, 2015 at 5:10 PM, Flavio Pompermaier <pompermaier@okkam.it>
wrote:

> I agree with Matthias, I didn't know about the ProgramDescription and Program
> interfaces because they are not advertised anywhere.
>
> On Tue, May 26, 2015 at 5:01 PM, Matthias J. Sax <
> mjsax@informatik.hu-berlin.de> wrote:
>
> > I see your point.
> >
> > However, right now only a few people are aware of the "ProgramDescription"
> > interface. If we want to "advertise" it, it should be used in (at
> > least) a few examples. Otherwise, people will never use it, and the
> > changes I plan to apply are kind of useless. I would even claim that
> > the interface should be removed completely in this case...
> >
> >
> > On 05/26/2015 03:31 PM, Maximilian Michels wrote:
> > > Sorry, my bad. Yes, it is helpful to have a separate program and parameter
> > > description in ProgramDescription. I'm not sure if it adds much value to
> > > implement ProgramDescription in the examples. It introduces verbosity and
> > > might give the impression that you have to implement ProgramDescription in
> > > your Flink job.
> > >
> > > On Tue, May 26, 2015 at 12:00 PM, Matthias J. Sax <
> > > mjsax@informatik.hu-berlin.de> wrote:
> > >
> > >> Hi Max,
> > >>
> > >> thanks for your feedback. I guess you are confusing the interfaces "Program"
> > >> and "ProgramDescription". With "Program", the main method is
> > >> replaced by "getPlan(...)". However, "ProgramDescription" only adds the
> > >> method "getDescription()", which returns a string that explains the usage
> > >> of the program (ie, short description, expected parameters).
> > >>
> > >> Thus, adding "ProgramDescription" to the examples does not change the
> > >> examples; the main method will still be used. It only adds the ability
> > >> for a program to "explain" itself (ie, give meta info). Furthermore,
> > >> "ProgramDescription" is also not related to the new "ParameterTool".
> > >>
> > >> -Matthias
> > >>
> > >> On 05/26/2015 11:46 AM, Maximilian Michels wrote:
> > >>> I don't think `getDisplayName()` is necessary either. The class name and
> > >>> the description string should be fine. Adding ProgramDescription to the
> > >>> examples is not necessary; as already pointed out, using the main method is
> > >>> more convenient for most users. As far as I know, the idea of the
> > >>> ParameterTool was to use it only in the user code and not automatically
> > >>> handle parameters.
> > >>>
> > >>> Changing the interface would be quite API breaking but since most programs
> > >>> use the main method, IMHO we could do it.
> > >>>
> > >>> On Fri, May 22, 2015 at 10:09 PM, Matthias J. Sax <
> > >>> mjsax@informatik.hu-berlin.de> wrote:
> > >>>
> > >>>> Makes sense to me. :)
> > >>>>
> > >>>> One more thing: What about extending the "ProgramDescription" interface
> > >>>> to have multiple methods as Flavio suggested (with the config(...)
> > >>>> method that should be handled by the ParameterTool)?
> > >>>>
> > >>>>> public interface FlinkJob {
> > >>>>>
> > >>>>> /** The name to display in the job submission UI or shell */
> > >>>>> //e.g. "My Flink HelloWorld"
> > >>>>> String getDisplayName();
> > >>>>> //e.g. "This program does this and that etc.."
> > >>>>> String getDescription();
> > >>>>> //e.g. <0,Integer,"An integer representing my first param">,
> > >>>>> //     <1,String,"A string representing my second param">
> > >>>>> List<Tuple3<Integer, TypeInfo, String>> getParamDescription();
> > >>>>> /** Set up the flink job in the passed ExecutionEnvironment */
> > >>>>> ExecutionEnvironment config(ExecutionEnvironment env);
> > >>>>> }
> > >>>>
> > >>>> Right now, the interface is used only a couple of times in Flink's code
> > >>>> base, so it would not be a problem to update those classes. However, it
> > >>>> could break external code that uses the interface already (even if I
> > >>>> doubt that the interface is well known and used often [or at all]).
> > >>>>
> > >>>> I personally don't think that "getDisplayName()" is too helpful.
> > >>>> Splitting the program description and the parameter description seems to
> > >>>> be useful. For example, if wrong parameters are provided, the parameter
> > >>>> description can be included in the error message. If program+parameter
> > >>>> description is given in a single string, this is not possible. But this
> > >>>> is only a minor issue of course.
> > >>>>
> > >>>> Maybe we should also add the interface to the current Flink examples,
> > >>>> to make people more aware of it. Is there any documentation on the web
> > >>>> site?
> > >>>>
> > >>>>
> > >>>> -Matthias
> > >>>>
> > >>>>
> > >>>>
> > >>>> On 05/22/2015 09:43 PM, Robert Metzger wrote:
> > >>>>> Thank you for working on this.
> > >>>>> My responses are inline below:
> > >>>>>
> > >>>>> (Flavio)
> > >>>>>
> > >>>>>> My suggestion is to create a specific Flink interface to get also
> > >>>>>> description of a job and standardize parameter passing.
> > >>>>>
> > >>>>>
> > >>>>> I've recently merged the ParameterTool which is solving the "standardize
> > >>>>> parameter passing" problem (at least it presents a best practice):
> > >>>>>
> > >>>>> http://ci.apache.org/projects/flink/flink-docs-master/apis/best_practices.html#parsing-command-line-arguments-and-passing-them-around-in-your-flink-application
> > >>>>>
> > >>>>> Regarding the description: Maybe we can use the "ProgramDescription"
> > >>>>> interface for getting a string describing the program in the web frontend.
> > >>>>>
> > >>>>> (Matthias)
> > >>>>>
> > >>>>>> I don't want to start working on it, before it's clear that it has a
> > >>>>>> chance to be included in Flink.
> > >>>>>
> > >>>>>
> > >>>>> I think the changes discussed here won't change the current behavior, but
> > >>>>> they add new functionality which can make the life of our users easier,
> > >>>>> so I'll vote to include your changes (given they meet our quality
> > >>>>> standards).
> > >>>>>
> > >>>>>
> > >>>>>> If multiple classes implement "Program" interface an exception should be
> > >>>>>> thrown (I think that would make sense). However, I am not sure what
> > >>>>>> "good" behavior is, if a single "Program"-class is found and an
> > >>>>>> additional main-method class.
> > >>>>>>   - should "Program"-class be executed (ie, "overwrite" main-method class)
> > >>>>>>   - or, better to throw an exception ?
> > >>>>>
> > >>>>>
> > >>>>> I would give a class implementing "Program" priority over a random main()
> > >>>>> method in a random class.
> > >>>>> Maybe printing a WARN log message informing the user that the "Program"
> > >>>>> class has been chosen.
> > >>>>>
> > >>>>>
> > >>>>>> If no "Program"-class is found, but a single main-method class, Flink
> > >>>>>> could execute using main method. But I am not sure either, if this is
> > >>>>>> "good" behavior. If multiple main-method classes are present, throwing
> > >>>>>> an exception is the only way to go, I guess.
> > >>>>>
> > >>>>>
> > >>>>> I think the best effort approach "one class with main() found" is good. In
> > >>>>> case of multiple main methods, a helpful exception is the best approach in
> > >>>>> my opinion.
> > >>>>>
> > >>>>>
> > >>>>>> If the manifest contains "program-class" or "Main-Class" entry,
> > >>>>>> should we check the jar file right away if the specified class is there?
> > >>>>>> Right now, no check is performed and an error occurs if the user tries
> > >>>>>> to execute the job.
> > >>>>>
> > >>>>>
> > >>>>> I'd say the current approach is sufficient. There is no need to have a
> > >>>>> special code path which is doing the check.
> > >>>>> I think the error message will be pretty similar in both cases and I fear
> > >>>>> that this additional code could also introduce new bugs ;)
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> On Fri, May 22, 2015 at 9:06 PM, Matthias J. Sax <
> > >>>>> mjsax@informatik.hu-berlin.de> wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>>
> > >>>>>> two more thoughts to this discussion:
> > >>>>>>
> > >>>>>>  1) looking at the commit history of "CliFrontend", I found the
> > >>>>>> following closed issue and the closing pull request
> > >>>>>>     * https://issues.apache.org/jira/browse/FLINK-1095
> > >>>>>>     * https://github.com/apache/flink/pull/238
> > >>>>>> It stands in opposition to Flavio's request to have a job description. Any
> > >>>>>> comment on this? Should a removed feature be re-introduced? If not, I
> > >>>>>> would suggest removing the "ProgramDescription" interface completely.
> > >>>>>>
> > >>>>>>  2) If the manifest contains "program-class" or "Main-Class" entry,
> > >>>>>> should we check the jar file right away if the specified class is there?
> > >>>>>> Right now, no check is performed and an error occurs if the user tries
> > >>>>>> to execute the job.
> > >>>>>>
> > >>>>>>
> > >>>>>> -Matthias
> > >>>>>>
> > >>>>>>
> > >>>>>> On 05/22/2015 12:06 PM, Matthias J. Sax wrote:
> > >>>>>>> Thanks for your feedback.
> > >>>>>>>
> > >>>>>>> I agree on the main method "problem". For scanning and listing all stuff
> > >>>>>>> that is found it's fine.
> > >>>>>>>
> > >>>>>>> The tricky question is the automatic invocation mechanism, if "-c" flag
> > >>>>>>> is not used, and no manifest program-class or Main-Class entry is found.
> > >>>>>>>
> > >>>>>>> If multiple classes implement "Program" interface an exception should be
> > >>>>>>> thrown (I think that would make sense). However, I am not sure what
> > >>>>>>> "good" behavior is, if a single "Program"-class is found and an
> > >>>>>>> additional main-method class.
> > >>>>>>>   - should "Program"-class be executed (ie, "overwrite" main-method class)
> > >>>>>>>   - or, better to throw an exception ?
> > >>>>>>>
> > >>>>>>> If no "Program"-class is found, but a single main-method class, Flink
> > >>>>>>> could execute using main method. But I am not sure either, if this is
> > >>>>>>> "good" behavior. If multiple main-method classes are present, throwing
> > >>>>>>> an exception is the only way to go, I guess.
> > >>>>>>>
> > >>>>>>> To sum up: Should Flink consider main-method classes for automatic
> > >>>>>>> invocation, or should it be required for main-method classes to be
> > >>>>>>> listed in the "program-class" or "Main-Class" manifest parameter (to
> > >>>>>>> enable them for automatic invocation)?
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> -Matthias
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> On 05/22/2015 09:56 AM, Maximilian Michels wrote:
> > >>>>>>>> Hi Matthias,
> > >>>>>>>>
> > >>>>>>>> Thank you for taking the time to analyze Flink's invocation behavior. I
> > >>>>>>>> like your proposal. I'm not sure whether it is a good idea to scan the
> > >>>>>>>> entire JAR for main methods. Sometimes, main methods are added solely for
> > >>>>>>>> testing purposes and don't really serve any practical use. However, if
> > >>>>>>>> you're already going through the JAR to find the ProgramDescription
> > >>>>>>>> interface, then you might look for main methods as well. As long as it is
> > >>>>>>>> just a listing without execution, that should be fine.
> > >>>>>>>>
> > >>>>>>>> Best regards,
> > >>>>>>>> Max
> > >>>>>>>>
> > >>>>>>>> On Thu, May 21, 2015 at 3:43 PM, Matthias J. Sax <
> > >>>>>>>> mjsax@informatik.hu-berlin.de> wrote:
> > >>>>>>>>
> > >>>>>>>>> Hi,
> > >>>>>>>>>
> > >>>>>>>>> I had a look into the current workflow of Flink with regard to the
> > >>>>>>>>> processing steps of a jar file.
> > >>>>>>>>>
> > >>>>>>>>> If I got it right it works as follows (not sure if this is documented
> > >>>>>>>>> somewhere):
> > >>>>>>>>>
> > >>>>>>>>> 1) check, if "-c" flag is used to set program entry point
> > >>>>>>>>>    if yes, goto 4
> > >>>>>>>>> 2) try to extract "program-class" property from manifest
> > >>>>>>>>>    (if found goto 4)
> > >>>>>>>>> 3) try to extract "Main-Class" property from manifest
> > >>>>>>>>>    -> if not found, throw exception (this happens also, if no manifest
> > >>>>>>>>>       file is found at all)
> > >>>>>>>>>
> > >>>>>>>>> 4) check if entry point class implements "Program" interface
> > >>>>>>>>>    if yes, goto 6
> > >>>>>>>>> 5) check if entry point class provides "public static void main(String[]
> > >>>>>>>>>    args)" method
> > >>>>>>>>>    -> if not, throw exception
> > >>>>>>>>>
> > >>>>>>>>> 6) execute program (ie, show plan/info or really run it)
> > >>>>>>>>>
> > >>>>>>>>> I also "discovered" the interface "ProgramDescription" with a single
> > >>>>>>>>> method "String getDescription()". Even if some examples implement this
> > >>>>>>>>> interface (and use it in the example itself), Flink basically ignores
> > >>>>>>>>> it... From the CLI there is no way to get this info, and the WebUI does
> > >>>>>>>>> actually get it if present, however, doesn't show it anywhere...
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> I think it would be nice, if we would extend the following functions:
> > >>>>>>>>>
> > >>>>>>>>>  - extend the possibility to specify multiple entry classes in
> > >>>>>>>>>    "program-class" or "Main-Class" -> in this case, the user needs to use
> > >>>>>>>>>    "-c" flag to pick program to run every time
> > >>>>>>>>>
> > >>>>>>>>>  - add a CLI option that allows the user to see what entry point classes
> > >>>>>>>>>    are available
> > >>>>>>>>>    for this, consider
> > >>>>>>>>>      a) "program-class" entry
> > >>>>>>>>>      b) "Main-Class" entry
> > >>>>>>>>>      c) if neither is found, scan jar-file for classes implementing
> > >>>>>>>>>         "Program" interface
> > >>>>>>>>>      d) if still not found, scan jar-file for classes with "main" method
> > >>>>>>>>>
> > >>>>>>>>>  - if user looks for entry point classes via CLI, check for
> > >>>>>>>>>    "ProgramDescription" interface and show info
> > >>>>>>>>>
> > >>>>>>>>>  - extend WebUI to show all available entry-classes (pull request
> > >>>>>>>>>    already there, for multiple entries in "program-class")
> > >>>>>>>>>
> > >>>>>>>>>  - extend WebUI to show "ProgramDescription" info
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> What do you think? I am not too sure about the "auto scan" of the jar
> > >>>>>>>>> file if no manifest entry is provided. We might get some "fat jars" and
> > >>>>>>>>> scanning might take some time.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> -Matthias
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> On 05/19/2015 10:44 AM, Stephan Ewen wrote:
> > >>>>>>>>>> We actually had an interface like that before ("Program"). It is still
> > >>>>>>>>>> supported, but in all new programs we simply use the Java main method.
> > >>>>>>>>>> The advantage is that most IDEs can create executable JARs
> > >>>>>>>>>> automatically, setting the JAR manifest attributes, etc.
> > >>>>>>>>>>
> > >>>>>>>>>> The "Program" interface still works, though. Most tool classes (like
> > >>>>>>>>>> "PackagedProgram") have a way to figure out whether the code uses
> > >>>>>>>>>> "main()" or implements "Program" and call the right method.
> > >>>>>>>>>>
> > >>>>>>>>>> You can try and extend the program interface. If you want to consistently
> > >>>>>>>>>> support multiple programs in one JAR file, you may need to adjust the
> > >>>>>>>>>> util classes as well to deal with that.
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>> On Tue, May 19, 2015 at 10:10 AM, Matthias J. Sax <
> > >>>>>>>>>> mjsax@informatik.hu-berlin.de> wrote:
> > >>>>>>>>>>
> > >>>>>>>>>>> Supporting an interface like this seems to be a nice idea. Any other
> > >>>>>>>>>>> opinions on it?
> > >>>>>>>>>>>
> > >>>>>>>>>>> It seems to be some more work to get it done right. I don't want to
> > >>>>>>>>>>> start working on it, before it's clear that it has a chance to be
> > >>>>>>>>>>> included in Flink.
> > >>>>>>>>>>>
> > >>>>>>>>>>> @Flavio: I moved the discussion to dev mailing list (user list is not
> > >>>>>>>>>>> appropriate for this discussion). Are you subscribed to it or should I
> > >>>>>>>>>>> cc you in each mail?
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> -Matthias
> > >>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> On 05/19/2015 09:39 AM, Flavio Pompermaier wrote:
> > >>>>>>>>>>>> Nice feature Matthias!
> > >>>>>>>>>>>> My suggestion is to create a specific Flink interface to also get a
> > >>>>>>>>>>>> description of a job and standardize parameter passing.
> > >>>>>>>>>>>> Then, somewhere (e.g. Manifest) you could specify the list of packages
> > >>>>>>>>>>>> (or also directly the classes) to inspect with reflection to extract
> > >>>>>>>>>>>> the list of available Flink jobs.
> > >>>>>>>>>>>> Something like:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> public interface FlinkJob {
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> /** The name to display in the job submission UI or shell */
> > >>>>>>>>>>>> //e.g. "My Flink HelloWorld"
> > >>>>>>>>>>>> String getDisplayName();
> > >>>>>>>>>>>> //e.g. "This program does this and that etc.."
> > >>>>>>>>>>>> String getDescription();
> > >>>>>>>>>>>> //e.g. <0,Integer,"An integer representing my first param">,
> > >>>>>>>>>>>> //     <1,String,"A string representing my second param">
> > >>>>>>>>>>>> List<Tuple3<Integer, TypeInfo, String>> getParamDescription();
> > >>>>>>>>>>>> /** Set up the flink job in the passed ExecutionEnvironment */
> > >>>>>>>>>>>> ExecutionEnvironment config(ExecutionEnvironment env);
> > >>>>>>>>>>>> }
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> What do you think?
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Sun, May 17, 2015 at 10:38 PM, Matthias J. Sax <
> > >>>>>>>>>>>> mjsax@informatik.hu-berlin.de> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> Hi,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I like the idea that Flink's WebClient can show different plans for
> > >>>>>>>>>>>>> different jobs within a single jar file.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> I prepared a prototype for this feature. You can find it here:
> > >>>>>>>>>>>>> https://github.com/mjsax/flink/tree/multipleJobsWebUI
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> To test the feature, you need to prepare a jar file, that contains the
> > >>>>>>>>>>>>> code of multiple programs and specify each entry class in the manifest
> > >>>>>>>>>>>>> file as comma separated values in "program-class" line.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Feedback is welcome. :)
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> -Matthias
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On 05/08/2015 03:08 PM, Flavio Pompermaier wrote:
> > >>>>>>>>>>>>>> Thank you all for the support!
> > >>>>>>>>>>>>>> It will be a really nice feature if the web client could be able to
> > >>>>>>>>>>>>>> show me the list of Flink jobs within my jar..
> > >>>>>>>>>>>>>> it should be sufficient to mark them with a special annotation and
> > >>>>>>>>>>>>>> inspect the classes within the jar..
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Fri, May 8, 2015 at 3:03 PM, Malte Schwarzer <ms@mieo.de> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     Hi Flavio,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     you also can put each job in a single class and use the -c parameter
> > >>>>>>>>>>>>>>     to execute jobs separately:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     /bin/flink run -c com.myflinkjobs.JobA /path/to/jar/multiplejobs.jar
> > >>>>>>>>>>>>>>     /bin/flink run -c com.myflinkjobs.JobB /path/to/jar/multiplejobs.jar
> > >>>>>>>>>>>>>>     …
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     Cheers
> > >>>>>>>>>>>>>>     Malte
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     From: Robert Metzger <rmetzger@apache.org>
> > >>>>>>>>>>>>>>     Reply-To: <user@flink.apache.org>
> > >>>>>>>>>>>>>>     Date: Friday, 8 May 2015 14:57
> > >>>>>>>>>>>>>>     To: "user@flink.apache.org" <user@flink.apache.org>
> > >>>>>>>>>>>>>>     Subject: Re: Package multiple jobs in a single jar
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     Hi Flavio,
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     the pom from our quickstart is a good
> > >>>>>>>>>>>>>>     reference:
> > >>>>>>>>>>>>>>     https://github.com/apache/flink/blob/master/flink-quickstart/flink-quickstart-java/src/main/resources/archetype-resources/pom.xml
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>     On Fri, May 8, 2015 at 2:53 PM, Flavio Pompermaier
> > >>>>>>>>>>>>>>     <pompermaier@okkam.it> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>         Ok, got it.
> > >>>>>>>>>>>>>>         And is there a reference pom.xml for shading my application into
> > >>>>>>>>>>>>>>         one fat-jar? Which flink dependencies can I exclude?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>         On Fri, May 8, 2015 at 1:05 PM, Fabian Hueske
> > >>>>>>>>>>>>>>         <fhueske@gmail.com> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>             I didn't say that the main should return the
> > >>>>>>>>>>>>>>             ExecutionEnvironment.
> > >>>>>>>>>>>>>>             You can define and execute as many programs in a main
> > >>>>>>>>>>>>>>             function as you like.
> > >>>>>>>>>>>>>>             The program can be defined somewhere else, e.g., in a
> > >>>>>>>>>>>>>>             function that receives an ExecutionEnvironment and attaches
> > >>>>>>>>>>>>>>             a program such as
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>             public static void buildMyProgram(ExecutionEnvironment env) {
> > >>>>>>>>>>>>>>               DataSet<String> lines = env.readTextFile(...);
> > >>>>>>>>>>>>>>               // do something
> > >>>>>>>>>>>>>>               lines.writeAsText(...);
> > >>>>>>>>>>>>>>             }
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>             That method could be invoked from main():
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>             public static void main(String[] args) throws Exception {
> > >>>>>>>>>>>>>>               ExecutionEnvironment env = ...;
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>               if (...) {
> > >>>>>>>>>>>>>>                 buildMyProgram(env);
> > >>>>>>>>>>>>>>               }
> > >>>>>>>>>>>>>>               else {
> > >>>>>>>>>>>>>>                 buildSomeOtherProg(env);
> > >>>>>>>>>>>>>>               }
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>               env.execute();
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>               // run some more programs
> > >>>>>>>>>>>>>>             }
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>             2015-05-08 12:56 GMT+02:00 Flavio Pompermaier
> > >>>>>>>>>>>>>>             <pompermaier@okkam.it>:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                 Hi Fabian,
> > >>>>>>>>>>>>>>                 thanks for the response.
> > >>>>>>>>>>>>>>                 So my mains should be converted into a method returning
> > >>>>>>>>>>>>>>                 the ExecutionEnvironment.
> > >>>>>>>>>>>>>>                 However, I think that it would be very nice to have a
> > >>>>>>>>>>>>>>                 syntax like the one of the Hadoop ProgramDriver to
> > >>>>>>>>>>>>>>                 define jobs to invoke from a single root class.
> > >>>>>>>>>>>>>>                 Do you think it could be useful?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                 On Fri, May 8, 2015 at 12:42 PM, Fabian Hueske
> > >>>>>>>>>>>>>>                 <fhueske@gmail.com> wrote:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                     You can easily have multiple Flink programs in a
> > >>>>>>>>>>>>>>                     single JAR file.
> > >>>>>>>>>>>>>>                     A program is defined using an ExecutionEnvironment
> > >>>>>>>>>>>>>>                     and executed when you call
> > >>>>>>>>>>>>>>                     ExecutionEnvironment.execute().
> > >>>>>>>>>>>>>>                     Where and how you do that does not matter.
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                     You can for example implement a main function such as:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                     public static void main(String... args) throws Exception {
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                       if (today == Monday) {
> > >>>>>>>>>>>>>>                         ExecutionEnvironment env = ...;
> > >>>>>>>>>>>>>>                         // define Monday prog
> > >>>>>>>>>>>>>>                         env.execute();
> > >>>>>>>>>>>>>>                       }
> > >>>>>>>>>>>>>>                       else {
> > >>>>>>>>>>>>>>                         ExecutionEnvironment env = ...;
> > >>>>>>>>>>>>>>                         // define other prog
> > >>>>>>>>>>>>>>                         env.execute();
> > >>>>>>>>>>>>>>                       }
> > >>>>>>>>>>>>>>                     }
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                     2015-05-08 11:41 GMT+02:00 Flavio Pompermaier
> > >>>>>>>>>>>>>>                     <pompermaier@okkam.it>:
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                         Hi to all,
> > >>>>>>>>>>>>>>                         is there any way to keep multiple jobs in a jar
> > >>>>>>>>>>>>>>                         and then choose at runtime the one to execute
> > >>>>>>>>>>>>>>                         (like what ProgramDriver does in Hadoop)?
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>                         Best,
> > >>>>>>>>>>>>>>                         Flavio

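The entry point lookup described earlier in the thread ("-c" flag first, then the "program-class" manifest entry, then "Main-Class", followed by the check for "Program" versus a main method) could be sketched roughly as follows. This is only an illustration of that order under stated assumptions, not the code in CliFrontend or PackagedProgram: EntryPointResolver, resolveEntryClass, invocationMode and the nested Program interface are hypothetical stand-ins, and only the java.util.jar calls are real JDK API.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical sketch of the lookup order discussed in the thread; not Flink code.
final class EntryPointResolver {

    /** Stand-in for Flink's "Program" interface, declared here only to keep the sketch self-contained. */
    interface Program { }

    enum Mode { PROGRAM, MAIN }

    /** Steps 1-3: "-c" value, then "program-class", then "Main-Class"; otherwise an exception. */
    static String resolveEntryClass(String cFlagValue, Manifest manifest) {
        if (cFlagValue != null && !cFlagValue.isEmpty()) {
            return cFlagValue;                                          // step 1
        }
        if (manifest != null) {
            Attributes attrs = manifest.getMainAttributes();
            String programClass = attrs.getValue("program-class");     // step 2
            if (programClass != null) {
                return programClass.trim();
            }
            String mainClass = attrs.getValue(Attributes.Name.MAIN_CLASS); // step 3
            if (mainClass != null) {
                return mainClass.trim();
            }
        }
        throw new IllegalArgumentException(
                "No entry point: use -c or set program-class / Main-Class in the manifest");
    }

    /** Steps 4-5: prefer the "Program" interface, fall back to a main method, otherwise an exception. */
    static Mode invocationMode(Class<?> entryClass) {
        if (Program.class.isAssignableFrom(entryClass)) {
            return Mode.PROGRAM;                                        // step 4
        }
        try {
            Method main = entryClass.getMethod("main", String[].class); // step 5
            if (Modifier.isStatic(main.getModifiers()) && main.getReturnType() == void.class) {
                return Mode.MAIN;
            }
        } catch (NoSuchMethodException e) {
            // fall through to the exception below
        }
        throw new IllegalArgumentException(
                entryClass.getName() + " neither implements Program nor has a main method");
    }
}
```

With multiple comma separated values in "program-class", as in the prototype, the value returned in step 2 would have to be split and the user would pick one entry via "-c"; that part is left out of the sketch.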