flink-dev mailing list archives

From Maximilian Michels <...@apache.org>
Subject Re: Advice on [FLINK-2021]: Rework examples to use new ParameterTool
Date Mon, 14 Sep 2015 10:13:33 GMT
Hi Behrouz,

It makes sense to change all of the examples, then. This should
pretty much be copy/paste once you have a default parameter parsing
configuration.
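
As a rough illustration of what such a shared parsing step could look like, here is a minimal, self-contained sketch. It is not Flink's ParameterTool: the class name ExampleArgs and its methods are hypothetical, and it only assumes the "--key value" style plus leading positional arguments discussed further down in this thread.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (NOT Flink's ParameterTool): splits program arguments
 * into positional values and "--key value" pairs.
 */
final class ExampleArgs {
    final List<String> positional = new ArrayList<>();
    final Map<String, String> named = new LinkedHashMap<>();

    static ExampleArgs parse(String[] args) {
        ExampleArgs result = new ExampleArgs();
        for (int i = 0; i < args.length; i++) {
            String arg = args[i];
            if (arg.startsWith("--") && arg.length() > 2) {
                // A named argument must be followed by its value.
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("Missing value for " + arg);
                }
                result.named.put(arg.substring(2), args[++i]);
            } else {
                // Anything else is positional, counted from the left.
                result.positional.add(arg);
            }
        }
        return result;
    }

    /** Returns the named value, or the default if the key was not given. */
    String get(String key, String defaultValue) {
        return named.getOrDefault(key, defaultValue);
    }

    /** Returns the named value parsed as an int, or the default if absent. */
    int getInt(String key, int defaultValue) {
        String value = named.get(key);
        return value == null ? defaultValue : Integer.parseInt(value);
    }
}
```

With something like this in one place, each example would only need a call such as ExampleArgs.parse(args) followed by lookups for "input", "output", and so on.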

It is a bit tricky to debug what is happening inside the
MiniYARNCluster. The logs are actually there (under target/) but they
are hard to read because of the cryptic container names. We could at
least improve that for situations where the mini cluster fails by
grepping through the logs to check for Exceptions.
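
A minimal sketch of that grepping step, assuming the logs are plain text files somewhere below a directory such as target/. The class LogExceptionScanner and its method are hypothetical names, not part of flink-yarn-tests:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: collects every log line mentioning "Exception". */
final class LogExceptionScanner {

    static List<String> findExceptionLines(Path logRoot) throws IOException {
        // Collect all regular files below the root, in a stable order.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(logRoot)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }

        List<String> hits = new ArrayList<>();
        for (Path file : files) {
            // ISO-8859-1 maps every byte to a character, so decoding never fails.
            List<String> lines = Files.readAllLines(file, StandardCharsets.ISO_8859_1);
            for (int i = 0; i < lines.size(); i++) {
                if (lines.get(i).contains("Exception")) {
                    // Report file, 1-based line number, and the matching line.
                    hits.add(file + ":" + (i + 1) + ": " + lines.get(i));
                }
            }
        }
        return hits;
    }
}
```

A failing test could print the returned lines in its failure message, which would sidestep the cryptic container directory names.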

Cheers,
Max



On Thu, Sep 10, 2015 at 3:45 PM, Behrouz Derakhshan
<behrouz.derakhshan@gmail.com> wrote:
> Well the problem is that "YARNSessionFIFOITCase" is testing Yarn by
> running the *word count* example using multiple jar files.
>
> So what happens is that the test is calling this:
>
> Runner runner = startWithArgs(new String[]{"run", "-m", "yarn-cluster",
>             "-yj", flinkUberjar.getAbsolutePath(),
>             "-yn", "1",
>             "-yjm", "768",
>             // test if the cutoff is passed correctly
>             "-yD", "yarn.heap-cutoff-ratio=0.5",
>             "-ytm", "1024",
>             "-ys", "2", // test requesting slots from YARN.
>             "--yarndetached", job,
>             tmpInFile.getAbsoluteFile().toString(),
>             tmpOutFolder.getAbsoluteFile().toString()},
>       "The Job has been submitted with JobID",
>       RunTypes.CLI_FRONTEND);
>
> For several jar files in different packages, that means if I want this test
> to pass, all the word count examples should use the same argument formats.
> All in all it was a bit confusing and took me a while to figure out why
> the tests were failing. I ran into the same problem specified here:
> https://issues.apache.org/jira/browse/FLINK-1601 , and the current logs do
> not specify what the underlying issue is; they just say "Runner thread died
> before the test was finished. Return value = 1".
>
> I think it is a good idea to improve the flink-yarn-tests package by adding
> more meaningful logs.
>
> On Thu, Sep 10, 2015 at 3:28 PM, Maximilian Michels <mxm@apache.org> wrote:
>
>> I think the primary concern was flink-examples but if you're on it,
>> you can also modify the other examples.
>>
>> On Thu, Sep 10, 2015 at 12:43 PM, Behrouz Derakhshan
>> <behrouz.derakhshan@gmail.com> wrote:
>> > Hi,
>> >
>> > So my understanding was that the changes are only meant for the
>> > flink-examples package. But each package has its own set of examples,
>> > and all of them have to be changed.
>> > Is that OK?
>> >
>> > @Ufuk: I agree, I'll create a ticket for adding Javadocs.
>> >
>> > BR,
>> > Behrouz
>> >
>> >
>> > On Wed, Sep 9, 2015 at 3:53 PM, Maximilian Michels <mxm@apache.org>
>> > wrote:
>> >
>> >> It would be nice to support both non-positional and positional
>> >> arguments. Like in
>> >>
>> >> > posarg1 posarg2 --nonpos1 nonpos1value --nonpos2 nonpos2value
>> >>
>> >> The arguments should also be named but should be expected at a fixed
>> >> position counting from the left ignoring non-positional arguments.
>> >>
>> >> For the time being, it would also be ok with me if we ported all
>> >> examples to non-positional arguments.
>> >>
>> >> On Fri, Sep 4, 2015 at 2:46 PM, Behrouz Derakhshan
>> >> <behrouz.derakhshan@gmail.com> wrote:
>> >> > Yes, I was referring mostly to blog posts and other websites and was
>> >> > wondering if breaking them is an issue or not.
>> >> > I have already created a subtask to add support for positional arguments
>> >> > (FLINK-2621 <https://issues.apache.org/jira/browse/FLINK-2621>), so the
>> >> > examples would be backward compatible.
>> >> > The problem with that is, we have to detect from the arguments to the
>> >> > program, if they are positional or key/value and parse them accordingly.
>> >> > But if everyone is OK with completely switching to ParameterTool and
>> >> > breaking the support for the old way of executing the examples, then my
>> >> > job would be also a lot easier.
>> >> >
>> >> >
>> >> >
>> >> > On Fri, Sep 4, 2015 at 2:34 PM, Robert Metzger <rmetzger@apache.org>
>> >> > wrote:
>> >> >
>> >> >> If you are referring to this training material (
>> >> >> https://github.com/dataArtisans/flink-training-exercises/blob/master/src/main/java/com/dataArtisans/flinkTraining/exercises/dataStreamJava/rideCleansing/RideCleansing.java
>> >> >> ),
>> >> >> some of the examples are actually already using the ParameterTool.
>> >> >>
>> >> >> The problem is probably websites, blog posts, etc. that show how to use
>> >> >> the Flink examples. But I think it's fine to break these. All example
>> >> >> jars contain the version number. If the way we pass arguments to the
>> >> >> examples changes between 0.9 and 0.10, that should be fine.
>> >> >>
>> >> >> I think using the ParameterTool for the examples will improve the
>> >> >> readability of the examples a lot. Right now, all examples have a
>> >> >> (copy-pasted) parseParameters() method, which is doing very simplistic
>> >> >> parameter parsing.
>> >> >>
>> >> >> The ParameterTool also allows showing the input parameters in the web
>> >> >> interface.
>> >> >>
>> >> >> So I'm voting for doing a breaking change and using parameters such as
>> >> >> "--input hdfs:/// --output hdfs:/// --iterations 15".
>> >> >>
>> >> >> On Fri, Sep 4, 2015 at 1:05 PM, Behrouz Derakhshan <
>> >> >> behrouz.derakhshan@gmail.com> wrote:
>> >> >>
>> >> >> > Will do.
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Behrouz
>> >> >> >
>> >> >> > On Fri, Sep 4, 2015 at 11:29 AM, Maximilian Michels <mxm@apache.org>
>> >> >> > wrote:
>> >> >> >
>> >> >> > > Hi Behrouz,
>> >> >> > >
>> >> >> > > I would create a new sub-task under the original issue that
>> >> >> > > introduced the ParameterTool:
>> >> >> > > https://issues.apache.org/jira/browse/FLINK-1525
>> >> >> > >
>> >> >> > > Cheers,
>> >> >> > > Max
>> >> >> > >
>> >> >> > > On Fri, Sep 4, 2015 at 11:17 AM, Behrouz Derakhshan
>> >> >> > > <behrouz.derakhshan@gmail.com> wrote:
>> >> >> > > > Hi Max,
>> >> >> > > >
>> >> >> > > > What you said makes sense. Regarding "ParameterTool doesn't seem to
>> >> >> > > > support positional arguments :) but we could fix that.": should we
>> >> >> > > > create a separate ticket, or should it also be part of FLINK-2021?
>> >> >> > > >
>> >> >> > > > BR,
>> >> >> > > > Behrouz
>> >> >> > > >
>> >> >> > > >
>> >> >> > > > On Fri, Sep 4, 2015 at 10:55 AM, Maximilian Michels <mxm@apache.org>
>> >> >> > > > wrote:
>> >> >> > > >
>> >> >> > > >> Hi Behrouz,
>> >> >> > > >>
>> >> >> > > >> Thanks for starting the discussion. If I understand your question
>> >> >> > > >> correctly, you are asking if it breaks the training or other
>> >> >> > > >> external material if we convert the Flink examples to make use of
>> >> >> > > >> the ParameterTool?
>> >> >> > > >>
>> >> >> > > >> We could make the changes such that the examples will accept the
>> >> >> > > >> same parameters but use the ParameterTool internally to verify the
>> >> >> > > >> parameters and print usage information. I think most examples
>> >> >> > > >> simply use positional arguments and we could keep it that way. The
>> >> >> > > >> only problem is that the ParameterTool doesn't seem to support
>> >> >> > > >> positional arguments :) but we could fix that.
>> >> >> > > >>
>> >> >> > > >> Cheers,
>> >> >> > > >> Max
>> >> >> > > >>
>> >> >> > > >> On Thu, Sep 3, 2015 at 5:50 PM, Behrouz Derakhshan
>> >> >> > > >> <behrouz.derakhshan@gmail.com> wrote:
>> >> >> > > >> > Hi,
>> >> >> > > >> >
>> >> >> > > >> > I had a look at this ticket, FLINK-2021
>> >> >> > > >> > <https://issues.apache.org/jira/browse/FLINK-2021>. There isn't
>> >> >> > > >> > much to do from a technical standpoint, and it kinda makes sense to
>> >> >> > > >> > use the new "ParameterTool", since it is being used in most of the
>> >> >> > > >> > other parts of the code base.
>> >> >> > > >> > The only question is whether we really want to do it, since I'm
>> >> >> > > >> > guessing some of the training materials, slides and articles are
>> >> >> > > >> > referencing these examples and updating those might be a burden.
>> >> >> > > >> >
>> >> >> > > >> > Let me know what you guys think; either I can start working on it
>> >> >> > > >> > or we can just resolve it for good.
>> >> >> > > >> >
>> >> >> > > >> > Cheers,
>> >> >> > > >> > Behrouz
>> >> >> > > >>
>> >> >> > >
>> >> >> >
>> >> >>
>> >>
>>
