flink-dev mailing list archives

From Martin Neumann <mneum...@sics.se>
Subject Re: streaming GroupBy + Fold
Date Mon, 05 Oct 2015 19:59:52 GMT
Hej,

Sorry it took so long to respond. I needed to check whether I was actually
allowed to share the code, since it uses internal datasets.

Attached to this email you will find the main class of this job, without
the supporting classes or the actual dataset. If you want to run it, you
need to replace the dataset with something else, but that should be
trivial.
If you just want to see the problem itself, have a look at the attached log
in conjunction with the code. Each ERROR printout in the log corresponds to
an accumulator receiving wrong values.

cheers Martin

On Sat, Oct 3, 2015 at 11:29 AM, Márton Balassi <balassi.marton@gmail.com>
wrote:

> Hey,
>
> Thanks for reporting the problem, Martin. I have not merged the PR Stephan
> is referring to yet. [1] There I am cleaning up some of the internals too.
> Just out of curiosity, could you share the code for the failing test
> please?
>
> [1] https://github.com/apache/flink/pull/1155
>
> On Fri, Oct 2, 2015 at 8:26 PM, Martin Neumann <mneumann@sics.se> wrote:
>
> > One of my colleagues found it today while we were hunting bugs. We
> > were using the latest 0.10 version pulled from Maven this morning.
> > The program we were testing is new code, so I can't tell you whether
> > the behavior has changed or if it was always like this.
> >
> > On Fri, Oct 2, 2015 at 7:46 PM, Stephan Ewen <sewen@apache.org> wrote:
> >
> > > I think these operations were recently moved to the internal state
> > > interface. Did the behavior change then?
> > >
> > > @Marton or Gyula, can you comment? Is it per chance not mapped to the
> > > partitioned state?
> > >
> > > > On Fri, Oct 2, 2015 at 6:37 PM, Martin Neumann <mneumann@sics.se> wrote:
> > >
> > > > Hej,
> > > >
> > > > > In one of my programs I run a Fold on a GroupedDataStream. The aim
> > > > > is to aggregate the values in each group.
> > > > > It seems the aggregator in the Fold function is shared at the
> > > > > operator level, so all groups that end up on the same operator get
> > > > > mashed together.
> > > > >
> > > > > Is this the intended behavior? If so, what do I have to do to
> > > > > separate them?
> > > >
> > > >
> > > > cheers Martin
> > > >
> > >
> >
>
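
To make the reported symptom concrete, here is a minimal plain-Java sketch of
the two behaviors discussed in the thread. It does not use Flink at all and
is not the code of the failing job: the class KeyedFoldSketch, the Event
type, and the methods foldPerKey and foldShared are hypothetical names
invented for illustration. foldPerKey shows what one would expect from a
fold on a grouped stream (one accumulator per key), while foldShared mimics
a single accumulator shared by every group that lands on the same operator
instance.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class KeyedFoldSketch {

    // Hypothetical record type: a group key plus a value to aggregate.
    static final class Event {
        final String key;
        final int value;

        Event(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    // Expected semantics: every key gets its own accumulator,
    // starting from the same initial value.
    static Map<String, Integer> foldPerKey(
            List<Event> events,
            int initial,
            BiFunction<Integer, Integer, Integer> folder) {
        Map<String, Integer> accumulators = new LinkedHashMap<>();
        for (Event e : events) {
            int current = accumulators.getOrDefault(e.key, initial);
            accumulators.put(e.key, folder.apply(current, e.value));
        }
        return accumulators;
    }

    // Reported symptom (assumed model): one accumulator is shared by
    // all keys handled by the same operator instance.
    static int foldShared(
            List<Event> events,
            int initial,
            BiFunction<Integer, Integer, Integer> folder) {
        int accumulator = initial;
        for (Event e : events) {
            accumulator = folder.apply(accumulator, e.value);
        }
        return accumulator;
    }

    public static void main(String[] args) {
        List<Event> events = Arrays.asList(
                new Event("a", 1),
                new Event("b", 10),
                new Event("a", 2),
                new Event("b", 20));

        // Separate sums per group.
        System.out.println(foldPerKey(events, 0, Integer::sum));
        // Both groups mashed into one sum.
        System.out.println(foldShared(events, 0, Integer::sum));
    }
}
```

With the sample input, the per-key variant keeps the sums for "a" and "b"
apart, whereas the shared variant yields a single value that mixes both
groups, which is the kind of wrong accumulator value the ERROR lines in the
log point at.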
