flink-dev mailing list archives

From Andra Lungu <lungu.an...@gmail.com>
Subject Re: The correct location for zipWithIndex and zipWithUniqueId
Date Fri, 12 Jun 2015 15:11:58 GMT
Thanks for the replies!

I will add the two methods in a separate DataSetUtils class. Where would
you put the documentation for this? I think users should be able to find
it easily, so IMO it shouldn't go in a separate zip page but rather in
the programming guide. Alternatively, there could be a link in the
DataSet Transformations page pointing to it.

What do you think?

On Wed, Jun 10, 2015 at 12:33 PM, Till Rohrmann <till.rohrmann@gmail.com>
wrote:

> I agree with Theo. I think it’s a nice feature to have as part of the
> standard API because only few users will be aware of something like
> DataSetUtils. However, as a first version we can make it part of
> DataSetUtils.
>
> Cheers,
> Till
>
>
> On Wed, Jun 10, 2015 at 11:52 AM Theodore Vasiloudis <
> theodoros.vasiloudis@gmail.com> wrote:
>
> > +1 for Fabian, but I would very much like to see this as part of the
> > API in the future.
> >
> > This function would be very useful for FlinkML as well, as we noted in a
> > recent discussion on the mailing list regarding time series datasets.
> >
> > On Wed, Jun 10, 2015 at 10:56 AM, Fabian Hueske <fhueske@gmail.com>
> > wrote:
> >
> > > As Andra said, I would not add it to the API at this point.
> > > However, I don't think it should go into a separate Maven module
> > > (flink-contrib) that needs to be added as a dependency, but rather
> > > into some DataSetUtils class in flink-java.
> > >
> > > We can easily add it to the API later, if necessary. We should, however,
> > > extend the documentation so that users are aware of DataSetUtils.
> > >
> > > Cheers, Fabian
> > >
> > > 2015-06-10 10:45 GMT+02:00 Andra Lungu <andra@apache.org>:
> > >
> > > > Hey everyone,
> > > >
> > > > We needed to assign unique labels as vertex values in Gelly at some
> > > > point. We got a nice suggestion on how to do that in parallel
> > > > (implemented in
> > > > https://github.com/apache/flink/pull/801#issuecomment-110654447).
> > > >
> > > > Now the question is: where should these two functions go? Should they
> > > > be part of the API? Something like:
> > > >
> > > > class DataSet<T> {
> > > >   public DataSet<Tuple2<Long, T>> zipWithID() {}
> > > > }
> > > >
> > > > or should they go in flink-contrib? Fabian, Robert and Till seem to
> > > > be in favour of the second option.
> > > >
> > > > Thanks!
> > > >
> > > > Andra
> > > >
> > >
> >
>
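
For readers unfamiliar with the two functions named in the subject line, here is a minimal stand-alone sketch of one common way to label elements in parallel. It is not taken from the pull request and does not use Flink at all: the class name ZipSketch, the use of nested lists to stand in for parallel partitions, and the exact id formula are all assumptions made for illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: each inner list plays the role of one parallel partition.
final class ZipSketch {

    // Consecutive indices 0..n-1: count every partition first, then offset local positions.
    static <T> List<List<Map.Entry<Long, T>>> zipWithIndex(List<List<T>> partitions) {
        // Pass 1: a partition's start offset is the total size of all earlier partitions.
        long[] offsets = new long[partitions.size()];
        long running = 0;
        for (int p = 0; p < partitions.size(); p++) {
            offsets[p] = running;
            running += partitions.get(p).size();
        }
        // Pass 2: each partition labels its own elements, independently of the others.
        List<List<Map.Entry<Long, T>>> result = new ArrayList<>();
        for (int p = 0; p < partitions.size(); p++) {
            List<Map.Entry<Long, T>> out = new ArrayList<>();
            long next = offsets[p];
            for (T value : partitions.get(p)) {
                out.add(new SimpleEntry<Long, T>(next++, value));
            }
            result.add(out);
        }
        return result;
    }

    // Unique but not consecutive ids, with no counting pass:
    // id = localPosition * parallelism + partitionId (an assumed formula for this sketch).
    static <T> List<List<Map.Entry<Long, T>>> zipWithUniqueId(List<List<T>> partitions) {
        int parallelism = partitions.size();
        List<List<Map.Entry<Long, T>>> result = new ArrayList<>();
        for (int p = 0; p < parallelism; p++) {
            List<Map.Entry<Long, T>> out = new ArrayList<>();
            long local = 0;
            for (T value : partitions.get(p)) {
                out.add(new SimpleEntry<Long, T>(local * parallelism + p, value));
                local++;
            }
            result.add(out);
        }
        return result;
    }
}
```

The trade-off the sketch is meant to show: consecutive indices need an extra counting pass over the data before any label can be assigned, while unique ids can be produced in a single pass at the cost of gaps in the id range.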
