flink-dev mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Scala API rewrite almost complete
Date Mon, 08 Sep 2014 11:51:14 GMT
Instead of Strings, Object[][] would work as well. That is a generic
representation of a Tuple.

Alternatively, they could be stored as Java or Scala Tuples, with a generic
utility method to convert between the two.
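
A minimal sketch of the first option, under stated assumptions: the example data lives in one static Object[][] field, and a generic helper converts each row into a typed pair. The names ExampleData, Pair, EDGES and toPairs are hypothetical, and the edge values are made up for illustration. Pair only stands in for a real tuple type such as org.apache.flink.api.java.tuple.Tuple2 or a Scala tuple so that the sketch runs without Flink on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

public class ExampleData {

    // Hypothetical stand-in for a tuple type; not part of any Flink API.
    public static final class Pair<A, B> {
        public final A first;
        public final B second;

        public Pair(A first, B second) {
            this.first = first;
            this.second = second;
        }
    }

    // API-neutral representation: one Object[] per tuple.
    // The values are illustrative, not the real example data.
    public static final Object[][] EDGES = {
        {1L, 2L},
        {2L, 3L},
        {3L, 4L}
    };

    // Generic conversion from rows to typed pairs. The casts are unchecked,
    // so the caller must know the element types stored in each row.
    @SuppressWarnings("unchecked")
    public static <A, B> List<Pair<A, B>> toPairs(Object[][] rows) {
        List<Pair<A, B>> result = new ArrayList<>(rows.length);
        for (Object[] row : rows) {
            if (row.length != 2) {
                throw new IllegalArgumentException("expected 2 fields, got " + row.length);
            }
            result.add(new Pair<>((A) row[0], (B) row[1]));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Pair<Long, Long>> edges = toPairs(EDGES);
        for (Pair<Long, Long> edge : edges) {
            System.out.println(edge.first + " -> " + edge.second);
        }
    }
}
```

The same field could be read from both the Java and the Scala examples, with each side doing its own conversion, so the data would exist in only one place.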

On Mon, Sep 8, 2014 at 10:55 AM, Fabian Hueske <fhueske@apache.org> wrote:

> Yeah, I ran into the same problem...
>
> +1 for using Strings and parsing them, but using the CSVFormat won't work
> because it is based on a FileInputFormat.
> So we would need to parse the Strings manually...
>
> 2014-09-08 10:35 GMT+02:00 Aljoscha Krettek <aljoscha@apache.org>:
>
> > Hi,
> > on second thought, maybe we should just change all the example input
> > data to strings and use CSV input formats in all the examples. What do
> > you think?
> >
> > Cheers,
> > Aljoscha
> >
> > On Mon, Sep 8, 2014 at 7:46 AM, Aljoscha Krettek <aljoscha@apache.org>
> > wrote:
> > > Hi,
> > > yes it's unfortunate that the data types are incompatible. I'm afraid
> > > you have to do what you proposed: move the data to a static field and
> > > convert it in the getDefaultEdgeDataSet() method in Scala. It's not
> > > nice, but copying would duplicate the data and make it easier for it
> > > to go out of sync in the Java and Scala versions.
> > >
> > > What do the others think? This will probably occur in all the examples.
> > >
> > > Cheers,
> > > Aljoscha
> > >
> > > On Sun, Sep 7, 2014 at 10:04 PM, Vasiliki Kalavri
> > > <vasilikikalavri@gmail.com> wrote:
> > >> Hey,
> > >>
> > >> I have ported the Connected Components example, but I am not sure how to
> > >> reuse the example input data from java-examples.
> > >> In the ConnectedComponentsData class, the vertices and edges data are
> > >> produced by the methods getDefaultVertexDataSet()
> > >> and getDefaultEdgeDataSet(), which take
> > >> an org.apache.flink.api.java.ExecutionEnvironment as parameter.
> > >>
> > >> One way is to provide public static fields (like in the WordCountData
> > >> class), but this introduces a conversion
> > >> from org.apache.flink.api.java.tuple.Tuple2 to Scala tuple and from
> > >> java.lang.Long to scala.Long and I guess this is an unnecessary complexity
> > >> for an example (?).
> > >> Another way is, of course, to copy the example data in the Scala example.
> > >>
> > >> Am I missing something here?
> > >>
> > >> Thanks!
> > >>
> > >> Cheers,
> > >> V.
> > >>
> > >>
> > >> On 5 September 2014 15:52, Aljoscha Krettek <aljoscha@apache.org> wrote:
> > >>
> > >>> Alright, I updated my repo:
> > >>> https://github.com/aljoscha/incubator-flink/commits/scala-rework
> > >>>
> > >>> This now has a working WordCount example. It's pretty much a copy of
> > >>> the Java example with some fixups for the syntax and lambda functions.
> > >>> You'll also notice that I added the java-examples as a dependency for
> > >>> the scala-examples. I did this to reuse the example input data.
> > >>>
> > >>> Once you have ported a program, you can open a pull request against my repo
> > >>> and I will collect the examples.
> > >>>
> > >>> Happy coding. :D
> > >>>
> > >>> On Fri, Sep 5, 2014 at 12:19 PM, Hermann Gábor <reckoner42@gmail.com> wrote:
> > >>> > +1
> > >>> >
> > >>> > ComputeEdgeDegrees for me!
> > >>> >
> > >>> >
> > >>> > On Fri, Sep 5, 2014 at 11:44 AM, Márton Balassi <balassi.marton@gmail.com> wrote:
> > >>> >
> > >>> >> +1
> > >>> >>
> > >>> >> BatchGradientDescent for me :)
> > >>> >>
> > >>> >>
> > >>> >> On Fri, Sep 5, 2014 at 11:15 AM, Kostas Tzoumas <ktzoumas@apache.org> wrote:
> > >>> >>
> > >>> >> > +1
> > >>> >> >
> > >>> >> > I go for WebLogAnalysis.
> > >>> >> >
> > >>> >> > My experience with Scala consists of going through a tutorial so this will
> > >>> >> > be a good stress test both for me and the new API :-)
> > >>> >> >
> > >>> >> >
> > >>> >> > On Thu, Sep 4, 2014 at 9:09 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com> wrote:
> > >>> >> >
> > >>> >> > > +1 for having other people implement the examples!
> > >>> >> > > Connected Components and Kmeans for me :)
> > >>> >> > >
> > >>> >> > > -V.
> > >>> >> > >
> > >>> >> > >
> > >>> >> > > On 4 September 2014 21:03, Fabian Hueske <fhueske@apache.org> wrote:
> > >>> >> > >
> > >>> >> > > > I go for TriangleEnumeration and PageRank.
> > >>> >> > > >
> > >>> >> > > > Let's also do the examples similar to the Java examples:
> > >>> >> > > > - running out-of-the-box without parameters
> > >>> >> > > > - parameters for external data
> > >>> >> > > > - follow a similar code structure
> > >>> >> > > >
> > >>> >> > > >
> > >>> >> > > >
> > >>> >> > > > 2014-09-04 20:56 GMT+02:00 Aljoscha Krettek <aljoscha@apache.org>:
> > >>> >> > > >
> > >>> >> > > > > Will do, then people can reserve their favourite examples here.
> > >>> >> > > > >
> > >>> >> > > > > On Thu, Sep 4, 2014 at 8:55 PM, Fabian Hueske <fhueske@apache.org> wrote:
> > >>> >> > > > > > Hi,
> > >>> >> > > > > >
> > >>> >> > > > > > I think having examples implemented by different people proved to be
> > >>> >> > > > > > valuable in the past.
> > >>> >> > > > > > I'd help with two or three examples.
> > >>> >> > > > > >
> > >>> >> > > > > > It might be helpful if you'd port a simple first one such as WordCount.
> > >>> >> > > > > >
> > >>> >> > > > > > Fabian
> > >>> >> > > > > >
> > >>> >> > > > > >
> > >>> >> > > > > > 2014-09-04 18:47 GMT+02:00 Aljoscha Krettek <aljoscha@apache.org>:
> > >>> >> > > > > >
> > >>> >> > > > > >> Hi,
> > >>> >> > > > > >> I have a working rewrite of the Scala API here:
> > >>> >> > > > > >> https://github.com/aljoscha/incubator-flink/commits/scala-rework
> > >>> >> > > > > >>
> > >>> >> > > > > >> I'm hoping that I'll only have to write the tests and port the
> > >>> >> > > > > >> examples. Do you think it makes sense to let other people port the
> > >>> >> > > > > >> examples, so that someone else uses it and maybe notices some quirks
> > >>> >> > > > > >> in the API?
> > >>> >> > > > > >>
> > >>> >> > > > > >> Cheers,
> > >>> >> > > > > >> Aljoscha
> > >>> >> > > > > >>
> > >>> >> > > > >
> > >>> >> > > >
> > >>> >> > >
> > >>> >> >
> > >>> >>
> > >>>
> >
>
