flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Maximilian Alber <alber.maximil...@gmail.com>
Subject Re: Base Scala Code for Flink 0.7
Date Fri, 26 Sep 2014 10:59:06 GMT
Hi,

Thanks.

Something different regarding file input.

My former code was:

val Y = DataSource(config.yFile, DelimitedInputFormat[Vector]{ x: String =>
Vector.parseFromString(1, x)})

with:

object Vector {
    def parseFromString(s: String) = {
      Vector.parseFromString(1, s)
    }

    def parseFromString(dimension: Int, s: String) = {
      val tokens = s split "\\s" filter { _.trim().size > 0 }
      val id = tokens(0).toInt
      val values = tokens drop 1 map { _ toFloat }

      assert(values.size == dimension)

      new Vector(id, values)
    }
}

Thus my input is heterogeneous (integer and floats). In the new version the
DataSource class is gone as DelimitedInputFormat is. Is the code just not
added yet or is there another way to do it now?
F.e. I could in imagine to read it first as CSV with Float base type and
afterwards creating the Vector objects with a map function.


Cheers,
Max




On Fri, Sep 26, 2014 at 12:17 PM, Aljoscha Krettek <aljoscha@apache.org>
wrote:

> Hi,
> yes, there are two ways to go. In the past we had PactProgram with the
> getPlan method. This is not necessary anymore in the new version.
>
> With the new version the code is simply put in a main method. Instead
> of creating a Scala Plan you can simply call env.execute() to execute
> the program.
>
> The basic program skeleton is this:
>
> object MyProgram {
>   def main(args: Array[String]) {
>     val env = ExecutionEnvironment.getExecutionEnvironment
>
>     val text = env.readTextFile(...)
>     val mapped = text.map {...}
>
>     // other operations
>
>     mapped.writeAsText() // or writeAsCsv(), or something else
>
>    env.execute("My Program")
>   }
>
> Cheers,
> Aljoscha
>
> On Fri, Sep 26, 2014 at 12:06 PM, Maximilian Alber
> <alber.maximilian@gmail.com> wrote:
> > Hi!
> >
> > Thanks for the quick help!
> > Seems I have to change the code a bit for the migration. Or basically
> > substitute new ScalaPlan with env.createProgramPlan :-)
> >
> > Cheers,
> > Max
> >
> > On Fri, Sep 26, 2014 at 11:06 AM, Aljoscha Krettek <aljoscha@apache.org>
> > wrote:
> >>
> >> Hi,
> >> since the 0.7 is just the latest snapshot release the quickstart
> >> script is not yet updated to create a 0.7 project.
> >>
> >> If you want, you can create a 0.7 quickstart using:
> >> curl
> >>
> https://raw.githubusercontent.com/apache/incubator-flink/master/flink-quickstart/quickstart-scala-SNAPSHOT.sh
> >> | bash
> >>
> >> There are complete Scala API examples here:
> >> http://flink.incubator.apache.org/docs/0.7-incubating/examples.html.
> >> Just click the Scala Tab.
> >>
> >> Let us know if you need more information.
> >>
> >> Cheres,
> >> Aljoscha
> >>
> >> On Fri, Sep 26, 2014 at 10:20 AM, Maximilian Alber
> >> <alber.maximilian@gmail.com> wrote:
> >> > Hi Flinkers,
> >> >
> >> > I tried to migrate to 0.7 for new features aka Broadcast Variables in
> >> > Scala.
> >> > Unfortunately the code structure seemed to have changed:
> >> >
> >> > [INFO] Compiling 2 source files to
> >> >
> >> >
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/target/classes
> >> > at 1411719085848
> >> > [ERROR]
> >> >
> >> >
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:7:
> >> > error: object TextFile is not a member of package
> >> > org.apache.flink.api.scala
> >> > [ERROR] import org.apache.flink.api.scala.TextFile
> >> > [ERROR] ^
> >> > [ERROR]
> >> >
> >> >
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:8:
> >> > error: object ScalaPlan is not a member of package
> >> > org.apache.flink.api.scala
> >> > [ERROR] import org.apache.flink.api.scala.ScalaPlan
> >> > [ERROR] ^
> >> > [ERROR]
> >> >
> >> >
> /media/alber_disk/alber/Arbeit/Uni/Kurse/MA/repo/flink/bump_boost/src/main/scala/bumpboost/BumpBoost.scala:13:
> >> > error: object functions is not a member of package
> >> > org.apache.flink.api.scala
> >> > [ERROR] import org.apache.flink.api.scala.functions.MapFunction
> >> >
> >> >
> >> > Thus I tried to examine the new quickstart code on
> >> >
> >> >
> https://flink.incubator.apache.org/docs/0.7-incubating/scala_api_quickstart.html
> >> > Unfortunately the quickstart script still creates a project for 0.6
> and
> >> > using Maven results in an error too:
> >> >
> >> > [INFO] --- maven-archetype-plugin:2.2:generate (default-cli) @
> >> > standalone-pom ---
> >> > [INFO] Generating project in Interactive mode
> >> > [INFO] Archetype repository missing. Using the one from
> >> > [org.apache.flink:flink-quickstart-scala:0.6.1-incubating] found in
> >> > catalog
> >> > remote
> >> > Downloading:
> >> >
> >> >
> http://repo.maven.apache.org/maven2/org/apache/flink/flink-quickstart-scala/0.7-incubating/flink-quickstart-scala-0.7-incubating.jar
> >> > [INFO]
> >> >
> ------------------------------------------------------------------------
> >> > [INFO] BUILD FAILURE
> >> > [INFO]
> >> >
> ------------------------------------------------------------------------
> >> > [INFO] Total time: 18.857s
> >> > [INFO] Finished at: Fri Sep 26 10:17:49 CEST 2014
> >> > [INFO] Final Memory: 12M/104M
> >> > [INFO]
> >> >
> ------------------------------------------------------------------------
> >> > [ERROR] Failed to execute goal
> >> > org.apache.maven.plugins:maven-archetype-plugin:2.2:generate
> >> > (default-cli)
> >> > on project standalone-pom: The desired archetype does not exist
> >> > (org.apache.flink:flink-quickstart-scala:0.7-incubating) -> [Help 1]
> >> >
> >> >
> >> > Is there some explanation how to move code from 0.6 to 0.7? Or an
> >> > example
> >> > which creates a Plan?
> >> >
> >> > Thanks!
> >> > Cheers,
> >> > Max
> >> >
> >> >
> >
> >
>

Mime
View raw message