Could you post a minimal example of you= r code where the problem is reproducible? I assume that there has to be ano= ther problem because env.execute should actually tr= igger the execution.

Cheers,=

Till

=E2=80=8B

On Thu, O= ct 8, 2015 at 8:58 PM, Florian Heyl <f.heyl@gmx.de> wrote:

Hey S= tephan and Pieter,

That was the same what I thought, so I simply = changed the code like this:

original.writ=
eAsCsv(outputPath, "\n", " ", WriteMode.OVERWRITE)

env.execute()

transformPred.writeAsCsv(outputPath2, "\n"=
;, " ", WriteMode.OVERWRITE)

env.execute()

But he still not execute the two= commands.

Thank you for your time.

<= div>

Flo

Am 08.10.2015 um 17:41 schrieb Stephan Ewen <sewen@apache.org>:

<= blockquote type=3D"cite">

Yes, sinks in Flink are lazy and = do not trigger execution automatically. We made this choice to allow multip= le concurrent sinks (spitting the streams and writing to many outputs concu= rrently). That requires explicit execution triggers (env.execute()).

The exceptions are, as mentioned, the "eager" method= s "collect()", "count()" and "print()". They = need to be eager, because the driver program needs for example the "co= unt()" value before it can possibly progress...

Stephan

On Thu, Oct 8, 2015 at 5:22 PM, Pieter Hameete <phame= ete@gmail.com> wrote:

Hi Florian,

I believe that when you call=C2=A0JoinPredictionAndOriginal.= collect the environment will execute your program up until that point. = The Csv writes are after this point, so in order to execute these steps I t= hink you would have to call <env>.execute()=C2=A0after the Csv= writes to trigger the execution (where <env> is the name of the vari= able pointing to your ExecutionEnvironment).

I hope this helps :-)

- Pieter
<= /font>

2015-10-08 14:54 GMT+02:00 Florian Heyl <<= a href=3D"mailto:f.heyl@gmx.de" target=3D"_blank">f.heyl@gmx.de>:
= Hi,
I need some help to figure out why one method of mine in a pipeline= stops the execution on the hdfs.
I am working with the 10.0-SNAP= SHOT and the code is the following (see below). The method stops on the hdf= s by calling the collect method (JoinPredictionAndOriginal.collect) creatin= g a data sink, which is why the program stops before the two output files a= t the ends can be created. What am I missing?
Thank you for your = time.

Best wishes,
Flo

= // method calculates the pr= ediction error
def= CalcPredError(predictions: DataSet[LabeledVector], original: DataSe= t[LabeledVector],
outputPath: String<= /span>, outputPath2: String, outputPat= h3: String): (DataSet[LabeledVector], = Double) =3D{

var iter =3D 0

val transformPred = =3D predictions
.map { tuple =3D>
iter =3D iter + 1
= LabeledVector(iter, DenseVector(BigDecimal(tuple.label).setScale(0, BigDecimal= .RoundingMode.HALF_U= P).toDouble))
}

iter =3D 0

val tranformOrg =3D original
.map {= tuple =3D>
iter =3D iter + 1
LabeledVector(iter, DenseVect= or(tuple.label))
}

val Joi= nPredictionAndOriginal =3D transformPred.join(tranformOrg).where(0).equalTo(0) {
(l, r) =3D> (l.vector.head._2, r.vector.head._2)
}
<= br> val list_JoinPredictionAndOriginal = =3D JoinPredictionAndOriginal.collect

v= al N =3D list_JoinPredictionAndOriginal.length

val residualSum =3D list_JoinPredictionAndOriginal.ma= p {
num =3D> pow((num._1= - num._2), 2)
}.sum

val predictionError =3D sqrt(residualSum / N)

original.writeAsCsv(outputPath, "\n", " ", WriteMode= .OVERWRITE)
tra= nsformPred.writeAsCsv(outputPath2, "\n", " ", WriteMode.OVERWRITE)

= (predictions,predictionError)
}

=C2=A0