beam-dev mailing list archives

From Thomas Weise <...@apache.org>
Subject Re: Portable wordcount on Flink runner broken
Date Mon, 19 Nov 2018 03:53:26 GMT
With the latest master the problem seems fixed. Unfortunately, the fix was
at first masked by build and Docker issues. After getting nowhere (the
container build "succeeded" when in fact it did not), I changed several
things at once:

* Update to latest docker
* Increase docker disk space after seeing a spurious, non-reproducible
message in one of the build attempts
* Full clean and manually remove Go build residuals from the workspace

After that I could see the Go and container builds execute differently
(longer build time), and the result certainly looks better.

HTH,
Thomas

On Sun, Nov 18, 2018 at 2:11 PM Ruoyun Huang <ruoyun@google.com> wrote:

> I was chasing the same issue (I was using the reference runner job server,
> but got the same error message). I have some clues but no conclusion yet.
>
> When I retain the container instance, the error message says "bad MD5"
> (see the other thread [1] I started on dev last week). My hypothesis, based
> on the symptoms, is that the underlying container expects an MD5 to
> validate staged files, but the job request from the Python SDK does not
> send a file hash. I hope someone can confirm whether that is the case (I am
> still trying to understand why Dataflow does not have this issue) and, if
> so, suggest the best way to fix it.
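
To make the hypothesis above concrete, here is a minimal sketch of what an MD5 check on a staged file could look like. It is an illustration only: the helper names `md5_hex` and `validate_staged_file` are hypothetical and are not Beam's artifact staging API, and the error text merely echoes the "bad MD5" message quoted above.

```python
import hashlib


def md5_hex(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_staged_file(path, expected_md5):
    """Hypothetical receiver-side check: fail if the hash is missing or wrong."""
    if not expected_md5:
        # Assumed failure mode: the sender never supplied a hash.
        raise ValueError("bad MD5: no hash supplied for %s" % path)
    actual = md5_hex(path)
    if actual != expected_md5.lower():
        raise ValueError(
            "bad MD5 for %s: expected %s, got %s" % (path, expected_md5, actual)
        )
    return actual
```

If the receiving side ran a check like this and the sender omitted the hash, every staged file would be rejected regardless of its contents, which would match the symptom described.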
>
>
> [1]
> https://lists.apache.org/thread.html/b26560087ff88f142e26d66c8a5a9283558c8e55b5edd705b5e53c9c@%3Cdev.beam.apache.org%3E
>
> On Fri, Nov 16, 2018 at 7:06 PM Thomas Weise <thw@apache.org> wrote:
>
>> For the last few days, the steps under
>> https://beam.apache.org/roadmap/portability/#python-on-flink have been
>> broken.
>>
>> The Gradle task hangs because the job server isn't able to launch the
>> Docker container.
>>
>> ./gradlew :beam-sdks-python:portableWordCount -PjobEndpoint=localhost:8099
>>
>> [CHAIN MapPartition (MapPartition at
>> 36write/Write/WriteImpl/DoOnce/Impulse.None/beam:env:docker:v1:0) ->
>> FlatMap (FlatMap at
>> 36write/Write/WriteImpl/DoOnce/Impulse.None/beam:env:docker:v1:0/out.0)
>> (8/8)] INFO
>> org.apache.beam.runners.fnexecution.environment.DockerEnvironmentFactory -
>> Still waiting for startup of environment
>> tweise-docker-apache.bintray.io/beam/python:latest for worker id 1
>>
>> Unfortunately this isn't covered by tests yet. Is anyone aware of what
>> change may have caused this, or looking into resolving it?
>>
>> Thanks,
>> Thomas
>>
>>
>
> --
> ================
> Ruoyun  Huang
>
>
