beam-dev mailing list archives

From Reuven Lax <>
Subject Re: Graal instead of docker?
Date Sat, 05 May 2018 06:43:13 GMT
I don't believe we enforce docker anywhere. In fact if someone wanted to
run an all-windows beam cluster, they would probably not use docker for
their runner (docker runs on Windows, but not efficiently).

On Fri, May 4, 2018, 11:19 PM Romain Manni-Bucau <>

> 2018-05-05 2:33 GMT+02:00 Andrew Pilloud <>:
>> What docker really buys is a package format and runtime environment that
>> is language and operating system agnostic. The docker packaging and
>> runtime format is the de facto standard for portable applications such as
>> this, and there is a group trying to turn it into an actual standard.
>> I would agree with you that dockerd has become bloated but there are
>> projects that solve that. There is no longer lock-in to dockerd; there
>> are package format compatible docker replacements that eliminate the
>> performance issues and overhead associated with docker. CRI-O is a
>> really cool Red Hat project which is a minimalist replacement for
>> docker. I was recently
>> working at a startup where I migrated our "data mover" appliance from
>> Docker to CRI-O. Our application was able to get direct access to the
>> ethernet driver and block devices which enabled a huge performance boost
>> but we were also able to run containers produced by docker without
>> modification.
>> You mention that docker is "detail of one runner+vendor corrupting all
>> the project and adding complexity and work to everyone". It sounds like
>> you have a specific example you'd like to share? Is there a runner that is
>> unable to move to portability because of docker?
> The IBM one for instance, some custom ones like a Hazelcast-based one, etc.
> More generally, any runner developed outside beam itself. Even if we take
> a snapshot today, most of beam's own runners have the same pitfall.
> Note: I never said docker was a bad technology. Let me try to clarify.
> The main issue is that you enforce docker usage, which is still trendy. It
> is like Scala, which was promising to kill Java; check what it does today...
> It is starting to get tooling, but it also has a big impact on the
> deployment side, and for a good number of beam users who deploy outside the
> cloud that is an issue.
> Keep in mind beam is embeddable by design; it is not a runner environment.
> With the docker choice it imposes an environment which is inconsistent
> with beam's design itself, and this is where this choice blocks.
>> Andrew
>> On Fri, May 4, 2018 at 4:32 PM Henning Rohde <> wrote:
>>> Romain,
>>> Docker, unlike selinux, solves a great number of tangible problems for
>>> us with IMO a relatively small tax. It does not have to be the only way.
>>> Some of the concerns you bring up along with possibilities were also
>>> discussed here: I
>>> encourage you to take a look.
>>> Thanks,
>>>  Henning
>>> On Fri, May 4, 2018 at 3:18 PM Romain Manni-Bucau <>
>>> wrote:
>>>> On 4 May 2018 21:31, "Henning Rohde" <> wrote:
>>>> I disagree with the characterization of docker and the implications
>>>> made towards portability. Graal looks like a neat project (and I never
>>>> thought I would live to see the phrase "Practical Partial Evaluation" ..),
>>>> but it doesn't address the needs of portability. In addition to Luke's
>>>> examples, Go and most other languages don't work on it either. Docker
>>>> containers also address packaging, OS dependencies, conflicting versions
>>>> and distribution aspects in addition to truly universal language support.
>>>> This is wrong: docker also has its conflicts and is not universal (it
>>>> fails easily on Windows and Mac, as host or not; cloud vendors add
>>>> layers that limit or corrupt it; and it is an imposed infra constraint
>>>> and a vendor lock-in that is not welcome in beam IMHO).
>>>> This is my main concern. All the work done looks like an implementation
>>>> detail of one runner+vendor corrupting the whole project and adding
>>>> complexity and work for everyone instead of keeping it localised
>>>> (technically it is possible).
>>>> Would you accept it if I forced you to use selinux? Using docker is the
>>>> same kind of constraint.
>>>> That said, it's entirely fine for some runners to use Jython, Graal,
>>>> etc to provide a specialized offering similar to the direct runners, but
>>>> would be disjoint from portability IMO.
>>>> On Fri, May 4, 2018 at 10:14 AM Romain Manni-Bucau <>
>>>> wrote:
>>>>> On 4 May 2018 17:55, "Lukasz Cwik" <> wrote:
>>>>> I did take a look at Graal a while back when thinking about how
>>>>> execution environments could be defined, my concerns were related to
>>>>> it not supporting all of the features of a language.
>>>>> For example, it's typical for Python to load and call native libraries
>>>>> and Graal can only execute C/C++ code that has been compiled to LLVM.
>>>>> Also, a good number of people interested in using ML libraries will
>>>>> want access to GPUs to improve performance, which I don't believe
>>>>> Graal supports.
>>>>> It can be a very useful way to run simple lambda functions written in
>>>>> some language directly without needing to use a docker environment, but
>>>>> we could probably use something even lighter weight than Graal that is
>>>>> language specific, like Jython.
>>>>> Right, the jsr223 impl works very well but you can also get a perf
>>>>> boost using native bindings (like the v8 java binding for js for
>>>>> instance). It is more efficient than docker most of the time and not
>>>>> code intrusive at all in runners, so likely easier to adopt and
>>>>> maintain. That said, it is all doable behind jsr223, so maybe not a big
>>>>> deal in terms of api. We need to ensure the portability work stays clean
>>>>> and actually portable and does not impact runners as the PoCs done until
>>>>> today did.
>>>>> Works for me.
>>>>> On Thu, May 3, 2018 at 10:05 PM Romain Manni-Bucau <>
>>>>> wrote:
>>>>>> Hi guys
>>>>>> For some time there have been efforts to have language-portable
>>>>>> support in beam, but I can't really find a case where it "works" being
>>>>>> based on docker, except for some vendor-specific infra.
>>>>>> Current solution:
>>>>>> 1. Is runner intrusive (which is bad for beam and prevents adoption
>>>>>> by big data vendors)
>>>>>> 2. Is based on docker (which assumes a runtime environment and is very
>>>>>> ops/infra intrusive and likely too $$ quite often for what it brings)
>>>>>> Has anyone had a look at graal, which seems a way to make the feature
>>>>>> doable in a lighter manner and optimized compared to default jsr223
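
As a rough illustration of the jsr223 route mentioned in the thread, the
sketch below evaluates a snippet through the standard javax.script API. The
class name Jsr223Sketch, the helper evalOrFallback, and the engine name
"javascript" are assumptions made for illustration and are not taken from
Beam. Whether a JavaScript engine is present depends on the JDK (recent JDKs
ship none by default), so the helper returns a fallback value when no engine
is found.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

// Hypothetical helper class, for illustration only (not part of Beam).
class Jsr223Sketch {

    // Looks up a JSR-223 engine by name and evaluates the script in-process.
    // Returns the fallback if no such engine is registered or the script fails.
    public static Object evalOrFallback(String engineName, String script, Object fallback) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(engineName);
        if (engine == null) {
            return fallback; // no engine with that name on this JVM
        }
        try {
            return engine.eval(script);
        } catch (ScriptException e) {
            return fallback; // script did not evaluate cleanly
        }
    }

    public static void main(String[] args) {
        // "javascript" is an assumed engine name; it may be absent on this JVM.
        System.out.println(evalOrFallback("javascript", "1 + 1", "no engine available"));
    }
}
```

The point of the sketch is only that the user code runs inside the JVM of
the caller, with no container involved; it says nothing about native
libraries or GPU access, which the thread raises as open concerns.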
