beam-dev mailing list archives

From Reuven Lax <re...@google.com>
Subject Re: Graal instead of docker?
Date Sat, 05 May 2018 07:11:36 GMT
A beam cluster with the spark runner would include a spark cluster, plus
what's needed for portability, plus the beam sdk.

On Fri, May 4, 2018, 11:55 PM Romain Manni-Bucau <rmannibucau@gmail.com>
wrote:

>
>
> On May 5, 2018, 08:43, "Reuven Lax" <relax@google.com> wrote:
>
> I don't believe we enforce docker anywhere. In fact if someone wanted to
> run an all-windows beam cluster, they would probably not use docker for
> their runner (docker runs on Windows, but not efficiently).
>
>
>
> Or doesn't run at all sometimes - a colleague hit that yesterday :(.
>
> What is a "beam cluster", as opposed to a Spark or Flink cluster? How would
> it work on windows servers?
>
>
> On Fri, May 4, 2018, 11:19 PM Romain Manni-Bucau <rmannibucau@gmail.com>
> wrote:
>
>>
>>
>> 2018-05-05 2:33 GMT+02:00 Andrew Pilloud <apilloud@google.com>:
>>
>>> What docker really buys is a package format and runtime environment that
>>> is language and operating system agnostic. The docker packaging and
>>> runtime format is the de facto standard for portable applications such as
>>> this, and there is a group trying to turn it into an actual standard.
>>>
>>> I would agree with you that dockerd has become bloated but there are
>>> projects that solve that. There is no longer lock-in to dockerd, there
>>> are package format compatible docker replacements that eliminate the
>>> performance issues and overhead associated with docker. CRI-O (
>>> https://github.com/kubernetes-incubator/cri-o) is a really cool RedHat
>>> project which is a minimalist replacement for docker. I was recently
>>> working at a startup where I migrated our "data mover" appliance from
>>> Docker to CRI-O. Our application was able to get direct access to the
>>> ethernet driver and block devices which enabled a huge performance boost
>>> but we were also able to run containers produced by docker without
>>> modification.
>>>
>>> You mention that docker is "detail of one runner+vendor corrupting all
>>> the project and adding complexity and work to everyone". It sounds like
>>> you have a specific example you'd like to share? Is there a runner that is
>>> unable to move to portability because of docker?
>>>
>>
>> The IBM one for instance, some custom ones like a Hazelcast-based one,
>> etc... More generally any runner developed outside beam itself - even if
>> we take a snapshot today, most of beam's own runners have the same pitfall.
>>
>> Note: I never said docker was a bad technology. Let me try to clarify.
>>
>> The main issue is that you enforce docker usage, which is still trendy. It is
>> like Scala, which was promising to kill Java; check what it does today...
>> It is starting to be tooled, but it also has a big impact on the deployment
>> side, and for a good number of beam users who deploy it outside the cloud it
>> is an issue.
>> Keep in mind beam is embeddable by design, it is not a runner environment,
>> and the docker choice imposes an environment which is inconsistent with
>> beam's design itself; this is where this choice blocks.
>>
>>
>>>
>>> Andrew
>>>
>>> On Fri, May 4, 2018 at 4:32 PM Henning Rohde <herohde@google.com> wrote:
>>>
>>>> Romain,
>>>>
>>>> Docker, unlike selinux, solves a great number of tangible problems for
>>>> us with IMO a relatively small tax. It does not have to be the only way.
>>>> Some of the concerns you bring up along with possibilities were also
>>>> discussed here: https://s.apache.org/beam-fn-api-container-contract. I
>>>> encourage you to take a look.
>>>>
>>>> Thanks,
>>>>  Henning
>>>>
>>>>
>>>> On Fri, May 4, 2018 at 3:18 PM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> On May 4, 2018, 21:31, "Henning Rohde" <herohde@google.com> wrote:
>>>>>
>>>>> I disagree with the characterization of docker and the implications
>>>>> made towards portability. Graal looks like a neat project (and I
>>>>> never thought I would live to see the phrase "Practical Partial Evaluation"
>>>>> ..), but it doesn't address the needs of portability. In addition to
>>>>> Luke's examples, Go and most other languages don't work on it either. Docker
>>>>> containers also address packaging, OS dependencies, conflicting versions
>>>>> and distribution aspects in addition to truly universal language support.
>>>>>
>>>>>
>>>>> This is wrong: docker also has its conflicts and is not universal (it
>>>>> fails easily on windows and mac, as host or not). Cloud vendors put
>>>>> layers on top that limit or corrupt it, and it is an imposed infra
>>>>> constraint and a vendor lock-in that is not welcome in beam IMHO.
>>>>>
>>>>> This is my main concern. All the work done looks like an
>>>>> implementation detail of one runner+vendor corrupting all the project and
>>>>> adding complexity and work to everyone instead of keeping it localised
>>>>> (technically it is possible).
>>>>>
>>>>> Would you accept it if I forced you to use selinux? Using docker is the
>>>>> same kind of constraint.
>>>>>
>>>>>
>>>>> That said, it's entirely fine for some runners to use Jython, Graal,
>>>>> etc to provide a specialized offering similar to the direct runners,
>>>>> but it would be disjoint from portability IMO.
>>>>>
>>>>> On Fri, May 4, 2018 at 10:14 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On May 4, 2018, 17:55, "Lukasz Cwik" <lcwik@google.com> wrote:
>>>>>>
>>>>>> I did take a look at Graal a while back when thinking about how
>>>>>> execution environments could be defined; my concerns were related
>>>>>> to it not supporting all of the features of a language.
>>>>>> For example, it's typical for Python to load and call native libraries
>>>>>> and Graal can only execute C/C++ code that has been compiled to LLVM.
>>>>>> Also, a good number of people interested in using ML libraries will
>>>>>> want access to GPUs to improve performance, which I believe Graal
>>>>>> can't support.
>>>>>>
>>>>>> It can be a very useful way to run simple lambda functions written in
>>>>>> some language directly without needing to use a docker environment,
>>>>>> but you could probably use something even lighter weight than Graal
>>>>>> that is language specific, like Jython.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Right, the jsr223 impl works very well but you can also get a perf
>>>>>> boost using native bindings (like the v8 java binding for js for
>>>>>> instance). It is way more efficient than docker most of the time and
>>>>>> not code intrusive at all in runners, so likely more adoptable and
>>>>>> maintainable. That said, all of it is doable behind jsr223 so maybe
>>>>>> not a big deal in terms of api. We just need to ensure the portability
>>>>>> work stays clean and actually portable and doesn't impact runners as
>>>>>> the PoCs done until today did.
>>>>>>
>>>>>> Works for me.
>>>>>>
>>>>>>
>>>>>> On Thu, May 3, 2018 at 10:05 PM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> Hi guys
>>>>>>>
>>>>>>> For some time there have been efforts to have language-portable
>>>>>>> support in beam, but I can't really find a case where it "works"
>>>>>>> while based on docker, except for some vendor-specific infra.
>>>>>>>
>>>>>>> Current solution:
>>>>>>>
>>>>>>> 1. Is runner intrusive (which is bad for beam and prevents adoption
>>>>>>> by big data vendors)
>>>>>>> 2. Is based on docker (which assumes a runtime environment, is very
>>>>>>> ops/infra intrusive, and is likely too $$ quite often for what it brings)
>>>>>>>
>>>>>>> Has anyone had a look at Graal, which seems to be a way to make the
>>>>>>> feature doable in a lighter and more optimized manner compared to the
>>>>>>> default jsr223 impls?
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>
>
