crail-dev mailing list archives

From Patrick Stuedi <pstu...@gmail.com>
Subject Re: [zrlio-users] Getting Crail to work over TCP
Date Wed, 21 Aug 2019 05:42:54 GMT
There is currently a bug in NaRPC that increases the likelihood of hangs
in Crail/TCP as data sizes increase. We have identified the actual
problem in NaRPC but haven't gotten around to fixing it yet. I can look into this.

-Patrick

On Wed, Aug 21, 2019 at 12:35 AM 'Ben Sidhom' via zrlio-users <
zrlio-users@googlegroups.com> wrote:

> I've been experimenting with getting Crail over TCP to work with the
> crail-spark-io <https://github.com/zrlio/crail-spark-io> shuffle
> extensions.
>
> It seems to work fine for small shuffle sizes (up to about 10 gigabytes),
> but anything larger than that seems to hang. I've investigated this and the
> hangs seem to happen for a few reasons, mostly confined to the NaRPC
> layer.
>
> The benchmark numbers here
> <https://crail.incubator.apache.org/blog/2019/03/disaggregation.html> seem
> to imply that this has worked for at least 200 gigabyte shuffles (I'm not
> certain because that second experiment does not explicitly give the test
> parameters). Has anybody had success with Crail over TCP or were pretty
> much all of the tests run over RDMA/NVMe?
>
> --
> -Ben
