mxnet-dev mailing list archives

From Tianqi Chen <tqc...@cs.washington.edu>
Subject Re: Adding AMD CPU to CI
Date Fri, 30 Nov 2018 17:19:14 GMT
I still think it is overkill to add AMD CPUs to the CI, given the additional
cost it could bring and how little additional information we would get out of
it.

A middle ground is to add AMD CPUs to a nightly build or a final sweep before
release. If we find a case where an AMD CPU really makes a difference, then
we add it to the CI.

Tianqi

On Thu, Nov 29, 2018 at 6:29 PM Hao Jin <hjjn.amzn@gmail.com> wrote:

> For CPUs, the supported instruction sets may also vary between the same
> manufacturer's different product lines of the same generation (Skylake-SP
> versus Skylake).
> For the same instruction set, the two manufacturers should both have a
> working hardware implementation. If either implementation does not work,
> the chip would not even be considered to be functioning properly.
> If some AMD CPUs only support instruction sets up to AVX2, they would
> function in the same way as an Intel CPU that supports instruction sets up
> to AVX2. The performance may vary, but the capability and behavior of the
> two chips would be the same when given the same machine code.
> For AMD GPUs it's a totally different story: AMD GPUs do not share the
> same instruction sets as the NVIDIA ones, so testing on AMD GPUs (if we
> do have support for them) would definitely add value.
> Hao
>
> On Thu, Nov 29, 2018 at 8:37 PM Anirudh Subramanian
> <anirudh2290@gmail.com> wrote:
>
> > Support for instruction set extensions such as AVX2 and AVX-512 can vary
> > between AMD and Intel, and there can also be a time lag between when Intel
> > supports an extension and when AMD supports it.
> > Also, in the future this setup may be useful in case MXNet supports AMD
> > GPUs and AWS also happens to have support for them.
> >
> > Anirudh
> >
> >
> > On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu
> > <marco.g.abreu@googlemail.com.invalid> wrote:
> >
> > > I think it's worth a discussion to do a sanity check. While these
> > > instructions are generally standardized, our experience with ARM showed
> > > that theory and reality sometimes don't match. Thus, it's always good to
> > > check.
> > >
> > > In the coming months we are going to refactor our slave creation
> > > processes. Chance Bair has been working on rewriting the Windows slaves
> > > from scratch (we used images that haven't really been updated for 2
> > > years - we still don't know what was done on them) and they will be
> > > ready soon. In the following months, we will also port our Ubuntu slaves
> > > to the new method (we don't have a timeline yet). Ideally, the
> > > integration of AMD instances will only be a matter of running the same
> > > pipeline on a different instance type. In that case, it should not be a
> > > big deal.
> > >
> > > If there are big differences, that's already a yellow flag for
> > > compatibility, but that's unlikely. In that case, we would have to make
> > > a more thorough analysis of the time required and whether it's worth
> > > the effort. Maybe somebody else could also lend us a hand and help us
> > > with adding AMD support.
> > >
> > > -Marco
> > >
> > > On Fri, Nov 30, 2018, 01:22 Hao Jin <hjjn.amzn@gmail.com> wrote:
> > >
> > > > f16c is also an instruction set supported by both brands' recent CPUs,
> > > > just like x86, AVX, SSE etc., and any difference in behavior (very
> > > > unlikely to happen, or it would be a major defect) would most likely be
> > > > caused by the underlying hardware implementation, so adding AMD
> > > > instances is still not adding much value here.
> > > > Hao
> > > >
> > > > On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <
> > > > kellen.sunderland@gmail.com> wrote:
> > > >
> > > > > Just looked at the mf16c work and wanted to mention Rahul clearly
> > > > > _was_ thinking about AMD users in that PR.
> > > > >
> > > > > On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <
> > > > > kellen.sunderland@gmail.com> wrote:
> > > > >
> > > > > > From my perspective we're developing a few features like mf16c and
> > > > > > MKLDNN integration specifically for Intel CPUs.  It wouldn't hurt to
> > > > > > make sure those changes also run properly on AMD CPUs.
> > > > > >
> > > > > > On Thu, Nov 29, 2018, 3:38 PM Hao Jin <hjjn.amzn@gmail.com> wrote:
> > > > > >
> > > > > > >> I'm a bit confused about why we need extra functionality tests
> > > > > > >> just for AMD CPUs. Don't AMD CPUs support roughly the same
> > > > > > >> instruction sets as the Intel ones? In the very unlikely case that
> > > > > > >> something working on Intel CPUs does not function on AMD CPUs (or
> > > > > > >> vice versa), it would most likely be related to the underlying
> > > > > > >> hardware implementation of the same ISA, for which we definitely
> > > > > > >> do not have a good solution. So I don't think performing extra
> > > > > > >> functional tests on AMD CPUs adds any value.
> > > > > >> Hao
> > > > > >>
> > > > > > >> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu
> > > > > > >> <sethman@amazon.com.invalid> wrote:
> > > > > >>
> > > > > >> > +1
> > > > > >> >
> > > > > >> > ´╗┐On 11/29/18, 2:39 PM, "Alex Zai" <azai91@gmail.com>
wrote:
> > > > > >> >
> > > > > > >> >     What are people's thoughts on having AMD machines tested on
> > > > > > >> >     the CI? AMD machines are now available on AWS.
> > > > > >> >
> > > > > >> >     Best,
> > > > > >> >     Alex
> > > > > >> >
> > > > > >> >
> > > > > >> >
> > > > > >>
> > > > > >
> > > > >
> > > >
> > >
> >
>
