mxnet-dev mailing list archives

From "Kumar, Vikas" <viku...@amazon.com.INVALID>
Subject Re: Adding AMD CPU to CI
Date Fri, 30 Nov 2018 17:26:40 GMT
I don't think there is any downside to this proposal. I think basic sanity CI testing on
AMD processors will give an extra boost to our tests. This adds to developer productivity, and
developers have one less thing to worry about. Developers have spent time in the past manually
testing on AMD processors, MKLDNN being the most recent instance. It's good to have those
tests in the CI pipeline.
All I see is benefit. If the dollar cost is not too high for basic sanity testing, we should do
this unless some strong downside is called out.

+1
 

On 11/29/18, 5:37 PM, "Anirudh Subramanian" <anirudh2290@gmail.com> wrote:

    Support for instruction set extensions such as AVX2 and AVX512 can vary
    between AMD and Intel, and there can also be a time lag between when Intel
    supports an extension and when AMD supports it.
    Also, in the future this setup may be useful in case MXNet supports AMD
    GPUs and AWS also happens to offer them.
    
    Anirudh
    
    
    On Thu, Nov 29, 2018 at 4:29 PM Marco de Abreu
    <marco.g.abreu@googlemail.com.invalid> wrote:
    
    > I think it's worth a discussion to do a sanity check. While these
    > instructions are generally standardized, our experience with ARM showed
    > that theory and reality sometimes don't match. Thus, it's always good to
    > check.
    >
    > In the next months we are going to refactor our slave creation processes.
    > Chance Bair has been working on rewriting Windows slaves from scratch (we
    > used images that haven't really been updated for 2 years - we still don't
    > know what was done on them) and they're ready soon. In the following
    > months, we will also port our Ubuntu slaves to the new method (don't have a
    > timeline yet). Ideally, the integration of AMD instances will only be a
    > matter of running the same pipeline on a different instance type. In that
    > case, it should not be a big deal.
    >
    > If there are big differences, that's already a yellow flag for
    > compatibility, but that's unlikely. In that case, we would have to do
    > a more thorough time analysis and decide whether it's worth the effort.
    > Maybe somebody else could also lend us a hand and help us with adding AMD
    > support.
    >
    > -Marco
    >
    > On Fri, Nov 30, 2018, 01:22 Hao Jin <hjjn.amzn@gmail.com> wrote:
    >
    > > f16c is also an instruction set supported by both brands' recent CPUs,
    > > just like x86, AVX, SSE etc., and any difference in behavior (very
    > > unlikely to happen, or it would be a major defect) would most likely be
    > > caused by the underlying hardware implementation, so still, adding AMD
    > > instances is not adding much value here.
    > > Hao
    > >
    > > On Thu, Nov 29, 2018 at 7:03 PM kellen sunderland <
    > > kellen.sunderland@gmail.com> wrote:
    > >
    > > > Just looked at the mf16c work and wanted to mention Rahul clearly _was_
    > > > thinking about AMD users in that PR.
    > > >
    > > > On Thu, Nov 29, 2018 at 3:46 PM kellen sunderland <
    > > > kellen.sunderland@gmail.com> wrote:
    > > >
    > > > > From my perspective we're developing a few features like mf16c and
    > > > > MKLDNN integration specifically for Intel CPUs.  It wouldn't hurt to
    > > > > make sure those changes also run properly on AMD CPUs.
    > > > >
    > > > > On Thu, Nov 29, 2018, 3:38 PM Hao Jin <hjjn.amzn@gmail.com> wrote:
    > > > >
    > > > >> I'm a bit confused about why we need extra functionality tests just
    > > > >> for AMD CPUs. Don't AMD CPUs support roughly the same instruction
    > > > >> sets as the Intel ones? In the very unlikely case that something
    > > > >> working on Intel CPUs does not function on AMD CPUs (or vice versa),
    > > > >> it would most likely be related to the underlying hardware
    > > > >> implementation of the same ISA, for which we definitely do not have
    > > > >> a good solution. So I don't think performing extra tests on the
    > > > >> functional aspects of the system on AMD CPUs adds any value.
    > > > >> Hao
    > > > >>
    > > > >> On Thu, Nov 29, 2018 at 5:50 PM Seth, Manu <sethman@amazon.com.invalid>
    > > > >> wrote:
    > > > >>
    > > > >> > +1
    > > > >> >
    > > > >> > On 11/29/18, 2:39 PM, "Alex Zai" <azai91@gmail.com> wrote:
    > > > >> >
    > > > >> >     What are people's thoughts on having AMD machines tested on
    > > > >> >     the CI? AMD machines are now available on AWS.
    > > > >> >
    > > > >> >     Best,
    > > > >> >     Alex
    > > > >> >
    > > > >> >
    > > > >> >
    > > > >>
    > > > >
    > > >
    > >
    >
    
