mxnet-dev mailing list archives

From Pedro Larroy <pedro.larroy.li...@gmail.com>
Subject Re: Cambricon MLU support for MXNet.
Date Mon, 17 Dec 2018 13:49:55 GMT
Hi Haochong

Welcome to MXNet. It's exciting to have additional hardware platforms
added and supported in the project.

The CI system for MXNet is donated by AWS to the project. We have a
small hardware lab with embedded physical hardware, such as ARM boards
and the NVIDIA Jetson, which we are connecting to the CI system
(it's a WIP).

However, the bulk of the CI system runs in the AWS Cloud using Jenkins
and EC2 GPU and CPU instances. So even though any of the options you
mention are possible and could work, I think the order in which you
listed them is also the order of preference. Connecting a remote
server or cloud instance to the MXNet Jenkins would be the easiest,
since it wouldn't involve hardware shipping and maintenance.

I think once you have the contribution merged and the changes ready to
be tested, we can make a plan on how best to integrate with CI. For
that, the recommendation that Hagay gave (a design proposal in the wiki)
is a good path forward, so that other members of the community and the
engineers contributing to the CI system can contribute.

Pedro.

On Mon, Dec 17, 2018 at 3:33 AM 张昊翀 <zhanghaochong@cambricon.com> wrote:
>
> Dear MXNet community,
>
> We are from Cambricon, a leading supplier of artificial intelligence chips. We have two
> product lines, including IP products (e.g., Cambricon 1A/1H) and chip products (e.g., MLU100
> released in May 2018).
>
> We are now adapting MXNet on Cambricon products. During the follow-up session, we plan
> to open source, and hope to merge these new features into the master branch of MXNet and to
> be a part of MXNet's long-term support. We firmly believe that these MLU features will promote
> the MXNet community development.
> To this end, we are ready to accept the rigorous inspection of MXNet community. In addition,
> we need advice from the community to achieve high quality implementation. On this basis, we
> very much hope to reach a full-scale long-term cooperation with the community.
>
> In order to achieve the above goals, we hope to keep in touch with the community on some
> issues. Looking forward to your valuable feedback.
>
> 1. MLU100 mainly focuses on inference, and we plan to first support the inference part
> of MXNet. The training part of MXNet on MLU will be released in the future. Is that acceptable
> for MXNet community?
>
> 2. Though MLU can support various operators/networks, to guarantee high quality, all
> supported operators submitted to the community should undergo rigorous stress test. Thus,
> at the beginning, we plan to release a small number of supported operators and networks, and
> more of them will be continuously added. Is that acceptable or do we have to support all networks
> in the ModelZoo in the first release?
>
> 3. Currently we plan to support both Python and C++ APIs. More details on supported APIs
> will be provided in a follow-up proposal.
>
> 4. We need to modify the mShadow in order to support tensor memory operations.
>
> 5. In order to enable the community to run and fully test our code, we want to provide
> the community with a complete test environment. At present, we are considering the following
> three ways.
> A) Provide several remote servers for the community and integrate them with the community's
> Jenkins.
> B) Provide a cloud platform to the community.
> C) Donate MLU100 to the community's testing platform. However, we don’t know the specific
> ways of donation, and we hope to get help. We are wondering about how MXNet's test servers
> are managed.
>
> About more technical details, a proposal will be submitted to the community before releasing
> the code.
>
> In addition to the above points, the remaining questions and suggestions are also welcome.
> Thanks!
>
> More about Cambricon:
> Cambricon is the artificial intelligence computing pioneer that engineers and successfully
> commercializes world’s first dedicated machine learning processor. To bring its unique AI
> processors from edge to cloud, enriching and advancing human life, is the firm mission of
> the company. Dr. Tianshi Chen is the founder and CEO of Cambricon, where he brings over 10
> years' experience in the fields of micro-processor architecture and artificial intelligence.
> In 2016, Cambricon released Cambricon 1A processor, the first commercial machine learning
> specific processor in the world. Later, during the 3rd World Internet Conference, Cambricon
> 1A processor was elected as one of “World Leading Internet Scientific and Technological
> Achievements”. In May 2018, Cambricon released MLU100, a machine learning chip which is
> in mass production now. By offering revolutionary technology and products, Cambricon has established
> and maintains active relationships with various companies in the AI industry.
>
>
> Regards,
> Haochong Zhang
> Cambricon MXNet Development Team
>
>
