mxnet-dev mailing list archives

From "Haochong Zhang" <zhanghaoch...@cambricon.com>
Subject Re: Cambricon MLU support for MXNet.
Date Tue, 18 Dec 2018 13:45:27 GMT
Thank you very much for your valuable feedback!

We will submit the design proposal ASAP. At the same time, we will prepare the appropriate
server or cloud environment.

The Cambricon libraries related to MXNet are the Cambricon Neuware Machine Learning Library (CNML)
and the Cambricon Neuware Runtime Library (CNRT). The libraries' documentation will be available
soon.

We look forward to continuing to participate and contribute in the future.


------------------------------------------------------------------
From: Skalicky, Sam <sskalic@amazon.com>
Sent: Tuesday, December 18, 2018 06:03
To: dev@mxnet.incubator.apache.org <dev@mxnet.incubator.apache.org>
Cc: 张昊翀 <zhanghaochong@cambricon.com>
Subject: Re: Cambricon MLU support for MXNet.

Hi Haochong,

I am in the process of putting together a design proposal for an accelerator interface for
MXNet that would allow hardware vendors to integrate their runtime with MXNet. I would like
to suggest setting up a time to get together so that we can hear more about your needs to
interface/control your accelerator, and I can share some thoughts on a generic accelerator
API that I will be proposing. I’d be happy to help you prepare a design proposal as well.

I’ll connect with you separately to set up a time to chat.

Sam


> On Dec 17, 2018, at 5:49 AM, Pedro Larroy <pedro.larroy.lists@gmail.com> wrote:
> 
> Hi Haochong
> 
> Welcome to MXNet. It's exciting to have additional hardware platforms
> added and supported in the project.
> 
> The CI system for MXNet is donated by AWS to the project. We have a
> small hardware lab with embedded physical hardware like ARM boards
> including NVIDIA Jetson, which we are connecting to the CI system.
> (It's a WIP).
> 
> However, the bulk of the CI system runs in the AWS Cloud using Jenkins
> and EC2 GPU and CPU instances. So even though any of the options you
> mention are possible and could work, I think the order in which you
> mentioned them is also the order of preference. Connecting a remote
> server or cloud instance to the MXNet Jenkins would be the easiest,
> since it wouldn't involve hardware shipping and maintenance.
> 
> I think once you have the contribution merged and the changes ready to
> be tested we can make a plan on how to best integrate with CI. For
> that, the recommendation that Hagay gave (Design proposal in the Wiki)
> is a good path forward, so other members of the community and the
> engineers contributing to the CI system can contribute.
> 
> Pedro.
> 
> On Mon, Dec 17, 2018 at 3:33 AM 张昊翀 <zhanghaochong@cambricon.com> wrote:
>> 
>> Dear MXNet community,
>> 
>> We are from Cambricon, a leading supplier of artificial intelligence chips. We have
>> two product lines: IP products (e.g., Cambricon 1A/1H) and chip products (e.g.,
>> MLU100, released in May 2018).
>> 
>> We are now adapting MXNet to Cambricon products. As a next step, we
>> plan to open-source this work, and we hope to merge these new features into the master branch of MXNet
>> and to become part of MXNet's long-term support. We firmly believe that these MLU features will
>> promote the development of the MXNet community.
>> To this end, we are ready to accept rigorous inspection by the MXNet community. In
>> addition, we need advice from the community to achieve a high-quality implementation. On this
>> basis, we very much hope to establish full-scale, long-term cooperation with the community.
>> 
>> In order to achieve the above goals, we hope to keep in touch with the community
>> on the following issues. We look forward to your valuable feedback.
>> 
>> 1. MLU100 mainly focuses on inference, and we plan to first support the inference
>> part of MXNet. Support for training with MXNet on MLU will be released in the future. Is that acceptable
>> to the MXNet community?
>> 
>> 2. Though MLU can support various operators/networks, to guarantee high quality,
>> all supported operators submitted to the community should undergo rigorous stress testing. Thus,
>> at the beginning, we plan to release a small number of supported operators and networks, and
>> more will be added continuously. Is that acceptable, or do we have to support all networks
>> in the ModelZoo in the first release?
>> 
>> 3. Currently we plan to support both Python and C++ APIs. More details on supported
>> APIs will be provided in a follow-up proposal.
>> 
>> 4. We need to modify mShadow in order to support tensor memory operations.
>> 
>> 5. In order to enable the community to run and fully test our code, we want to provide
>> the community with a complete test environment. At present, we are considering the following
>> three options.
>> A) Provide several remote servers to the community and integrate them with the community's
>> Jenkins.
>> B) Provide a cloud platform to the community.
>> C) Donate MLU100 hardware to the community's testing platform. However, we don’t know the
>> specific procedure for donation, and we hope to get help. We would like to know how MXNet's test
>> servers are managed.
>> 
>> As for more technical details, a proposal will be submitted to the community before
>> the code is released.
>> 
>> In addition to the above points, any other questions and suggestions are also
>> welcome. Thanks!
>> 
>> More about Cambricon:
>> Cambricon is an artificial intelligence computing pioneer that engineered and successfully
>> commercialized the world’s first dedicated machine learning processor. The company's mission is to bring its unique AI
>> processors from edge to cloud, enriching and advancing human life.
>> Dr. Tianshi Chen is the founder and CEO of Cambricon, where he brings over 10
>> years of experience in the fields of microprocessor architecture and artificial intelligence.
>> In 2016, Cambricon released the Cambricon 1A processor, the first commercial machine
>> learning specific processor in the world. Later, during the 3rd World Internet Conference,
>> the Cambricon 1A processor was selected as one of the “World Leading Internet Scientific and Technological
>> Achievements”. In May 2018, Cambricon released MLU100, a machine learning chip which is
>> now in mass production. By offering revolutionary technology and products, Cambricon has established
>> and maintains active relationships with various companies in the AI industry.
>> 
>> 
>> Regards,
>> Haochong Zhang
>> Cambricon MXNet Development Team
>> 
>> 

