mxnet-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Pedro Larroy <>
Subject Re: Does internal quality matters to users?
Date Tue, 11 Jun 2019 17:18:22 GMT
Thanks for the good discussion.

I actually wasn't referring particularly to our conversations in
github with respect of the refactors, but it's nice from you to bring
them up. And it's ok to disagree in small things, hopefully we can
align in the big ones.

I understand that for TVM you might have different constraints with
how dynamic you want to be for mutating the graph and doing quick
experimentation and research but please try to understand my
perspectives coming from a software engineering background and helping
maintain MXNet for thousands of users and teams using it in
production. Let's also consider how many issues we have open and our
bandwidth to deal with additional complexity.

To your pros and cons I would like to add and emphasize that currently
the heavy use of dynamic attributes in the graph using dmlc::any has
two very negative consequences, at least for MXNet:

1 - Makes the data structures using dmlc::any almost impossible to
debug, as they are just binary.
2 - Makes the code more difficult to understand because there's no
declaration in a data structure of the data fields it uses and its
responsibilities. We are basically shoving all kinds of stuff using
3 - You get no help from the IDE to navigate and refactor as another

I would really like you to give me solutions to these problems or at
least acknowledge them and tell me why do we have to pay those
tradeoffs instead of just dismissing them as engineering taste.

The more I work with MXNet the more I wish debugging was easier, and
reading and refactoring the code, and those fields would be declared
and typed in their corresponding data structures, for MXNet I don't
think this would affect anything in regards the python bindings since
they go through the typed C API anyway.

Maybe we can get some inspiration from LLVM as they have bindings for
many languages to work with the AST and have very clean APIs for the
compilation steps. It's also OK to have an initial dynamic codebase
for research and experimentation and then "cure" them into a solid
maintainable one with more types and more robust...


On Fri, May 31, 2019 at 9:31 AM Tianqi Chen <> wrote:
> A good infrastructure design has a long way to go and has a profound impact on the project
itself. That is why we always want to rethink if the interface can be better done, and think
about the next possible infrastructure to make things better, Refactoring is certainly part
of it.
> There are usually two types of refactoring we refers to :
> 1) The major design change, in terms of class relations, data structures (e.g. numpy
support, adding compilation to new hardware)
> 2) The specific choice of API, programming style(more types or type-erased program)
> (1) affects the long term support of the project, introduces new features if necessary
and need a lot of thoughts into that. I believe the general IR, compilation and numpy support
belongs to that category.
> I would particularly like to talk about (2).
> Because there is no unified correct answer in software engineering, different developers
may prefer different views on a certain problem.
> Some of them have things to do with the taste developers. The change could favor certain
aspect of the project, but not necessarily another part.
> Refactoring wrt these sometimes does require a more thoughtful conversation and make
a reasonable compromise.
> For example, we have a recent discussion about whether to introduce more typing into
the code base, to the extent that the base data structure could be templatized.
> - The Pros of this approach
>     - It introduces more typing and compile-time error message(instead of runtime checking),
could help developers to find problem earlier.
> - The Cons of the approach:
>    - Having a template in the base data structure causes ABI problem(which code generated
by DLL A vs DLL B) and will have potential future issues.
>    - Template sometimes confuses some developers.
>    - For serialization, it is hard to anticipate all kinds of classes and it is easier
to have one class(any) that handles polymorphism.
>    - Because of most frontends(python) are dynamic, it is easier to interface them with
a type-erased API.
> As we can see there are pros and cons of bringing in more typing to the change, and there
is no unified answer.
> One good example of a nice infrastructure design trade-off is DLPack
> This is a base data structure adopted by MXNet, Pytorch, Chainer, and many other frameworks
> It is a type-erased data structure that erases the data type, and memory allocator from
the data structure and is designed to exchange tensor(coming from different memory allocators)
across DLL boundaries.
> As you can see this is a good example of type-erased data structures.
> When we are having this kind of questions. It is important to have a good conversation.
Sometimes we have to make tradeoffs rather than bend everyone-else to our will. This is what
open source is about.
> I would also like to give some examples of conversations and how design decisions are
resolved. It comes from the TVM community's recent discussion about VM design.
> I directly paste the github issue conversation here for the sake of clarity(note that
all the conversations are also mirrored to dev@tvm).
> The background is that the community want to bring a virtual machine that can execute
dynamic operations more effectively.
> - The initial proposal, made by one of the committers gave a detailed design based on
Stack VM
>    - As you can see that there are quite some discussions about whether we want to use
a different set of design, in this case, a register-based version.
>    - The conversation evolves, and while the community members disagree on some cases,
also agrees with each other on the particular tradeoffs.
> - After some discussions, the committers bring a tradeoff design that tries to consolidate
the needs of both sides and this is the final solution being adopted
> I would like to particularly highlight the fact that: 1) there are disagreements in the
development process. 2) developers work together to understand each others' needs and then
make consensus on a perhaps better design.
> There are two other particular conversations between Pedro and myself, which are during
his contributions.
> - In this case, I raised the concern about API
consistency, and Pedro brings up a reason why he thinks it is a better idea, I agreed and
we merged the PR
> - In this other case, there are technical reasons
for going both sides for the case of MXNet, we have listed pros/cons about both sides and
have a constructive conversation. Eventually, I decided to not merge the PR after weighing
in all the cases.
> I believe both are useful conversations, and while Pedro and I disagree sometimes, we
do agree on many other cases. The most crucial part is about having a constructive conversation.
> To summarize, I do refactoring and making things better is certainly important to make
the project better. And I do believe it is crucial to think about all the technical consequences
and make good tradeoff decisions.
> Sometimes the decision may not make every developer mostly happy, but a good technical
compromise could move the project forward and help the community in general.
> Tianqi
> On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <> wrote:
>> Am 31. Mai 2019 14:13:30 MESZ schrieb Pedro Larroy <>:
>> > I think Martin does a very good job explaining why
>> >refactoring,
>> >reducing developer frustration and internal improvement is a crucial
>> >productivity multiplier which includes lower cost to ship features,
>> >less
>> >bugs and time spent debugging.
>> There's one aspect that's special for open source projects: if a project wants to
survive long term, it should make it easy for people to get started working on the project.
In my experience, refactoring and cleanup play an important role in that. So thanks also for
making recruiting of new contributers better.
>> Isabel
>> --
>> This message was sent with K-9 from a mobile device with swipe to type enabled. I'm
sorry for any embarrassing typos that slipped through.

View raw message