mxnet-dev mailing list archives

From Tianqi Chen <tqc...@cs.washington.edu>
Subject Re: Does internal quality matters to users?
Date Fri, 31 May 2019 16:31:13 GMT
Good infrastructure design goes a long way and has a profound impact on
the project itself. That is why we always want to rethink whether an
interface can be done better, and to think about the next possible
infrastructure to make things better. Refactoring is certainly part of it.

There are usually two types of refactoring we refer to:
1) A major design change, in terms of class relations and data structures
(e.g. numpy support, adding compilation to new hardware)
2) A specific choice of API or programming style (more types or a
type-erased program)

(1) affects the long-term support of the project, introduces new features
where necessary, and needs a lot of thought. I believe the general IR,
compilation, and numpy support belong to that category.

I would particularly like to talk about (2).
Because there is no single correct answer in software engineering,
different developers may prefer different views on a certain problem.
Some of these come down to the developers' taste. A change could favor a
certain aspect of the project, but not necessarily another.
Refactoring with respect to these sometimes does require a more thoughtful
conversation and a reasonable compromise.

For example, we had a recent discussion about whether to introduce more
typing into the code base, to the extent that the base data structure could
be templatized.
- The pros of this approach:
   - It introduces more typing and compile-time error messages (instead of
runtime checking), which could help developers find problems earlier.
- The cons of this approach:
   - Having a template in the base data structure causes ABI problems
(which code is generated by DLL A vs. DLL B) and will have potential
future issues.
   - Templates sometimes confuse some developers.
   - For serialization, it is hard to anticipate all kinds of classes, and
it is easier to have one class (any) that handles polymorphism.
   - Because most frontends (Python) are dynamic, it is easier to
interface them with a type-erased API.
As we can see, there are pros and cons to bringing more typing into the
code base, and there is no single answer.
One good example of a nice infrastructure design trade-off is DLPack
https://github.com/dmlc/dlpack/blob/master/include/dlpack/dlpack.h
This is a base data structure adopted by MXNet, PyTorch, Chainer, and many
other frameworks unanimously.
It is a type-erased data structure that erases the data type and the memory
allocator from the data structure, and it is designed to exchange tensors
(coming from different memory allocators) across DLL boundaries.
As you can see, this is a good example of a type-erased data structure.
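
As a rough illustration of what "type-erased" means here, the sketch below
mirrors a simplified DLTensor layout using Python's ctypes. It is an
approximation for illustration, not a binding to DLPack: the field names
follow my reading of dlpack.h (older revisions of the header name the
device field differently), the constants K_DL_CPU and K_DL_FLOAT and the
helper read_elements are hypothetical names, and only contiguous float32
data is handled. The point is that the element type is a runtime
description (code, bits, lanes) next to an untyped pointer, rather than a
template parameter.

```python
import ctypes


class DLDevice(ctypes.Structure):
    _fields_ = [("device_type", ctypes.c_int), ("device_id", ctypes.c_int)]


class DLDataType(ctypes.Structure):
    # The element type is described by data, not by a static type.
    _fields_ = [
        ("code", ctypes.c_uint8),
        ("bits", ctypes.c_uint8),
        ("lanes", ctypes.c_uint16),
    ]


class DLTensor(ctypes.Structure):
    _fields_ = [
        ("data", ctypes.c_void_p),  # untyped pointer, allocator unknown
        ("device", DLDevice),
        ("ndim", ctypes.c_int),
        ("dtype", DLDataType),
        ("shape", ctypes.POINTER(ctypes.c_int64)),
        ("strides", ctypes.POINTER(ctypes.c_int64)),  # NULL means compact row-major
        ("byte_offset", ctypes.c_uint64),
    ]


K_DL_CPU = 1    # assumed to match kDLCPU in the header
K_DL_FLOAT = 2  # assumed to match kDLFloat in the header


def read_elements(t):
    """Consumer side: interpret the untyped buffer using the runtime dtype."""
    if (t.dtype.code, t.dtype.bits, t.dtype.lanes) != (K_DL_FLOAT, 32, 1):
        raise TypeError("this sketch only handles float32")
    count = 1
    for i in range(t.ndim):
        count *= t.shape[i]
    ptr = ctypes.cast(t.data + t.byte_offset, ctypes.POINTER(ctypes.c_float))
    return [ptr[i] for i in range(count)]


# Producer side: any allocator could own this memory; here it is a ctypes array.
values = (ctypes.c_float * 6)(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
shape = (ctypes.c_int64 * 2)(2, 3)

tensor = DLTensor()
tensor.data = ctypes.addressof(values)
tensor.device = DLDevice(K_DL_CPU, 0)
tensor.ndim = 2
tensor.dtype = DLDataType(K_DL_FLOAT, 32, 1)
tensor.shape = ctypes.cast(shape, ctypes.POINTER(ctypes.c_int64))
tensor.byte_offset = 0

if __name__ == "__main__":
    print(read_elements(tensor))
```

Because the struct contains only plain C fields, its layout does not depend
on which templates a given library instantiated, which is what makes it
practical to pass across DLL boundaries.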

When we face this kind of question, it is important to have a good
conversation. Sometimes we have to make tradeoffs rather than bend
everyone else to our will. This is what open source is about.
I would also like to give some examples of conversations and how design
decisions are resolved. They come from the TVM community's recent
discussion about VM design.
I link the GitHub issue conversations directly here for the sake of
clarity (note that all the conversations are also mirrored to dev@tvm).
The background is that the community wanted to bring in a virtual machine
that can execute dynamic operations more effectively.

- The initial proposal, made by one of the committers, gave a detailed
design based on a stack VM: https://github.com/dmlc/tvm/issues/2810
   - As you can see, there was quite a lot of discussion about whether we
wanted to use a different design, in this case a register-based version.
   - The conversation evolved, and while the community members disagreed
on some points, they also agreed with each other on the particular
tradeoffs.
- After some discussion, the committers brought a tradeoff design that
tries to consolidate the needs of both sides, and this is the final
solution that was adopted: https://github.com/dmlc/tvm/issues/2915
I would like to particularly highlight two facts: 1) there are
disagreements in the development process; 2) developers work together to
understand each other's needs and then reach consensus on a perhaps better
design.
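
For readers unfamiliar with the two designs weighed in that thread, here is
a toy sketch of a stack-based and a register-based interpreter evaluating
(a + b) * c. This is not the TVM VM or its instruction set: the opcodes
and the functions run_stack and run_register are hypothetical and exist
only to show the structural difference, namely implicit operands on a
stack versus explicit register indices in each instruction.

```python
def run_stack(program, env):
    """Stack VM: operands are implicit, taken from the top of the stack."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(env[arg])
        elif op == "add":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs + rhs)
        elif op == "mul":
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(lhs * rhs)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


def run_register(program, env, num_regs):
    """Register VM: every instruction names its source and destination."""
    regs = [None] * num_regs
    for instr in program:
        op = instr[0]
        if op == "load":
            _, dst, name = instr
            regs[dst] = env[name]
        elif op == "add":
            _, dst, lhs, rhs = instr
            regs[dst] = regs[lhs] + regs[rhs]
        elif op == "mul":
            _, dst, lhs, rhs = instr
            regs[dst] = regs[lhs] * regs[rhs]
        elif op == "ret":
            return regs[instr[1]]
        else:
            raise ValueError(f"unknown opcode: {op}")
    raise ValueError("program ended without ret")


env = {"a": 2, "b": 3, "c": 4}

stack_program = [
    ("push", "a"), ("push", "b"), ("add", None),
    ("push", "c"), ("mul", None),
]

register_program = [
    ("load", 0, "a"), ("load", 1, "b"), ("add", 2, 0, 1),
    ("load", 3, "c"), ("mul", 4, 2, 3), ("ret", 4),
]

if __name__ == "__main__":
    print(run_stack(stack_program, env))
    print(run_register(register_program, env, num_regs=5))
```

The stack program has simpler instructions, while the register program
makes data flow explicit in each instruction.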

There are two other particular conversations between Pedro and myself,
which took place during his contributions.
- https://github.com/dmlc/tvm/pull/3037 In this case, I raised a concern
about API consistency, and Pedro brought up a reason why he thought his
approach was a better idea. I agreed, and we merged the PR.
- https://github.com/dmlc/tvm/pull/3108 In this other case, there were
technical reasons for going either way in the case of MXNet. We listed
the pros and cons of both sides and had a constructive conversation.
Eventually, I decided not to merge the PR after weighing all the cases.

I believe both are useful conversations, and while Pedro and I disagree
sometimes, we do agree in many other cases. The most crucial part is
having a constructive conversation.
To summarize, I do think refactoring and making things better are
certainly important to the project. And I do believe it is crucial to
think about all the technical consequences and make good tradeoff
decisions.
Sometimes the decision may not make every developer fully happy, but a
good technical compromise can move the project forward and help the
community in general.

Tianqi

On Fri, May 31, 2019 at 6:26 AM Isabel Drost-Fromm <isabel@apache.org>
wrote:

>
>
> On 31 May 2019 14:13:30 CEST, Pedro Larroy <
> pedro.larroy.lists@gmail.com> wrote:
> > I think Martin does a very good job explaining why refactoring,
> > reducing developer frustration and internal improvement is a crucial
> > productivity multiplier which includes lower cost to ship features,
> > less bugs and time spent debugging.
>
> There's one aspect that's special for open source projects: if a project
> wants to survive long term, it should make it easy for people to get
> started working on the project. In my experience, refactoring and cleanup
> play an important role in that. So thanks also for making recruiting of new
> contributors better.
>
> Isabel
> --
> This message was sent with K-9 from a mobile device with swipe to type
> enabled. I'm sorry for any embarrassing typos that slipped through.
>
