cassandra-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Edward Capriolo <edlinuxg...@gmail.com>
Subject Re: Code quality, principles and rules
Date Fri, 17 Mar 2017 13:46:44 GMT
On Fri, Mar 17, 2017 at 6:41 AM, Ryan Svihla <rs@foundev.pro> wrote:

> Different DI frameworks have different initialization costs, even inside of
> spring even depending on how you wire up dependencies (did it use autowire
> with reflection, parse a giant XML of explicit dependencies, etc).
>
> To back this assertion up for awhile in that community benching different
> DI frameworks perf was a thing and you can find benchmarks galore with a
> quick Google.
>
> The practical cost is also dependent on the lifecycles used (transient
> versus Singleton style for example) and features used (Interceptors
> depending on implementation can get expensive).
>
> So I think there should be some quantification of cost before a framework
> is considered, something like dagger2 which uses codegen I wager is only a
> cost at compile time (have not benched it, but looking at it's feature set,
> that's my guess) , Spring I know from experience even with the most optimal
> settings is slower on initialization time than doing by DI "by hand" at
> minimum, and that can sometimes be substantial.
>
>
> On Mar 17, 2017 12:29 AM, "Edward Capriolo" <edlinuxguru@gmail.com> wrote:
>
> On Thu, Mar 16, 2017 at 5:18 PM, Jason Brown <jasedbrown@gmail.com> wrote:
>
> > >> do we have plan to integrate with a dependency injection framework?
> >
> > No, we (the maintainers) have been pretty much against more frameworks
> due
> > to performance reasons, overhead, and dependency management problems.
> >
> > On Thu, Mar 16, 2017 at 2:04 PM, Qingcun Zhou <zhouqingcun@gmail.com>
> > wrote:
> >
> > > Since we're here, do we have plan to integrate with a dependency
> > injection
> > > framework like Dagger2? Otherwise it'll be difficult to write unit test
> > > cases.
> > >
> > > On Thu, Mar 16, 2017 at 1:16 PM, Edward Capriolo <
> edlinuxguru@gmail.com>
> > > wrote:
> > >
> > > > On Thu, Mar 16, 2017 at 3:10 PM, Jeff Jirsa <jjirsa@apache.org>
> wrote:
> > > >
> > > > >
> > > > >
> > > > > On 2017-03-16 10:32 (-0700), François Deliège <
> > francois@instagram.com>
> > > > > wrote:
> > > > > >
> > > > > > To get this started, here is an initial proposal:
> > > > > >
> > > > > > Principles:
> > > > > >
> > > > > > 1. Tests always pass.  This is the starting point. If we don't
> care
> > > > > about test failures, then we should stop writing tests. A recurring
> > > > failing
> > > > > test carries no signal and is better deleted.
> > > > > > 2. The code is tested.
> > > > > >
> > > > > > Assuming we can align on these principles, here is a proposal
for
> > > their
> > > > > implementation.
> > > > > >
> > > > > > Rules:
> > > > > >
> > > > > > 1. Each new release passes all tests (no flakinesss).
> > > > > > 2. If a patch has a failing test (test touching the same code
> > path),
> > > > the
> > > > > code or test should be fixed prior to being accepted.
> > > > > > 3. Bugs fixes should have one test that fails prior to the fix
> and
> > > > > passes after fix.
> > > > > > 4. New code should have at least 90% test coverage.
> > > > > >
> > > > > First I was
> > > > > I agree with all of these and hope they become codified and
> > followed. I
> > > > > don't know anyone who believes we should be committing code that
> > breaks
> > > > > tests - but we should be more strict with requiring green test
> runs,
> > > and
> > > > > perhaps more strict with reverting patches that break tests (or
> cause
> > > > them
> > > > > to be flakey).
> > > > >
> > > > > Ed also noted on the user list [0] that certain sections of the
> code
> > > > > itself are difficult to test because of singletons - I agree with
> the
> > > > > suggestion that it's time to revisit CASSANDRA-7837 and
> > CASSANDRA-10283
> > > > >
> > > > > Finally, we should also recall Jason's previous notes [1] that the
> > > actual
> > > > > test infrastructure available is limited - the system provided by
> > > > Datastax
> > > > > is not generally open to everyone (and not guaranteed to be
> > permanent),
> > > > and
> > > > > the infrastructure currently available to the ASF is somewhat
> limited
> > > > (much
> > > > > slower, at the very least). If we require tests passing (and I
> agree
> > > that
> > > > > we should), we need to define how we're going to be testing (or how
> > > we're
> > > > > going to be sharing test results), because the ASF hardware isn't
> > going
> > > > to
> > > > > be able to do dozens of dev branch dtest runs per day in its
> current
> > > > form.
> > > > >
> > > > > 0: https://lists.apache.org/thread.html/
> > f6f3fc6d0ad1bd54a6185ce7bd7a2f
> > > > > 6f09759a02352ffc05df92eef6@%3Cuser.cassandra.apache.org%3E
> > > > > 1: https://lists.apache.org/thread.html/
> > 5fb8f0446ab97644100e4ef987f36e
> > > > > 07f44e8dd6d38f5dc81ecb3cdd@%3Cdev.cassandra.apache.org%3E
> > > > >
> > > > >
> > > > >
> > > > Ed also noted on the user list [0] that certain sections of the code
> > > itself
> > > > are difficult to test because of singletons - I agree with the
> > suggestion
> > > > that it's time to revisit CASSANDRA-7837 and CASSANDRA-10283
> > > >
> > > > Thanks for the shout out!
> > > >
> > > > I was just looking at a patch about compaction. The patch was to
> > > calculate
> > > > free space correctly in case X. Compaction is not something that
> > requires
> > > > multiple nodes to test. The logic on the surface seems simple: find
> > > tables
> > > > of similar size and select them and merge them. The reality is it
> turns
> > > out
> > > > now to be that way. The coverage itself both branch and line may be
> > very
> > > > high, but what the code does not do is directly account for a wide
> > > variety
> > > > of scenarios. Without direct tests you end up with a mental
> > approximation
> > > > of what it does and that varies from person to person and accounts
> for
> > > the
> > > > cases that fit in your mind. For example, you personally are only
> > running
> > > > LevelDB inspired compaction.
> > > >
> > > > Being that this this is not a multi-node problem you should be able
> to
> > re
> > > > factor this heavily. Pulling everything out to a static method where
> > all
> > > > the parameters are arguments, or inject a lot of mocks in the current
> > > code,
> > > > and develop some scenario based coverage.
> > > >
> > > > That is how i typically "rescue" code I take over. I look at the
> > > nightmare
> > > > and say, "damn i am really afraid to touch this". I construct 8
> > scenarios
> > > > that test green. Then I force some testing into it through careful re
> > > > factoring. Now, I probably know -something- about it. Now, you are
> > fairly
> > > > free to do a wide ranging refactor, because you at least counted for
> 8
> > > > scenarios and you put unit test traps so that some rules are
> enforced.
> > > (Or
> > > > the person changing the code has to actively REMOVE your tests
> > asserting
> > > it
> > > > was not or no longer is valid). Later on you (or someone else)
> > __STILL__
> > > > might screw the entire thing up, but at least you can now build
> > forward.
> > > >
> > > > Anyway that patch on compaction was great and I am sure it improved
> > > things.
> > > > That being said it did not add any tests :). So it can easily be
> undone
> > > by
> > > > the next person who does not understand the specific issue trying to
> be
> > > > addressed. Inline comments almost scream to me 'we need a test' not
> > > > everyone believes that.
> > > >
> > >
> > >
> > >
> > > --
> > > Thank you & Best Regards,
> > > --Simon (Qingcun) Zhou
> > >
> >
>
> I don't believe dependency injection frameworks cause "overhead". For
> example, if you are using spring
>
> @Bean(initMethod=init, destroyMethod=destroy)
> public Server getMeAServer( Component component) {}
>
> @Bean(initMethod=init, destroyMethod=destroy)
> public Component getMeAComponent(){}
>
> What spring actually does is something like thisthis:
>
> startup code
> Component c = new Component();
> c.init()
> Server s = new Server(c);
> s.init()
> The default spring prototype is "singleton" aka create one of these per
> bean context.
>
> Spring also does the reverse on shutdown.
>
> The static singleton initialization thing creates a host of problems, with
> things not closing cleanly.
>
> https://issues.apache.org/jira/browse/CASSANDRA-12844
> https://issues.apache.org/jira/browse/CASSANDRA-12728
>
> One of them I even fixed:
> https://github.com/stef1927/cassandra/commits/11984-2.2
>
> So beyond it not creating any performance problems it probably would help
> in practice by forcing code into a pattern which is not singleton
> initiation driven by classloader which is something very opaque and hard to
> troubleshoot. Try putting a break point for example on something loaded by
> a static singleton and try to reason about the order of initialization.
>

Different DI frameworks have different initialization costs, even inside of
spring even depending on how you wire up dependencies (did it use autowire
with reflection, parse a giant XML of explicit dependencies, etc).

Spring I know from experience even with the most optimal
settings is slower on initialization time than doing by DI "by hand" at
minimum, and that can sometimes be substantial.

You are considering the cost of reflection (nano seconds) vs the cost of
starting a database that has to read files from disk. (seconds to minutes)

"parse a giant XML of explicit dependencies"

Modern (spring) projects rarely do what you are describing. The big xml
files are gone: https://projects.spring.io/spring-boot/. If you were not
going to use spring (cause you are afraid of big xml files) there is
https://github.com/google/guice.

It is basically a straw man argument to say "we don't like DI because of
big XML files" while papering over the fact that drain really does not shot
down cassandra properly and it has been project for a long while.

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message