mahout-dev mailing list archives

From "Ted Dunning"
Subject Re: In case you haven't noticed
Date Sat, 18 Oct 2008 01:52:22 GMT
On Fri, Oct 17, 2008 at 2:37 PM, Sean Owen wrote:

> ... I am picky because I take a view on code quality like the New York
> Police Department's 'broken windows' theory - you clean up petty crime
> in an area and it actually discourages larger crime.

This sounds great.

> Hence my ongoing campaign to tweak and tidy according to a logic and
> set of standards that I hope we share. (Still a lot of little
> boxing/unboxing issues for example!

Many boxing/unboxing issues are no longer issues due to JIT inlining and
related optimizations.  This is particularly true in JDK 1.6.

> I think objects are way overused
> for code that is supposed to be performance-conscious. Primitives
> should always be your default.)

Some of this is good, but a sense of perspective is useful.  For instance,
for probability distributions, it is nice to have an interface that
describes a generic sampler that returns some generic type of object.  This
might be a double for a Gaussian, an integer for a Poisson distribution, or
a matrix for a Wishart.  This sounds bad because it uses objects instead of
primitives, so supposedly the generic sample method will be slower.

In fact, this won't normally be any slower at all.  The reason is that
really simple samplers will get inlined and the boxing will get optimized
away.
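
As a rough sketch of the kind of interface meant here: the names Sampler,
GaussianSampler and PoissonSampler are hypothetical, chosen only for
illustration, and are not existing Mahout classes.  Whether the JIT really
removes the boxing depends on the JVM version and flags, so treat this as
something to measure rather than a guarantee.

```java
import java.util.Random;

// Hypothetical generic sampler: each call returns one draw of type T.
interface Sampler<T> {
    T sample();
}

// Simple sampler with a tiny body, the kind of method a JIT can inline.
final class GaussianSampler implements Sampler<Double> {
    private final Random rng;
    private final double mean;
    private final double sd;

    GaussianSampler(Random rng, double mean, double sd) {
        this.rng = rng;
        this.mean = mean;
        this.sd = sd;
    }

    @Override
    public Double sample() {
        // The double result is autoboxed to Double on return.
        return mean + sd * rng.nextGaussian();
    }
}

// Poisson sampler using Knuth's multiplication method (suited to small lambda).
final class PoissonSampler implements Sampler<Integer> {
    private final Random rng;
    private final double limit; // exp(-lambda)

    PoissonSampler(Random rng, double lambda) {
        this.rng = rng;
        this.limit = Math.exp(-lambda);
    }

    @Override
    public Integer sample() {
        int k = 0;
        double p = rng.nextDouble();
        // Multiply uniforms until the product drops to exp(-lambda) or below.
        while (p > limit) {
            k++;
            p *= rng.nextDouble();
        }
        return k; // autoboxed to Integer
    }
}
```

A caller that only wants the primitive can write
"double x = new GaussianSampler(new Random(), 0.0, 1.0).sample();" and let
auto-unboxing do the rest.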

For more complex samplers like a Monte Carlo sampler, the sample method
won't get inlined, but that will only happen in cases where the sample
method is expensive enough that you won't notice the boxing.  There is
also a high correlation between cases which don't inline and cases which
don't return primitives, which means that the boxing cost was inevitable
anyway.
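
To illustrate the non-primitive case, here is a minimal sketch of a sampler
whose result is an object no matter how it is declared.  The class name
IdentityWishartSampler is hypothetical, and the scale matrix is assumed to
be the identity to keep the example short: a draw is then the sum of
outer products of standard normal vectors.  The Sampler interface is
declared again so the example stands alone.

```java
import java.util.Random;

// Hypothetical generic sampler interface, declared here so this compiles alone.
interface Sampler<T> {
    T sample();
}

// Wishart draw with identity scale: W = sum over n of x_n * x_n^T,
// where each x_n is a vector of independent standard normals.
final class IdentityWishartSampler implements Sampler<double[][]> {
    private final Random rng;
    private final int dim;
    private final int degreesOfFreedom;

    IdentityWishartSampler(Random rng, int dim, int degreesOfFreedom) {
        this.rng = rng;
        this.dim = dim;
        this.degreesOfFreedom = degreesOfFreedom;
    }

    @Override
    public double[][] sample() {
        double[][] w = new double[dim][dim]; // the result is an object anyway
        double[] x = new double[dim];
        for (int n = 0; n < degreesOfFreedom; n++) {
            for (int i = 0; i < dim; i++) {
                x[i] = rng.nextGaussian();
            }
            // Accumulate the outer product x * x^T.
            for (int i = 0; i < dim; i++) {
                for (int j = 0; j < dim; j++) {
                    w[i][j] += x[i] * x[j];
                }
            }
        }
        return w;
    }
}
```

Each call does work proportional to degreesOfFreedom * dim * dim and
allocates a matrix, so there is no primitive return type to preserve and
nothing for boxing to add.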

So I would say to go gently into the process of deleting all object use.
