mahout-dev mailing list archives

From "Sean Owen" <sro...@gmail.com>
Subject Java style notes
Date Fri, 22 Aug 2008 13:31:26 GMT
As you may know, I love my Java and am terribly picky about the
details. You may have seen me "helpfully" tweaking the code over time
here and I hope that's OK with everyone. I assure you it will lead to
faster, sleeker, correct-er code.

One thing I'm seeing a lot of in the code is boxing / unboxing
overhead. Conveniently, primitives like double are virtually
interchangeable with their object counterparts like Double -- Java
does the conversion for you. However, this leads to some invisible
but potentially large overhead. For example, I saw a loop like this:

Double total = new Double(0.0);
for (...) {
  Double value = new Double(someDoubleObject.doubleValue());
  total += value * value;
}

Aside from the unnecessary object allocations you see, there is at
least one you don't -- updating total makes a new object every time.
There are hidden calls to doubleValue() as well: two on value and one
on total. Instead:

double total = 0.0;
for (...) {
  double value = someDoubleObject.doubleValue();
  total += value * value;
}
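
For anyone who wants to try the two loops side by side, here is a
minimal runnable sketch. The class name BoxingSketch, the method names,
and the List<Double> input are assumptions for illustration, not code
from Mahout. The boxed version spells out roughly what the compiler
generates for "total += value * value" when both operands are Double.

```java
import java.util.Arrays;
import java.util.List;

public class BoxingSketch {

    // Boxed accumulator, with the hidden conversions written out by hand.
    static Double sumOfSquaresBoxed(List<Double> values) {
        Double total = Double.valueOf(0.0);
        for (Double value : values) {
            // Equivalent of "total += value * value": three unboxing
            // calls, then one boxing call that may allocate a new Double.
            total = Double.valueOf(
                total.doubleValue() + value.doubleValue() * value.doubleValue());
        }
        return total;
    }

    // Primitive accumulator: one explicit unboxing per element, no boxing.
    static double sumOfSquaresPrimitive(List<Double> values) {
        double total = 0.0;
        for (Double boxed : values) {
            double value = boxed.doubleValue();
            total += value * value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(1.0, 2.0, 3.0);
        System.out.println(sumOfSquaresBoxed(values));
        System.out.println(sumOfSquaresPrimitive(values));
    }
}
```

Both methods return the same sum; the difference is only in how many
temporary Double objects and conversion calls the loop body costs.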

It's not just the memory -- though on a 64-bit platform, a Double
costs roughly 20 bytes by my count (8-byte object reference, 4-byte
object 'header', 8-byte double value) versus 8 for a primitive double;
the exact figure depends on the JVM. The allocation takes nontrivial
time, and so does garbage-collecting the object later.

I think the rule is basically -- always use a primitive unless you
can't. You have to use objects when putting values into a Collection
or Map. Sometimes you also want to distinguish between a value and no
value, in which case an object reference allows for "null". Otherwise,
it's all primitives.
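
A small sketch of that rule in practice, under assumed names (the class
PrimitiveRuleSketch, its methods, and the "scores" map are hypothetical,
not anything in the codebase): the Map has to hold Double objects, a
Double return type lets null stand for "no value", and everything else
stays primitive.

```java
import java.util.HashMap;
import java.util.Map;

public class PrimitiveRuleSketch {

    // An object return type is justified here: null means "no value".
    static Double findScore(Map<String, Double> scores, String key) {
        return scores.get(key); // null when the key is absent
    }

    // The Map forces boxed values, but the arithmetic stays primitive.
    static double total(Map<String, Double> scores) {
        double sum = 0.0;
        for (Double boxed : scores.values()) {
            double value = boxed.doubleValue(); // unbox once, explicitly
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("a", 1.5);
        scores.put("b", 2.5);
        System.out.println(total(scores));
        System.out.println(findScore(scores, "missing"));
    }
}
```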
