hadoop-general mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shevek <had...@anarres.org>
Subject Re: Implementing compareTo in user-written keys where one extends the other is error prone
Date Fri, 01 May 2009 14:09:29 GMT
So, I'll pull out the pertinent bits of your mail, please excuse the
extreme trimming, and explain why this happened at the bottom.

On Thu, 2009-04-30 at 17:31 -0400, Marshall Schor wrote:

A superclass which sorts on a String field:

> public class Super implements WritableComparable<Super> {

A subclass:

> public class Sub extends Super {
>  . . .
>   public int compareTo(Sub o) {

> What actually happened, was that the sort on the boolean value was
> skipped completely, and only the sort on the string was done.

[SNIP]
> ... they are passed via this code in
> WritableComparator:
> 
>  public int compare(WritableComparable a, WritableComparable b) {
>     return a.compareTo(b);
>   }
> 
> Java uses the interface spec for WritableComparable that was declared,
> in this case WritableComparable<Super>, and infers that the arg type for
> the compareTo is Super.  So it "skips" calling the compareTo in Sub, and
> just calls the one in Super.

Right.

> The workaround is to change the signature of Sub's compareTo method to
> match the spec in the interface, namely it has to take the Super as an
> argument, and then cast it to Sub.

Right. (As you'll see in a minute, this is what the JVM does anyway, but
starting from Object.)

> This seems like a very error prone design.  Am I doing something wrong,
> or can this be improved so that this kind of error is avoided?

Yes, you're doing something wrong. You are misunderstanding type erasure
and its implementation in Java.

At the simplest level, this class implements Comparable<Super> but never
implements Comparable<Sub>, and the method compare(Sub o) is NOT AN
OVERRIDE. I haven't tested, but if you make Sub implement
Comparable<Sub>, it might work as you intend.

If in doubt, and in any case where you believe a method to override
another method, USE THE @Override ANNOTATION. In fact, if someone were
to go through every Java 5 codebase in the world and add @Override
everywhere pertinent, it would be an improvement. Even in C++, I don't
think this method combination would be a dynamic override. You'd be
relying on enough static type information to be available to let the
compiler determine the Sub-call, which it probably wouldn't.

With the rant over, let's look at why this particular confusion happens.

Type erasure means that in the class file, Comparable drops its
parameters. Java does not know whether it's a Comparable<A> or a
Comparable<B>, and if it all goes wrong, you'll get a runtime
ClassCastException just as you would have in old-land.

The actual method is still compareTo(Object), and so when we write a
method compareTo(Super), how does it get called?

The answer is that the Java compiler compiles an extra hidden method
compareTo(Object) which forwards to compareTo(Super). I happen to have
the following example lying around, so I'll use it:

abstract class Functor<I,O> {
        public abstract O eval(I x);
}

Remember this is really Object eval(Object x) since Object is the type
bound for both I and O.

class MyFunctor extends Functor<String,String> {
        @Override
        public String eval(String in) { return in; }
}

Now, let's compile both classes and have a look:

Method name:"eval" public Signature: (java.lang.String)java.lang.String
Attribute "Code", length:26, max_stack:1, max_locals:2, code_length:2
  0: aload_1
  1: areturn

That's our expected eval method.

Method name:"eval" public volatile Signature:
(java.lang.Object)java.lang.Object
Attribute "Code", length:33, max_stack:2, max_locals:2, code_length:9
  0: aload_0
  1: aload_1
  2: checkcast <Class java.lang.String>
  5: invokevirtual <Method MyFunctor.eval
(java.lang.String)java.lang.String>
  8: areturn

This is an EXTRA method compiled by javac because the compile time type
information tells it that eval(String) is an implementation of
eval(Object) because of the relationship created by the type variables I
and O. Note the extra checkcast, and that this method has an attribute
of volatile (which is interestingly not documented as a valid method
attribute in the JVM spec).

In the case of Comparable, if you dump your class file, you will find
that compareTo(Object) forwards to compareTo(Super), hence the
a.compareTo(b) call calls compareTo(Super) and at no point does anything
know to call compareTo(Sub).

Either way, an @Override annotation will tell you at compile-time
whether you got it right, and I'm a firm believer in knowing whether
your code will work BEFORE deployment.

S.


Mime
View raw message