hadoop-pig-dev mailing list archives

From Gianmarco <gianmarco....@gmail.com>
Subject Re: how to compare?
Date Wed, 28 Apr 2010 10:57:05 GMT
Basically, DataType.compare() just calls the compareTo() method of the two
objects after checking that the two types are the same.
However, DataType.compare() does two things beyond a simple compareTo().

First, it is specialized for Maps, for which sizes are taken into account
and keys are sorted.

Second, it imposes an (arbitrary) order on different data types. In this way
the types are not dependent on each other and there is a single point of

So I think you should use DataType.compare() unless you are sure you do not
need these features.
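
To make the two features concrete, here is a minimal, self-contained sketch
in plain Java. It is not Pig's actual code: the class TypeOrderedCompare, the
rank() values and the choice to compare map values after the sorted keys are
all assumptions made for illustration. It only shows the general idea of
ordering by type first and handling Maps by size and sorted keys.

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch; NOT Pig's DataType implementation.
final class TypeOrderedCompare {

    // Arbitrary but fixed rank per type (assumed values, illustration only).
    static int rank(Object o) {
        if (o == null) return 0;
        if (o instanceof Integer) return 1;
        if (o instanceof Long) return 2;
        if (o instanceof Double) return 3;
        if (o instanceof String) return 4;
        if (o instanceof Map) return 5;
        return 6; // anything else; must be mutually Comparable to be compared
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    static int compare(Object o1, Object o2) {
        int r1 = rank(o1);
        int r2 = rank(o2);
        // Different types: the type rank alone decides the order.
        if (r1 != r2) return Integer.compare(r1, r2);
        if (o1 == null) return 0; // both null
        if (o1 instanceof Map) return compareMaps((Map) o1, (Map) o2);
        // Same type: fall back to the object's own compareTo().
        return ((Comparable) o1).compareTo(o2);
    }

    static int compareMaps(Map<?, ?> m1, Map<?, ?> m2) {
        // Sizes are taken into account first.
        if (m1.size() != m2.size()) return Integer.compare(m1.size(), m2.size());
        // Then keys are sorted so that iteration order is deterministic.
        Comparator<Object> cmp = TypeOrderedCompare::compare;
        TreeMap<Object, Object> s1 = new TreeMap<>(cmp);
        TreeMap<Object, Object> s2 = new TreeMap<>(cmp);
        s1.putAll(m1);
        s2.putAll(m2);
        Iterator<Map.Entry<Object, Object>> i1 = s1.entrySet().iterator();
        Iterator<Map.Entry<Object, Object>> i2 = s2.entrySet().iterator();
        while (i1.hasNext() && i2.hasNext()) {
            Map.Entry<Object, Object> e1 = i1.next();
            Map.Entry<Object, Object> e2 = i2.next();
            int c = compare(e1.getKey(), e2.getKey());
            if (c != 0) return c;
            // Comparing values as a tie-breaker is an assumption of this sketch.
            c = compare(e1.getValue(), e2.getValue());
            if (c != 0) return c;
        }
        return 0;
    }
}
```

With this sketch, TypeOrderedCompare.compare(5, "a") returns a negative
number purely because of the assumed ranks, whereas a bare compareTo() on the
same pair would throw a ClassCastException.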

Anyway, there is something that I do not understand: why does the function
need to switch on the datatype byte and cast the objects before calling
compareTo() on them? Just casting them to Comparable and letting Java
dispatch to the proper polymorphic method should work as well, right?
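
As a rough illustration of the cast-only approach, the sketch below uses
plain Java and a hypothetical helper class ComparableCast (not part of Pig).
It suggests that the raw cast does dispatch to the right compareTo() when
both objects have the same type, and that it fails with a ClassCastException
when they do not, which is the case a type check guards against.

```java
// Hypothetical helper for illustration; not part of Pig.
final class ComparableCast {

    // Cast-only comparison: Java picks the runtime type's compareTo().
    @SuppressWarnings({"unchecked", "rawtypes"})
    static int viaCast(Object o1, Object o2) {
        return ((Comparable) o1).compareTo(o2);
    }

    // Reports whether the cast-only comparison fails for this pair.
    static boolean throwsOnCompare(Object o1, Object o2) {
        try {
            viaCast(o1, o2);
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

For example, viaCast(1, 2) and viaCast("a", "b") both work, while
throwsOnCompare(1, "a") returns true because Integer.compareTo() rejects a
String argument.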

On Wed, Apr 28, 2010 at 07:12, hc busy <hc.busy@gmail.com> wrote:

> guys, I'm implementing that ExtremalTupleByNthField and I have a question
> about comparison...
> So, when I have parsed out the two objects that I want to compare how do I
> perform that comparison? My current implementation assumes the data is
> Comparable (which they invariably are within pig) so I do
> int c = ((Comparable)o1).compareTo((Comparable)o2);
> now I also see that there's another compare that compares the two objects
> by:
> int c = DataType.compare(o1, o2, DataType.findType(o1),
> DataType.findType(o2));
> The initial method works for all types I've tried (int, string, etc.). But
> the latter is used by another UDF already in SVN.
> What are your suggestions?
> (PIG-1386 is ticket where I've checked in the patch).
