hadoop-hdfs-user mailing list archives

From Adeel Qureshi <adeelmahm...@gmail.com>
Subject Re: WritableComparable.compareTo vs RawComparator.compareTo
Date Sat, 31 Aug 2013 16:20:49 GMT
Thanks for the information. So the raw comparator is faster because it
can compare the serialized bytes directly. If my raw comparator instead
overrides the compare signature that receives two WritableComparable
objects

public int compare(WritableComparable a, WritableComparable b)

rather than the byte-based one, does it end up slower, and roughly
equivalent to the compareTo method defined on the WritableComparable itself?

Secondly, suppose I do use the byte-based signature. I have seen
implementations that use utility methods like readInt and readString to read
ints and strings from those bytes. But what if my WritableComparable contains
a more complex object, such as a Text or a List? How can I read those from
the bytes?
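
One way a Text-like field could be handled at the byte level, sketched with only the Java standard library so it runs without Hadoop on the classpath. The key layout (a 4-byte int, then a 4-byte length, then UTF-8 bytes) and the names RawKeySketch, serialize, readInt and compareBytes are assumptions made for illustration. In Hadoop itself, Text is serialized with a variable-length int prefix, and the corresponding helpers would be WritableComparator.readInt, WritableComparator.readVInt, WritableUtils.decodeVIntSize and WritableComparator.compareBytes. A List field would presumably need a similar count or length prefix written by its write() method so that the comparator can walk the elements in place.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class RawKeySketch {

    // Hypothetical key layout: [4-byte int id][4-byte length][UTF-8 bytes].
    static byte[] serialize(int id, String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(8 + utf8.length)
                .putInt(id)
                .putInt(utf8.length)
                .put(utf8)
                .array();
    }

    // Reads a big-endian int straight from the buffer, without creating a key object.
    static int readInt(byte[] b, int off) {
        return ((b[off] & 0xff) << 24)
             | ((b[off + 1] & 0xff) << 16)
             | ((b[off + 2] & 0xff) << 8)
             |  (b[off + 3] & 0xff);
    }

    // Lexicographic comparison of two byte ranges, treating bytes as unsigned.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff;
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        return l1 - l2;
    }

    // Same parameter shape as a raw comparator's compare method:
    // compare the int field first, then the string bytes.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int cmp = Integer.compare(readInt(b1, s1), readInt(b2, s2));
        if (cmp != 0) {
            return cmp;
        }
        int len1 = readInt(b1, s1 + 4); // length prefix of the string field
        int len2 = readInt(b2, s2 + 4);
        return compareBytes(b1, s1 + 8, len1, b2, s2 + 8, len2);
    }

    public static void main(String[] args) {
        byte[] a = serialize(7, "apple");
        byte[] b = serialize(7, "banana");
        System.out.println(compare(a, 0, a.length, b, 0, b.length) < 0);
    }
}
```

The general idea is that each variable-length field carries its own length, so the comparator can find where the field starts and ends and compare it without rebuilding the object.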


On Aug 31, 2013 3:58 AM, "Ravi Kiran" <ravikiranmagham@gmail.com> wrote:

> Also, if both are defined, the framework will use the RawComparator. I hope
> you have registered the comparator in a static block as follows:
> static {
>     WritableComparator.define(PairOfInts.class, new Comparator());
> }
> Regards
> Ravi Magham
> On Sat, Aug 31, 2013 at 1:23 PM, Ravi Kiran <ravikiranmagham@gmail.com> wrote:
>> Hi Adeel,
>>     The RawComparator is the faster of the two, as you avoid the
>> need to convert the byte stream into Writable objects for comparison.
>> Regards
>> Ravi Magham
>> On Fri, Aug 30, 2013 at 11:16 PM, Adeel Qureshi <adeelmahmood@gmail.com> wrote:
>>> For secondary sort I am implementing a RawComparator and providing it
>>> as the sortComparator. Is that the faster way, or is it faster to use a
>>> WritableComparable as the mapper output key and define a compareTo
>>> method on the key itself? Also, what happens if both are defined? Is one
>>> ignored?
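
The difference discussed in this thread, rebuilding key objects versus comparing serialized bytes in place, can be illustrated with a small standalone sketch. It uses only the Java standard library rather than Hadoop, and the Pair class (a stand-in for a two-int key such as PairOfInts) and the method names compareViaObjects and compareRaw are hypothetical.

```java
import java.nio.ByteBuffer;

final class PairCompareSketch {

    // Hypothetical stand-in for a two-int key.
    static final class Pair implements Comparable<Pair> {
        final int left;
        final int right;

        Pair(int left, int right) {
            this.left = left;
            this.right = right;
        }

        @Override
        public int compareTo(Pair o) {
            int c = Integer.compare(left, o.left);
            return c != 0 ? c : Integer.compare(right, o.right);
        }
    }

    // Serialized form: two big-endian 4-byte ints.
    static byte[] serialize(Pair p) {
        return ByteBuffer.allocate(8).putInt(p.left).putInt(p.right).array();
    }

    static Pair deserialize(byte[] b, int off) {
        ByteBuffer buf = ByteBuffer.wrap(b);
        return new Pair(buf.getInt(off), buf.getInt(off + 4));
    }

    // Object path: rebuild both keys, then delegate to compareTo.
    static int compareViaObjects(byte[] b1, int s1, byte[] b2, int s2) {
        return deserialize(b1, s1).compareTo(deserialize(b2, s2));
    }

    // Raw path: read the fields in place and stop at the first difference,
    // without constructing any Pair instances.
    static int compareRaw(byte[] b1, int s1, byte[] b2, int s2) {
        ByteBuffer x = ByteBuffer.wrap(b1);
        ByteBuffer y = ByteBuffer.wrap(b2);
        int c = Integer.compare(x.getInt(s1), y.getInt(s2));
        return c != 0 ? c : Integer.compare(x.getInt(s1 + 4), y.getInt(s2 + 4));
    }

    public static void main(String[] args) {
        byte[] a = serialize(new Pair(3, 9));
        byte[] b = serialize(new Pair(3, 10));
        System.out.println(compareViaObjects(a, 0, b, 0) < 0);
        System.out.println(compareRaw(a, 0, b, 0) < 0);
    }
}
```

Both paths give the same ordering; the raw path simply skips creating a key object for every comparison, which matters when the sort performs a very large number of comparisons.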
