hadoop-user mailing list archives

From William Slacum <wsla...@gmail.com>
Subject Re: Custom comparator when using Kryo serializer for MapReduce serialization
Date Fri, 21 Aug 2015 15:35:25 GMT
Being in a distributed system shouldn't matter too much in this case.
You're worried about two things: mapping your data into byte[], and then
comparing against other data that has been mapped to byte[]. Both steps
depend only on the data itself, so they give the same result on any node.
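
A minimal sketch of the idea in Java, assuming the key is a single signed long. The class and method names here are hypothetical and are not part of Hadoop, Kryo, or chill. The encoding flips the sign bit and writes big-endian bytes so that unsigned byte-wise order matches numeric order. The comparison takes the same six arguments as Hadoop's RawComparator.compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2), so a real comparator could delegate to it without deserializing anything.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, for illustration only.
final class OrderPreservingKey {

    // Encode a signed long as 8 big-endian bytes with the sign bit flipped,
    // so that negative values sort before positive ones byte-wise.
    static byte[] encode(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .putLong(value ^ Long.MIN_VALUE)
                .array();
    }

    // Unsigned lexicographic comparison of two byte ranges.
    // Argument order mirrors RawComparator.compare in Hadoop.
    static int compareBytes(byte[] b1, int s1, int l1,
                            byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int x = b1[s1 + i] & 0xff; // treat each byte as unsigned
            int y = b2[s2 + i] & 0xff;
            if (x != y) {
                return x - y;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return l1 - l2;
    }
}
```

Whether this works with a given serializer depends on the bytes it actually emits: a variable-length or little-endian integer encoding would not sort this way, so the key fields would need a custom, fixed-layout encoding. Hadoop also ships a static WritableComparator.compareBytes with the same argument list, which operates on plain byte arrays and does not require the keys to be Writable.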

On Fri, Aug 21, 2015 at 1:46 AM, Yaron Gonen <yaron.gonen@gmail.com> wrote:

> Thanks for the reply.
> How can I guarantee that in a distributed system?
> On Wed, Aug 19, 2015 at 8:06 PM, William Slacum <wslacum@gmail.com> wrote:
>> In a general sense, if you can guarantee that the serialized form of your
>> objects sorts lexicographically in the same order as the objects
>> themselves, then you should be able to write a comparator on the raw
>> bytes without any interpretation.
>> On Wed, Aug 19, 2015 at 5:21 AM, Yaron Gonen <yaron.gonen@gmail.com>
>> wrote:
>>> Hi all,
>>> (I'm using Hadoop 1.2.1)
>>> I'm using Kryo <https://github.com/EsotericSoftware/kryo> (with chill
>>> <https://github.com/twitter/chill>) as my serializer (instead of the
>>> Writable interface).
>>> However, I'm having trouble with the comparator. On the one hand, none
>>> of my objects are Writable, so I cannot use WritableComparator. On the
>>> other hand, I can use a RawComparator, but that means deserializing the
>>> byte array on every comparison, which seems inefficient.
>>> Is there a way to supply just an implementation of Java's Comparator, or
>>> to make the serialized objects Comparable?
>>> Regards,
>>> Yaron
