hbase-dev mailing list archives

From Cosmin Lehene <cleh...@adobe.com>
Subject Re: Why doesn't KeyValue.equals/CellComparator compare the values?
Date Tue, 11 Mar 2014 00:06:58 GMT
So should there be a Jira for this?

This wouldn’t fully fix my concern though.
I wonder whether the “language” should make it more obvious when dealing
with coordinates (row, family, qualifier, ts) rather than values.

Cosmin

On 3/1/14, 3:30 PM, "Matt Corgan" <mcorgan@hotpads.com> wrote:

>Hmm, I don't think KeyValue.hashCode should be including the value.  I'm
>surprised it hasn't turned up a bug, but maybe that's because there's
>barely any code relying on it.  Looks like KeyValue.equals now farms out
>the work to CellComparator, and maybe KeyValue.hashCode should do the
>same.  Note that CellComparator.hashCode does not include the value.
>
>
>On Fri, Feb 28, 2014 at 10:20 AM, Cosmin Lehene <clehene@adobe.com> wrote:
>
>> Thanks Matt, Stack,
>>
>> My question/comment was biased by the perspective of a co-processor
>> implementation, but I guess it may well apply for HBase development.
>> From that perspective you're both in HBase-land and Java-land.
>>
>> A collection of cells needs to be compared to another collection of cells
>> (I'm doing a diff).
>> Java collections will end up comparing individual objects for equality, so
>> it boils down to a Cell object being equal to another Cell object. So from
>> a Java/OO perspective the question is: are two cells with different values
>> equal (i.e., can I swap them)?
>>
>> The HBase answer is indeed yes they are equal as long as row, family,
>> qualifier, timestamp and type are the same.
>>
>> The Java answer, however, may be different (and hence the expectations of a
>> developer), as in general it will be based on the known contract.
>>
>> And the general hashCode contract is:
>>
>> * If two objects are equal according to the equals(Object) method, then
>> calling the hashCode method on each of the two objects must produce the
>> same integer result.
>>
>>
>>
>> And the equals javadoc:
>>
>> * Note that it is generally necessary to override the {@code hashCode}
>> * method whenever this method is overridden, so as to maintain the
>> * general contract for the {@code hashCode} method, which states
>> * that equal objects must have equal hash codes.
>>
>>
>> But in our case, the object equality will pass but hash codes will be
>> different (https://gist.github.com/clehene/9276434)
>>
>> It's obvious why the behavior is as it is in HBase, so rather than
>> nitpicking, I wonder whether this could be made obvious, as it may help
>> avoid some unexpected behaviors :)
>>
>> Thanks,
>> Cosmin
>>
>> On 2/27/14, 10:22 AM, "Stack" <stack@duboce.net> wrote:
>>
>> >On Wed, Feb 26, 2014 at 8:31 PM, Matt Corgan <mcorgan@hotpads.com> wrote:
>> >....
>> >
>> >> But maybe one of the committers could add a sentence to emphasize that
>> >> value is excluded.
>> >>
>> >>
>> >We should underline that data is not considered when comparing Cells
>> >(KeyValues).  Apart from the fact that it could make for some interesting
>> >performance issues, the system isn't plumbed for dealing with coordinates
>> >that differ in their value only.  Rather, the mvcc/sequenceid is used for
>> >splitting Cells whose coordinates are otherwise the same.
>> >
>> >What was your expectation, mighty Cosmin?  What do you think HBase should
>> >do with Cells that differ in value only?
>> >
>> >Thanks,
>> >St.Ack
>>
>>
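
The mismatch discussed in this thread can be sketched in plain Java. The
classes below (DemoCell, ValueHashedCell) are hypothetical stand-ins, not
HBase's KeyValue or CellComparator. They assume equality on coordinates only
(row, family, qualifier, timestamp; type is omitted for brevity) and contrast
a hashCode that also mixes in the value with one that, along the lines Matt
suggests, uses the same fields as equals.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class CellEqualityDemo {

    /** Hypothetical cell: equals and hashCode both use coordinates only. */
    static class DemoCell {
        final String row, family, qualifier, value;
        final long timestamp;

        DemoCell(String row, String family, String qualifier, long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof DemoCell)) return false;
            DemoCell c = (DemoCell) o;
            // The value is deliberately ignored, mirroring the coordinate-only comparison.
            return row.equals(c.row) && family.equals(c.family)
                && qualifier.equals(c.qualifier) && timestamp == c.timestamp;
        }

        @Override
        public int hashCode() {
            // Same fields as equals, so equal cells always hash alike.
            return Objects.hash(row, family, qualifier, timestamp);
        }
    }

    /** Hypothetical variant whose hashCode also includes the value (contract violation). */
    static class ValueHashedCell extends DemoCell {
        ValueHashedCell(String row, String family, String qualifier, long timestamp, String value) {
            super(row, family, qualifier, timestamp, value);
        }

        @Override
        public int hashCode() {
            return 31 * super.hashCode() + value.hashCode();
        }
    }

    public static void main(String[] args) {
        DemoCell a = new ValueHashedCell("r1", "f", "q", 1L, "v1");
        DemoCell b = new ValueHashedCell("r1", "f", "q", 1L, "v2");
        Set<DemoCell> broken = new HashSet<>();
        broken.add(a);
        System.out.println("equals: " + a.equals(b));                         // true
        System.out.println("same hash: " + (a.hashCode() == b.hashCode()));   // false
        System.out.println("set finds b: " + broken.contains(b));             // false

        DemoCell c = new DemoCell("r1", "f", "q", 1L, "v1");
        DemoCell d = new DemoCell("r1", "f", "q", 1L, "v2");
        Set<DemoCell> consistent = new HashSet<>();
        consistent.add(c);
        System.out.println("set finds d: " + consistent.contains(d));         // true
    }
}
```

With the value mixed into the hash, a HashSet cannot find a cell that equals()
reports as equal, which is the kind of surprise a collection-based diff would
hit. Hashing only the fields that equals compares removes the surprise, at the
cost that such collections treat cells differing only in value as duplicates.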
