hbase-dev mailing list archives

From Cosmin Lehene <cleh...@adobe.com>
Subject Re: Why doesn't KeyValue.equals/CellComparator compare the values?
Date Fri, 28 Feb 2014 18:20:43 GMT
Thanks Matt, Stack,

My question/comment was biased by the perspective of a coprocessor
implementation, but I guess it may well apply to HBase development.
From that perspective you're in both HBase-land and Java-land.

A collection of cells needs to be compared to another collection of cells
(I'm doing a diff).
Java collections will end up comparing individual objects for equality, so
it boils down to one Cell object being equal to another Cell object. So from
a Java/OO perspective the question is: are two cells with different values
equal (i.e., can I swap them)?

The HBase answer is indeed yes: they are equal as long as row, family,
qualifier, timestamp and type are the same.
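
To make the coordinate-only rule concrete, here is a minimal sketch. SketchCell is a hypothetical stand-in, not the HBase KeyValue/Cell API; the field names, the String-typed coordinates and the sameCoordinatesAndValue helper are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical stand-in for a cell; this is NOT the HBase KeyValue/Cell API.
final class SketchCell {
    final String row;
    final String family;
    final String qualifier;
    final long timestamp;
    final byte type;
    final byte[] value;

    SketchCell(String row, String family, String qualifier,
               long timestamp, byte type, byte[] value) {
        this.row = row;
        this.family = family;
        this.qualifier = qualifier;
        this.timestamp = timestamp;
        this.type = type;
        this.value = value;
    }

    // Coordinate-only equality: the value is deliberately left out.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchCell)) return false;
        SketchCell c = (SketchCell) o;
        return row.equals(c.row)
            && family.equals(c.family)
            && qualifier.equals(c.qualifier)
            && timestamp == c.timestamp
            && type == c.type;
    }

    // Hash over the same fields as equals, so the two stay consistent.
    @Override
    public int hashCode() {
        return Objects.hash(row, family, qualifier, timestamp, type);
    }

    // Stricter check for callers (such as a diff) that also care about the value.
    static boolean sameCoordinatesAndValue(SketchCell a, SketchCell b) {
        return a.equals(b) && Arrays.equals(a.value, b.value);
    }
}

class CoordinateEqualityDemo {
    public static void main(String[] args) {
        // Same coordinates, different values; the type code 4 is arbitrary here.
        SketchCell a = new SketchCell("row1", "cf", "q", 42L, (byte) 4, new byte[] {1});
        SketchCell b = new SketchCell("row1", "cf", "q", 42L, (byte) 4, new byte[] {2});
        System.out.println(a.equals(b));                              // true
        System.out.println(SketchCell.sameCoordinatesAndValue(a, b)); // false
    }
}
```

A diff that needs to notice changed values would have to use something like the stricter helper rather than equals alone.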

The Java answer, however, may be different (and hence so may a developer's
expectations), as in general it will be based on the known contract.

And the general hashCode contract is:

* If two objects are equal according to the equals(Object) method, then
calling the hashCode method on each of the two objects must produce the
same integer result.

And the equals javadoc says:

* Note that it is generally necessary to override the {@code hashCode}
method whenever this method is overridden, so as to maintain the
general contract for the {@code hashCode} method, which states
that equal objects must have equal hash codes.

But in our case, object equality will pass while the hash codes will be
different (https://gist.github.com/clehene/9276434)
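
The practical effect on a diff can be sketched as follows. MismatchedCell is a hypothetical class, not HBase code; it assumes, in line with the claim above, that equals looks only at the coordinates while hashCode also covers the value bytes. The single String key standing in for row/family/qualifier is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class showing an equals/hashCode mismatch; NOT HBase code.
final class MismatchedCell {
    final String key;       // stands in for row/family/qualifier (assumption)
    final long timestamp;
    final byte[] value;

    MismatchedCell(String key, long timestamp, byte[] value) {
        this.key = key;
        this.timestamp = timestamp;
        this.value = value;
    }

    // Equality ignores the value...
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof MismatchedCell)) return false;
        MismatchedCell c = (MismatchedCell) o;
        return key.equals(c.key) && timestamp == c.timestamp;
    }

    // ...but the hash includes it, which violates the hashCode contract.
    @Override
    public int hashCode() {
        int h = 31 * key.hashCode() + Long.hashCode(timestamp);
        return 31 * h + Arrays.hashCode(value);
    }
}

class MismatchDemo {
    public static void main(String[] args) {
        MismatchedCell a = new MismatchedCell("row1/cf:q", 42L, new byte[] {1});
        MismatchedCell b = new MismatchedCell("row1/cf:q", 42L, new byte[] {2});

        List<MismatchedCell> list = new ArrayList<>();
        list.add(a);
        Set<MismatchedCell> set = new HashSet<>();
        set.add(a);

        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // false
        System.out.println(list.contains(b));             // true: uses equals only
        System.out.println(set.contains(b));              // false: hash is checked first
    }
}
```

So the same pair of cells is "present" in a list-based comparison and "absent" in a hash-based one, which is the kind of surprise a diff over collections can run into.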

It's obvious why the behavior is as it is in HBase, so rather than
nitpicking, I wonder whether this could be made obvious, as it may help
avoid some unexpected behaviors :)


On 2/27/14, 10:22 AM, "Stack" <stack@duboce.net> wrote:

>On Wed, Feb 26, 2014 at 8:31 PM, Matt Corgan <mcorgan@hotpads.com> wrote:
>> But maybe one of the committers could add a sentence to emphasize that
>> value is excluded.
>We should underline that data is not considered when comparing Cells
>(KeyValues).  Apart from the fact that it could make for some interesting
>performance issues, the system isn't plumbed for dealing with coordinates
>that differ in their value only.  (Rather, the mvcc/sequenceid is used for
>splitting Cells whose coordinates are otherwise the same).
>What was your expectation mighty Cosmin?  What you think HBase should do
>with values that differ in value only?
