lucene-java-user mailing list archives

From Steven Schlansker <ste...@likeness.com>
Subject Re: BytesRef equals() method
Date Tue, 21 Jan 2014 18:54:19 GMT

On Jan 21, 2014, at 7:32 AM, Yann-Erwan Perio <ye.perio@gmail.com> wrote:

> Hello,
> 
> I have been working a bit with BytesRef recently, and I wonder whether
> the content of the equals() method, and more specifically the content
> of the bytesEquals(BytesRef other) method, is the intended one.
> 
> I was made aware of this because I used a Map<BytesRef, ...> in the
> collector, and the map would sometimes give inconsistent results.
> Checking out the source code, the hashcode() method looks valid to me,
> but the bytesEquals() method looks strange - because prior to
> comparing the real value of the BytesRef, it checks their lengths -
> and AIUI these may differ, even though BytesRef are logically equal.

How can two byte arrays be equal if they have different lengths?
Just as two Strings with differing lengths can never be equal, two
byte arrays with different lengths can never be equal either.
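
To make the length check concrete, here is a minimal sketch in plain Java. It
assumes a BytesRef-like view described by a backing array, an offset, and a
logical length. The class and method names are hypothetical and this is not
Lucene's code. The point it illustrates: the backing arrays may differ in
size, but the logical lengths must match before the contents are compared.

```java
final class SliceEquality {
    // Hypothetical stand-in for comparing two (bytes, offset, length) views.
    static boolean sliceEquals(byte[] a, int aOff, int aLen,
                               byte[] b, int bOff, int bLen) {
        if (aLen != bLen) {
            return false; // different logical lengths can never be equal
        }
        for (int i = 0; i < aLen; i++) {
            if (a[aOff + i] != b[bOff + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] padded = {9, 1, 2, 3, 9}; // larger backing array
        byte[] exact = {1, 2, 3};

        // Same logical content, different backing array sizes.
        System.out.println(sliceEquals(padded, 1, 3, exact, 0, 3)); // true
        // Different logical lengths.
        System.out.println(sliceEquals(padded, 1, 2, exact, 0, 3)); // false
    }
}
```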

> 
> I am not familiar at all with the internals of Lucene (this includes
> the BytesRef mechanics), so I may be completely wrong here. FWIW, I
> solved my problem by creating fresh BytesRef from the ones sent by the
> similarity, using the copyBytes method.

copyBytes doesn’t change the length of the BytesRef, so two unequal BytesRef
instances cannot become equal solely through a copyBytes call, by my reading?
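
As a rough illustration of what I mean, the sketch below copies a slice into a
fresh array using only the JDK. The helper name copySlice is hypothetical and
merely stands in for a copy operation; it is not the Lucene method. The copy
has the same logical length and contents as the source, and it is independent
of later changes to the source array.

```java
import java.util.Arrays;

final class CopyKeepsLength {
    // Hypothetical helper: copies `length` bytes starting at `offset`
    // into a new array of exactly that length.
    static byte[] copySlice(byte[] bytes, int offset, int length) {
        return Arrays.copyOfRange(bytes, offset, offset + length);
    }

    public static void main(String[] args) {
        byte[] source = {9, 1, 2, 3, 9};
        byte[] copy = copySlice(source, 1, 3);

        System.out.println(copy.length);                              // 3
        System.out.println(Arrays.equals(copy, new byte[]{1, 2, 3})); // true

        // Overwriting the source afterwards does not affect the copy.
        source[1] = 42;
        System.out.println(Arrays.equals(copy, new byte[]{1, 2, 3})); // true
    }
}
```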

> I could also have used the
> string representation of the BytesRef, but this appears to be slower
> than copying the bytes, by a magnitude of about 2.5.

Not all bytes are valid representations of Strings, so don’t do this unless
you are very sure you are dealing with character data and know the encoding.

It’s also not surprising that this is slower, given that creating a String
not only involves copying all the bytes but also decoding them into characters.
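
A small JDK-only sketch of the first point, assuming UTF-8 as the encoding
(the class and method names are made up for the example). Bytes that are not
valid UTF-8 are replaced during decoding, so the original bytes cannot be
recovered, and two different byte arrays can end up as the same String.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class BytesVsString {
    // Decodes as UTF-8; malformed input is replaced with U+FFFD.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // True only if decoding and re-encoding gives back the same bytes.
    static boolean roundTrips(byte[] bytes) {
        byte[] again = decode(bytes).getBytes(StandardCharsets.UTF_8);
        return Arrays.equals(bytes, again);
    }

    public static void main(String[] args) {
        byte[] text = "abc".getBytes(StandardCharsets.UTF_8);
        byte[] binaryA = {(byte) 0xFF}; // never valid in UTF-8
        byte[] binaryB = {(byte) 0xFE}; // never valid in UTF-8

        System.out.println(roundTrips(text));    // true
        System.out.println(roundTrips(binaryA)); // false

        // Distinct byte arrays, identical String after decoding.
        System.out.println(decode(binaryA).equals(decode(binaryB))); // true
    }
}
```

So a String would only be a safe map key if the bytes are known to be valid
text in a known encoding.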


What differently-sized byte arrays would you expect to compare as equal?

Best,
Steven

