hive-dev mailing list archives

From "Enis Soztutar (Commented) (JIRA)" <>
Subject [jira] [Commented] (HIVE-2903) Numeric binary type keys are not compared properly
Date Tue, 27 Mar 2012 01:01:36 GMT


Enis Soztutar commented on HIVE-2903:

Well, it is not a "bug" in HBase. HBase only provides int -> byte[] conversion as a convenience,
and it seems that Bytes.toBytes(int) and its sibling overloads only guarantee lexicographic
ordering for unsigned numbers. We can definitely add something like Bytes.toSignedBytes() in
HBase so that you can ensure signed numbers are sorted correctly in lexicographic order.
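
To make the idea concrete, here is a minimal sketch of one common way to get that ordering: flip the sign bit before writing the big-endian bytes. Everything here is hypothetical. The class name and `toSignedBytes`/`fromSignedBytes` are illustrative and are not existing HBase APIs, and the local `toBytes` only assumes the 4-byte big-endian layout that Bytes.toBytes(int) appears to produce.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not part of HBase.
final class SignedOrderSketch {

    // 4-byte big-endian two's-complement encoding (assumed to match Bytes.toBytes(int)).
    static byte[] toBytes(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Flip the sign bit so that negative values get a smaller leading byte than positive ones.
    static byte[] toSignedBytes(int v) {
        return toBytes(v ^ Integer.MIN_VALUE);
    }

    // Inverse of toSignedBytes: read the int and flip the sign bit back.
    static int fromSignedBytes(byte[] b) {
        return ByteBuffer.wrap(b).getInt() ^ Integer.MIN_VALUE;
    }

    // Unsigned byte-by-byte lexicographic comparison, the ordering applied to row keys.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Plain encoding: positive result, so -100 sorts after 100.
        System.out.println(compare(toBytes(-100), toBytes(100)));
        // Sign-flipped encoding: negative result, so -100 sorts before 100.
        System.out.println(compare(toSignedBytes(-100), toSignedBytes(100)));
    }
}
```

The flip is its own inverse, so decoding is the same XOR applied after reading the int. The same trick should carry over to long with Long.MIN_VALUE, though float and double would need a different transformation.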

Coming to Hive, I think Ashutosh is right that we have to keep supporting data already
serialized into HBase through Bytes.toBytes(). So I would suggest we add another storage
type, like "signedBinary", which would do the Hive-specific
signed byte conversion.

So, we would have: 
 - cf:col#string       : serialize as string
 - cf:col#binary       : serialize as binary, compatible with Bytes.toBytes() 
 - cf:col#signedBinary : serialize as signed binary. 
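
As a rough illustration of how the three suffixes above could select an encoding for an int value, consider the sketch below. It is not the HBase handler's actual code: the enum, its method names, and the choice to default to string when no suffix is given are all assumptions made for the example, and "signedBinary" is only the proposed name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical sketch; not taken from Hive's HBase handler.
enum StorageTypeSketch {
    STRING, BINARY, SIGNED_BINARY;

    // Pick the storage type from the suffix after '#', e.g. "cf:col#binary".
    // Falling back to STRING when there is no suffix is an assumption of this sketch.
    static StorageTypeSketch fromMapping(String mapping) {
        int hash = mapping.indexOf('#');
        if (hash < 0) {
            return STRING;
        }
        switch (mapping.substring(hash + 1).toLowerCase(Locale.ROOT)) {
            case "string":
                return STRING;
            case "binary":
                return BINARY;
            case "signedbinary":
                return SIGNED_BINARY;
            default:
                throw new IllegalArgumentException("unknown storage type in: " + mapping);
        }
    }

    // Encode an int according to the selected storage type.
    byte[] serialize(int v) {
        switch (this) {
            case STRING:
                // Decimal text, e.g. "-100".
                return Integer.toString(v).getBytes(StandardCharsets.UTF_8);
            case BINARY:
                // 4-byte big-endian, assumed compatible with Bytes.toBytes(int).
                return ByteBuffer.allocate(4).putInt(v).array();
            case SIGNED_BINARY:
                // Same layout with the sign bit flipped so negatives sort first.
                return ByteBuffer.allocate(4).putInt(v ^ Integer.MIN_VALUE).array();
            default:
                throw new AssertionError(this);
        }
    }
}
```

With this shape, existing tables mapped as #binary keep reading and writing the same bytes, and only columns explicitly mapped with the new suffix would get the order-preserving encoding.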

I would also note that people might be interested in custom ser/de from Hive types to
byte[], but I am not sure how feasible that would be to implement.

> Numeric binary type keys are not compared properly
> --------------------------------------------------
>                 Key: HIVE-2903
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>            Reporter: Navis
>            Assignee: Navis
>         Attachments: HIVE-2903.D2481.1.patch
> In the current binary format for numbers, negative values always compare greater than
> positive values, for example:
> {code}
> System.out.println(Bytes.compareTo(Bytes.toBytes(-100), Bytes.toBytes(100))); // 255
> {code}
