phoenix-dev mailing list archives

From "James Taylor (JIRA)" <>
Subject [jira] [Updated] (PHOENIX-2067) Sort order incorrect for variable length DESC columns
Date Tue, 14 Jul 2015 05:10:04 GMT


James Taylor updated PHOENIX-2067:
    Attachment: PHOENIX-2067_array_addendum_v2.patch

[~Dumindux] and [~ram_krish] - would you guys mind reviewing this patch? It ensures that
descending, variable-length arrays sort correctly. The change is to use a 255 byte as the
separator for non-null values (including the terminators). See the couple of new tests in

Most of the type changes are just formatting and moving a couple of duplicated methods to the
base type, where they belong. The other changes ensure we keep the same separator: for example,
if a table has not been upgraded, we need to keep using the 0 byte separator.
That's where most of the complication comes in.
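
To illustrate the separator choice described above, here is a minimal Python sketch. It is not the patch itself: choose_separator, invert, and desc_key are hypothetical helper names, and the sketch assumes that a descending value is stored with each byte inverted and that values never contain a 0 byte. It shows why a 0 byte terminator misorders a value that is a prefix of another, and why a 255 byte terminator does not.

```python
SEPARATOR_ZERO = b"\x00"  # separator kept for tables that have not been upgraded
SEPARATOR_DESC = b"\xff"  # separator for non-null, descending, variable-length values


def choose_separator(row_key_order_optimized: bool, descending: bool, is_null: bool) -> bytes:
    """Pick the terminator byte for one variable-length value (hypothetical helper)."""
    if row_key_order_optimized and descending and not is_null:
        return SEPARATOR_DESC
    return SEPARATOR_ZERO


def invert(value: bytes) -> bytes:
    """Invert every byte so that ascending byte order yields descending value order."""
    return bytes(0xFF - b for b in value)


def desc_key(value: bytes, row_key_order_optimized: bool) -> bytes:
    """Build the stored form of a non-null descending value (assumed layout)."""
    sep = choose_separator(row_key_order_optimized, descending=True, is_null=False)
    return invert(value) + sep


# b"ab" is a prefix of b"abc"; descending order should put b"abc" first.
values = [b"ab", b"abc"]
old_order = sorted(values, key=lambda v: desc_key(v, row_key_order_optimized=False))
new_order = sorted(values, key=lambda v: desc_key(v, row_key_order_optimized=True))

print(old_order)  # [b'ab', b'abc'] -> wrong for DESC
print(new_order)  # [b'abc', b'ab'] -> correct for DESC
```

With the 0 byte, the shorter key ends in the smallest possible byte and so sorts ahead of the longer key; with the 255 byte it ends in the largest possible byte and sorts after it.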

Much appreciated. [~Dumindux] - if you have time, perhaps you could write a couple of lower-level
unit tests to confirm that my isRowKeyOrderOptimized works in all situations and that array_cat,
prepend, and append maintain the same separator byte that the array is already
> Sort order incorrect for variable length DESC columns
> -----------------------------------------------------
>                 Key: PHOENIX-2067
>                 URL:
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.4.0
>         Environment: HBase 0.98.6-cdh5.3.0
> jdk1.7.0_67 x64
> CentOS release 6.4 (2.6.32-358.el6.x86_64)
>            Reporter: Mykola Komarnytskyy
>            Assignee: James Taylor
>         Attachments: PHOENIX-2067_array_addendum.patch, PHOENIX-2067_array_addendum_v2.patch,
>                      PHOENIX-2067_v1.patch, PHOENIX-2067_v2.patch, PHOENIX-2067_v3.patch
> Steps to reproduce:
> 1. Create a table: 
> CREATE TABLE mytable (id BIGINT not null PRIMARY KEY, timestamp BIGINT, log_message varchar)
> 2. Create two indexes:
> CREATE INDEX mytable_index_search ON mytable(timestamp,id) INCLUDE (log_message) SALT_BUCKETS=16;
> CREATE INDEX mytable_index_search_desc ON mytable(timestamp DESC,id DESC) INCLUDE (log_message)
> 3. Upsert values:
> UPSERT INTO mytable VALUES(1, 1434983826018, 'message1');
> UPSERT INTO mytable VALUES(2, 1434983826100, 'message2');
> UPSERT INTO mytable VALUES(3, 1434983826101, 'message3');
> UPSERT INTO mytable VALUES(4, 1434983826202, 'message4');
> 4. Sort DESC by timestamp:
> select timestamp,id,log_message from mytable ORDER BY timestamp DESC;
> Failure: the data is sorted incorrectly. When two longs differ only in their last two
> digits (e.g. 1434983826155, 1434983826100) and one of them ends with '00', they are
> returned in the wrong order.
> Sorting result:
> 1434983826202
> 1434983826100
> 1434983826101
> 1434983826018
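
The reported order can be reproduced with a small Python sketch. It rests on assumptions the report does not state: toy_encode is a hypothetical stand-in for a variable-length numeric encoding that stores base-100 digit pairs and drops trailing zero pairs, and a descending key is taken to be the inverted bytes followed by a separator. It is not Phoenix's actual on-disk format, and it is only meaningful for numbers with the same digit count, as in the report.

```python
def toy_encode(n: int) -> bytes:
    """Hypothetical variable-length encoding: base-100 digit pairs, trailing zero pairs dropped."""
    digits = str(n)
    if len(digits) % 2:
        digits = "0" + digits  # pad to an even number of decimal digits
    pairs = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    while pairs and pairs[-1] == 0:
        pairs.pop()  # a value ending in '00' becomes one byte shorter
    return bytes(p + 1 for p in pairs)  # +1 keeps data bytes away from 0


def desc_index_key(n: int, separator: bytes) -> bytes:
    """Assumed DESC layout: inverted data bytes followed by a separator byte."""
    return bytes(0xFF - b for b in toy_encode(n)) + separator


timestamps = [1434983826018, 1434983826100, 1434983826101, 1434983826202]

with_zero = sorted(timestamps, key=lambda n: desc_index_key(n, b"\x00"))
with_ff = sorted(timestamps, key=lambda n: desc_index_key(n, b"\xff"))

print(with_zero)  # [1434983826202, 1434983826100, 1434983826101, 1434983826018]
print(with_ff)    # [1434983826202, 1434983826101, 1434983826100, 1434983826018]
```

Under these assumptions the 0 byte separator yields the same misordering as the sorting result above, while a 255 byte separator yields the expected descending order.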

This message was sent by Atlassian JIRA