phoenix-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-4324) Immutable indexes are invalid with null fixed width indexed column values
Date Tue, 14 Nov 2017 18:22:00 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-4324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251880#comment-16251880
] 

James Taylor commented on PHOENIX-4324:
---------------------------------------

This has the potential to break functionality that uses the meta data in the SYSTEM.CATALOG
to interpret the row key. The fundamental issue is that the data type use by the IndexMaintainer
does not match the "source of truth" type in the system catalog. The IndexMaintainer should
translate the data type using IndexUtil.getIndexColumnDataType(). It's unclear why this is
happening for mutable but not immutable indexes. 

The reason we do this translation from fixed width type to variable width type is so that
we can handle null correctly. When a KeyValue column is indexed, it can be absent in the data
table row and thus have a null value for the column in the row key for the index row. Fixed
width types do not have a way to represent null, while variable length types do - any empty
byte represents null for all variable length types.

An example would be as follows:
{code}
CREATE IMMUTABLE TABLE T (K VARCHAR PRIMARY KEY, V1 SMALLINT);
CREATE INDEX IDX ON T(V1);
{code}
The expected row key structure of the index would be DECIMAL+VARCHAR, but instead the IndexMaintainer
ends up writing the data as SMALLINT+VARCHAR. Since DECIMAL is variable length, the index
row key can lead with a \0 byte for rows in the data table that don't have a V1 value. As
a SMALLINT, I'm not entirely sure what would be used - maybe 2 \0 bytes.

A simple query that uses the index would still work because we always coerce the index data
type to the data table type. In this case, due to this bug, the coerce would be a noop and
it appears to work. 

I suspect the following types of things would not function correctly:
- queries that use IS NULL will not always work, depending on the value of V1. I suspect negative
numbers would be problematic. Definitely a value of Short.MIN_VALUE would be a problem.
- queries that compute some expression with V1 instead of just selecting it would probably
not work. For example, SELECT V1 + 5 FROM T WHERE V1 > 100.
- query more functionality that uses the index may not function correctly (as I believe query
more relies on the data type from the index table row in system catalog).

> Immutable indexes are invalid with null fixed width indexed column values
> -------------------------------------------------------------------------
>
>                 Key: PHOENIX-4324
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-4324
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>
> The data conversion for fixed width column types is not being done for immutable indexes
leading to erroneous results when indexed columns are absent.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message