cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-10216) Remove target type from internal index metadata
Date Mon, 14 Sep 2015 12:31:46 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743445#comment-14743445 ]

Sylvain Lebresne commented on CASSANDRA-10216:
----------------------------------------------

bq. it seemed cleaner & possibly more future-proof to be explicit about the target type
in all cases

Thing is, I'm not really sure the target type is all that future-proof in the first place
(or really, the whole {{IndexTarget}} class). We'll want to support functional indexes shortly,
and there is no reason we won't want to support things like {{CREATE INDEX ON t(concat(trim(firstname),
trim(lastname)))}}, which will require serious updates to {{IndexTarget}}, and there is no reason
to think {{IndexTarget.Type}} will still exist (at least in its current form: once we support
functions more generally, it will hopefully be possible to handle the collection variants
as just normal function calls). Anyway, that's why I don't want to focus on
matching the current implementation in particular (it will change, it always does), but rather
on the externally visible part, which is what the "target" value is. Hence why I want to match
the user statement.

bq.  this is also what happens in the code, a statement like {{CREATE INDEX ON ks.t1(col)}}
produces an IndexTarget with type {{VALUES}}

Well, since you mention it, I would have a slight preference for actually using another "type"
for that ({{REGULAR}}, {{NONE}}, {{SIMPLE}}, whatever). But assuming we go with my previous
point, it's really just a minor implementation detail so I don't care too much.

bq. the plurality of "values" doesn't seem particularly weird on a non-collection column to
me, the index is on the values of that column, of which there is only 1 per-row but still
many per-index

Hehe. You still have to use a slightly different definition to make the term match both collections
and regular columns, or at least stick to a somewhat vaguer definition, which to me has a
bad smell to it. I didn't mean that as a very important point though.

bq. How do you want to handle indexes on collection values?

The fact that "values" is the default for collections is a historical accident. We didn't
really foresee at the time we'd have different parts to index, but in hindsight, the default
should be to index the whole collection (instead, we had to introduce the ugly but necessary
{{full}} thing). Anyway, we're not gonna change this now, but my preference would be to add
support for {{CREATE INDEX ON t(values(myCollection))}} and have the "default" be considered
as a shortcut for that. At that point, we'd obviously use "values" for collection values,
but that would still be entirely consistent.
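
To illustrate the shortcut idea, here is a minimal Python sketch (not Cassandra code). The helper name {{canonical_target}}, its parameters, and the set of accepted functions are assumptions made for the example. It maps what the user wrote in the statement to the string that would be stored as the target if a bare collection column were treated as shorthand for {{values(myCollection)}}.

```python
from typing import Optional

# Hypothetical helper, not part of Cassandra. Assumed set of collection
# "functions" a user may write in CREATE INDEX.
COLLECTION_FUNCTIONS = {"keys", "values", "entries", "full"}


def canonical_target(column: str, is_collection: bool,
                     function: Optional[str] = None) -> str:
    """Return the target string under the 'values() is the default' proposal."""
    if function is not None:
        if function not in COLLECTION_FUNCTIONS:
            raise ValueError(f"unknown collection function: {function}")
        # The user was explicit, so the target matches the statement as written.
        return f"{function}({column})"
    if is_collection:
        # A bare collection column is treated as a shortcut for values(column).
        return f"values({column})"
    # Regular column: the target is just the column name.
    return column
```

The sketch ignores frozen collections and identifier quoting; it only shows that the default and the explicit form end up as the same target.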

In any case, the main property I want to get here is that one can rebuild the {{CREATE INDEX}}
statement from the schema table without requiring any parsing of {{target}}, so my opinion
is that either:
# we're fine with my proposal above of adding support for "values(myCollection)" as equivalent
to just "myCollection", and then we (always) use "values(myCollection)" in the target.
# or there is strong opposition to that proposal for some reason, and then I'd rather omit
the "values" from the target (since having it wouldn't then be valid CQL syntax).

My own preference is obviously 1.
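
As a rough illustration of the property described above, the following Python sketch rebuilds a statement by splicing the stored target in verbatim. The function name and its arguments are hypothetical and do not reflect the actual layout of {{system_schema.indexes}}; custom index classes, options, and identifier quoting are left out.

```python
def rebuild_create_index(keyspace: str, table: str,
                         index_name: str, target: str) -> str:
    """Rebuild a CREATE INDEX statement from schema values (hypothetical names).

    The target is inserted as-is and never parsed, which only works if the
    stored value is itself valid CQL, e.g. "values(myCollection)".
    """
    return f"CREATE INDEX {index_name} ON {keyspace}.{table} ({target});"


# Example with made-up schema values.
statement = rebuild_create_index("ks", "t1", "t1_idx", "values(myCollection)")
```

With option 1 the stored target is always valid CQL, so plain concatenation like this is enough. With option 2 the same concatenation still works because the "values" wrapper is simply omitted.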

> Remove target type from internal index metadata
> -----------------------------------------------
>
>                 Key: CASSANDRA-10216
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10216
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Sam Tunnicliffe
>            Assignee: Sam Tunnicliffe
>              Labels: client-impacting
>             Fix For: 3.0.0 rc1
>
>
> As part of CASSANDRA-6716 & in anticipation of CASSANDRA-10124, a distinction was
introduced between secondary indexes which target a fixed set of 1 or more columns in the
base data, and those which are agnostic to the structure of the underlying rows. This distinction
is manifested in {{IndexMetadata.targetType}} and {{system_schema.indexes}}, in the {{target_type}}
column. It could be argued that this distinction complicates the codebase without providing
any tangible benefit, given that the target type is not actually used anywhere.
> It's only the impact on {{system_schema.indexes}} that puts this on the critical
path for 3.0; any code changes are just implementation details.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
