cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CASSANDRA-9796) Give 8099's like treatment to partition keys
Date Tue, 14 Jul 2015 08:35:04 GMT
Sylvain Lebresne created CASSANDRA-9796:
-------------------------------------------

             Summary: Give 8099's like treatment to partition keys
                 Key: CASSANDRA-9796
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-9796
             Project: Cassandra
          Issue Type: Improvement
            Reporter: Sylvain Lebresne


Post-8099, we properly distinguish clustering columns at the engine level, which allows a
somewhat more efficient encoding: we don't write the size of values of fixed-width types,
and we can properly store null values (which will likely prove useful for CASSANDRA-6477, for
instance).

Partition keys, however, have had no such love: the storage engine still manipulates them like
a single blob and their encoding is not terribly efficient: we always store the size of every
value (even fixed-width ones), and for compound values we even store the size of the full
partition key even though it's redundant with the individual value sizes. The encoding also
doesn't allow nulls, which is inconvenient at least for CASSANDRA-6477.
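
To make the overhead concrete, here is a rough byte-count sketch in Java. It is a simplified model, not the actual serialization code: the class and method names are hypothetical, the 2-byte length prefix is an assumption, and any other per-value overhead (separators, headers for nulls) is ignored.

```java
public class KeyEncodingSizeSketch {
    // Simplified model of the current encoding as described above: a size for
    // every value, plus a size for the full key when the key is compound.
    static int sizeWithAllLengths(int[] valueLengths, int prefixBytes) {
        int total = valueLengths.length > 1 ? prefixBytes : 0;
        for (int length : valueLengths)
            total += prefixBytes + length;
        return total;
    }

    // Simplified model of a Clustering-like encoding: no full-key size, and
    // no size written for values of fixed-width types.
    static int sizeSkippingFixedWidth(int[] valueLengths, boolean[] fixedWidth, int prefixBytes) {
        int total = 0;
        for (int i = 0; i < valueLengths.length; i++)
            total += (fixedWidth[i] ? 0 : prefixBytes) + valueLengths[i];
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical key (int, bigint, 5-byte text), assuming 2-byte length prefixes.
        int[] lengths = {4, 8, 5};
        boolean[] fixed = {true, true, false};
        System.out.println(sizeWithAllLengths(lengths, 2));            // prints 25
        System.out.println(sizeSkippingFixedWidth(lengths, fixed, 2)); // prints 19
    }
}
```

In this model, the six bytes saved are the full-key size plus the two sizes of the fixed-width values.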

So I'd like to improve on this by:
# making the {{DecoratedKey}} API (which I'd personally rename to {{PartitionKey}}) expose
the fact that we can have more than one value, typically by adding {{size()}} and {{get\(i\)}}
methods like for {{Clustering}}. In particular, this would simplify a couple of places in the code
where we still manually decompose such values.
# improving their encoding. An easy/consistent solution for that would be to reuse the same encoding
as for {{Clustering}} (they are the same kind of beast), though I'm open to other options.
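
As an illustration of the first point, the API shape could look roughly like the Java sketch below. Only {{size()}} and {{get\(i\)}} come from the proposal; the enclosing class, the {{of}} factory and the choice of {{ByteBuffer}} for values are assumptions made for the example, not the actual Cassandra types.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;

public class PartitionKeySketch {
    /** Hypothetical multi-value view of a partition key, mirroring Clustering. */
    interface PartitionKey {
        int size();            // number of values in the key
        ByteBuffer get(int i); // i-th value; null if the value is null
    }

    // Hypothetical factory for the example. Arrays.asList is used because,
    // unlike List.of, it accepts null elements.
    static PartitionKey of(ByteBuffer... values) {
        List<ByteBuffer> copy = Arrays.asList(values.clone());
        return new PartitionKey() {
            @Override public int size() { return copy.size(); }
            @Override public ByteBuffer get(int i) {
                ByteBuffer value = copy.get(i);
                // duplicate() so callers cannot move our buffer's position
                return value == null ? null : value.duplicate();
            }
        };
    }
}
```

With such an interface, code that needs an individual value could call {{get\(i\)}} instead of manually splitting a composite blob.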

One small subtlety to be aware of is that whatever we do to the internal encoding/implementation,
we must make sure we still compute the same tokens.  But that's not particularly hard either.
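
One way to picture the token constraint is the Java sketch below: whatever the internal representation, the token keeps being computed over the same bytes as before. Everything here is an assumption for illustration: the class and method names are hypothetical, the legacy layout is a simplified model (a single value used as-is; each compound value prefixed by a 2-byte length and followed by one zero byte), and MD5 stands in for whatever hash the configured partitioner uses.

```java
import java.io.ByteArrayOutputStream;
import java.math.BigInteger;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

public class TokenStabilitySketch {
    // Rebuild the assumed legacy single-blob form from the individual values.
    static byte[] legacyBytes(List<byte[]> values) {
        if (values.size() == 1)
            return values.get(0).clone();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] value : values) {
            out.write((value.length >>> 8) & 0xFF); // 2-byte big-endian length
            out.write(value.length & 0xFF);
            out.write(value, 0, value.length);
            out.write(0);                           // trailing zero byte
        }
        return out.toByteArray();
    }

    // Stand-in token function (MD5 here); a real partitioner may differ.
    static BigInteger token(byte[] keyBytes) {
        try {
            return new BigInteger(1, MessageDigest.getInstance("MD5").digest(keyBytes));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Token for a multi-value key: always hash the legacy form, so tokens
    // do not depend on how the key is stored internally.
    static BigInteger token(List<byte[]> values) {
        return token(legacyBytes(values));
    }
}
```

The design point is only that the hashed input is decoupled from the on-disk encoding, so changing the encoding does not move data between token ranges.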




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
