cassandra-commits mailing list archives

From "Alex Petrov (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (CASSANDRA-10857) Allow dropping COMPACT STORAGE flag from tables in 3.X
Date Fri, 17 Jun 2016 07:52:05 GMT

     [ https://issues.apache.org/jira/browse/CASSANDRA-10857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alex Petrov updated CASSANDRA-10857:
------------------------------------
    Status: Patch Available  (was: Open)

I've prepared a patch to drop the {{COMPACT STORAGE}} flag. As discussed with [~slebresne],
this part is just a naïve implementation that leaves all the columns intact and only drops
the flag. However, we should think the upgrade / migration paths through.

I went with the simplest way: a property called {{force_non_compact}} that can only be set
to {{true}}, and only on compact tables, which makes the operation irreversible (it would
be great to hear comments on whether that's desired or not; we might allow reversing it if
no incompatible changes have been made). The property changes the flags ({{dense}}, {{super}}
and {{compound}}) directly. For safety, we could add a {{non_compact_forced}} flag that would
take part in the validation process, so that the flags could be changed only under these
circumstances. The initial version of the patch, rather than modifying the flags as it does
now, just added {{force_non_compact}} as a table attribute, which was only used to check
whether the table is compact or not, leaving the flags intact. Do we want a special syntax
for that, such as {{WITHOUT COMPACT STORAGE}} or similar?
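
To make the intended semantics concrete, here is a minimal Python model of the property check
described above. It is only a sketch, not the patch itself: {{TableFlags}} and
{{apply_force_non_compact}} are hypothetical names, and the definition of "compact" used here
(dense, or super, or not compound) is an assumption about how the three flags combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableFlags:
    """Hypothetical stand-in for the three table flags named above."""
    dense: bool
    is_super: bool
    compound: bool

    @property
    def is_compact(self) -> bool:
        # Assumption: a table is non-compact only when it is
        # compound and neither dense nor super.
        return self.dense or self.is_super or not self.compound


def apply_force_non_compact(flags: TableFlags, value: bool) -> TableFlags:
    """Model of setting force_non_compact; returns the rewritten flags."""
    if value is not True:
        raise ValueError("force_non_compact can only be set to true")
    if not flags.is_compact:
        raise ValueError("force_non_compact is only valid on compact tables")
    # The flags are rewritten directly, so no trace of the old layout remains.
    return TableFlags(dense=False, is_super=False, compound=True)
```

Because the result is no longer compact, the property cannot be applied a second time, and
nothing in this model records which flags were set before. That is what makes the operation
one-way unless something like {{non_compact_forced}} is stored alongside.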

Currently, the most pressing issues I've noticed are:

  * (1) When the table is created without value columns {{CREATE TABLE %s (pkey ascii, ckey
ascii, PRIMARY KEY (pkey, ckey)) WITH COMPACT STORAGE}}, the {{value}} column has {{EmptyType}},
which might not be very useful.
  * (2) When the {{comparatorType}} is set to anything but a string type, returned column names
will be converted to their byte representation in {{AbstractType#toString}}, which, depending
on the datatype, can be unintuitive to read. The default {{BytesType}} would use a {{bytesToHex}}
representation, so the end result would look something like {{row(key='key1', column1=null,
574141414154=100, value=null)}}
  * (3) Generally, any unset type would default to {{BytesType}}, whether the table was created
from thrift or via CQL with {{COMPACT STORAGE}} (values of {{value}} and {{column%n}} columns
would also be bytes by default).
  * (4) If the key was composite, it would expand to multiple clustering columns.
  * (5) With {{super}} column families, we'll get the name of the supercolumn as an empty
string.
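
To illustrate (2), here is a small Python sketch of what a hex rendering does to column names.
It is not Cassandra code: {{bytes_to_hex}} and {{hex_to_bytes}} are hypothetical helpers that
only mimic a plain hex encoding. Decoding the name from the example row suggests it was
originally the ASCII string {{WAAAAT}}.

```python
def bytes_to_hex(raw: bytes) -> str:
    """Render raw column-name bytes as lowercase hex (illustrative helper)."""
    return raw.hex()


def hex_to_bytes(text: str) -> bytes:
    """Inverse of bytes_to_hex: recover the raw bytes from a hex string."""
    return bytes.fromhex(text)


# The column name from the example row in (2).
hex_name = "574141414154"
raw_name = hex_to_bytes(hex_name)
print(raw_name.decode("ascii"))  # prints WAAAAT

# A 4-byte big-endian integer used as a column name is even less readable.
int_name = (100).to_bytes(4, "big")
print(bytes_to_hex(int_name))  # prints 00000064
```

A client that knows the original comparator can reverse the encoding as above, but nothing in
the hex string itself says which type to decode it as.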

So we may want to provide a way to migrate the data. At least (5) could be addressed quite
simply by renaming the column. (2) is harder, as data access patterns might have been very
different. For the rest, we may document the possible pitfalls.

There might be some more things that I have missed.

|[dtest patch|https://github.com/ifesdjeen/cassandra-dtest/tree/10857-trunk] |[trunk|https://github.com/ifesdjeen/cassandra/tree/10857-trunk]
|[utest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-10857-trunk-testall/]|[dtest|https://cassci.datastax.com/view/Dev/view/ifesdjeen/job/ifesdjeen-10857-trunk-dtest/]|

> Allow dropping COMPACT STORAGE flag from tables in 3.X
> ------------------------------------------------------
>
>                 Key: CASSANDRA-10857
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-10857
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: CQL, Distributed Metadata
>            Reporter: Aleksey Yeschenko
>            Assignee: Alex Petrov
>             Fix For: 3.x
>
>
> Thrift allows users to define flexible mixed column families - where certain columns
would have explicitly pre-defined names, potentially non-default validation types, and be
indexed.
> Example:
> {code}
> create column family foo
>     and default_validation_class = UTF8Type
>     and column_metadata = [
>         {column_name: bar, validation_class: Int32Type, index_type: KEYS},
>         {column_name: baz, validation_class: UUIDType, index_type: KEYS}
>     ];
> {code}
> Columns named {{bar}} and {{baz}} will be validated as {{Int32Type}} and {{UUIDType}},
respectively, and be indexed. Columns with any other name will be validated by {{UTF8Type}}
and will not be indexed.
> With CASSANDRA-8099, {{bar}} and {{baz}} would be mapped to static columns internally.
However, being {{WITH COMPACT STORAGE}}, the table will only expose {{bar}} and {{baz}} columns.
Accessing any dynamic columns (any column not named {{bar}} or {{baz}}) right now requires
going through Thrift.
> This is blocking Thrift -> CQL migration for users who have mixed dynamic/static column
families. That said, it *shouldn't* be hard to allow users to drop the {{compact}} flag to
expose the table as it is internally now, and be able to access all columns.
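
For reference, the validation rule in the quoted description (named columns get their own
validator, everything else falls back to the default) can be sketched in Python as below.
The validators are simplified stand-ins that only check lengths and encodings, not the real
{{Int32Type}}, {{UUIDType}} or {{UTF8Type}} classes, and all function names are hypothetical.

```python
def is_int32(raw: bytes) -> bool:
    """Simplified stand-in for Int32Type: exactly four bytes."""
    return len(raw) == 4


def is_uuid(raw: bytes) -> bool:
    """Simplified stand-in for UUIDType: exactly sixteen bytes."""
    return len(raw) == 16


def is_utf8(raw: bytes) -> bool:
    """Simplified stand-in for UTF8Type: bytes must decode as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Mirrors column_metadata from the quoted example: bar and baz are predefined.
COLUMN_METADATA = {"bar": is_int32, "baz": is_uuid}


def validate(column_name: str, raw: bytes) -> bool:
    """Use the per-column validator if defined, else the default (UTF8)."""
    validator = COLUMN_METADATA.get(column_name, is_utf8)
    return validator(raw)
```

In this model {{bar}} and {{baz}} play the role of the static columns, while any other name is
a dynamic column checked by the default validator.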



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
