cassandra-commits mailing list archives

From "Sylvain Lebresne (JIRA)" <>
Subject [jira] [Commented] (CASSANDRA-8959) More efficient frozen UDT and tuple serialization format
Date Thu, 12 Mar 2015 11:08:38 GMT


Sylvain Lebresne commented on CASSANDRA-8959:

bq. Shouldn't this just be a replication of whatever strategy we choose for encoding tables?

Not entirely sure I understand what "this" refers to exactly. But if you mean that the more
efficient format we use should be as close as possible to whatever we do in the sstable
format, then I agree (and, as it happens, what Aleksey describes is pretty close to what
CASSANDRA-8099 currently does).

> More efficient frozen UDT and tuple serialization format
> --------------------------------------------------------
>                 Key: CASSANDRA-8959
>                 URL:
>             Project: Cassandra
>          Issue Type: Improvement
>            Reporter: Aleksey Yeschenko
>              Labels: performance
>             Fix For: 3.1
> The current serialization format for UDTs has a fixed overhead of 4 bytes per defined
> field (encoding the size of the field).
> It is inefficient for sparse UDTs - ones with many defined fields, but few of them present.
> We could keep a bitset to indicate the missing fields, if any.
> It's sub-optimal for encoding UDTs with all the values present as well. We could use
> varint encoding for the field sizes of blob/text fields and encode 'fixed' sized types directly,
> without the 4-bytes size prologue.
> That or something more brilliant. Any improvement right now is lhf.
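
For illustration, the ideas in the description above (a bitset marking missing fields,
varint sizes for blob/text fields, and no size prefix for fixed-size types) can be sketched
roughly as follows. This is a minimal sketch, not the format Cassandra uses or adopted: the
function names, the LEB128-style varint, the bit order in the bitset, and the -1 marker for
a missing field in the baseline encoder are all assumptions made for the example. Unlike
the "if any" in the description, the sketch always writes the bitset.

```python
import struct


def encode_current(values):
    """Baseline: a 4-byte big-endian size per defined field.

    Assumption for this sketch: a missing field is written as size -1.
    """
    out = bytearray()
    for v in values:
        if v is None:
            out += struct.pack(">i", -1)
        else:
            out += struct.pack(">i", len(v)) + v
    return bytes(out)


def encode_varint(n):
    """Unsigned LEB128-style varint (the ticket does not name a specific varint)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_compact(values, fixed_widths):
    """Hypothetical compact layout: missing-field bitset, then present fields only.

    fixed_widths[i] is the byte width of a fixed-size type, or None for
    variable-length types (blob/text), which get a varint size prefix.
    """
    bitset = bytearray((len(values) + 7) // 8)
    body = bytearray()
    for i, (v, width) in enumerate(zip(values, fixed_widths)):
        if v is None:
            bitset[i // 8] |= 1 << (i % 8)  # bit set means "field missing"
            continue
        if width is None:
            body += encode_varint(len(v))
        elif len(v) != width:
            raise ValueError(f"field {i}: expected {width} bytes, got {len(v)}")
        body += v
    return bytes(bitset) + bytes(body)


# A sparse UDT: 10 defined fields, only an int and a text value present.
values = [struct.pack(">i", 42)] + [None] * 8 + [b"abc"]
widths = [4] + [None] * 9

current_size = len(encode_current(values))           # 10 * 4 + 4 + 3 = 47 bytes
compact_size = len(encode_compact(values, widths))   # 2 + 4 + 1 + 3 = 10 bytes
```

In this example the baseline spends 40 of its 47 bytes on size prefixes, while the compact
layout spends 3 bytes on overhead (a 2-byte bitset plus a 1-byte varint).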

This message was sent by Atlassian JIRA
