cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-11646) SSTableWriter output discrepancy
Date Tue, 26 Apr 2016 10:26:12 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15257864#comment-15257864
] 

Stefania commented on CASSANDRA-11646:
--------------------------------------

The size difference was due to delta time encoding; the byte buffers generated by either {{TypeCodec}}
or {{TypeSerializer}} are indeed identical. However, {{TypeCodec}} was taking twice as long
to serialize values because of the time spent converting a column spec into a {{TypeCodec}}. I've
cached the type codecs, and now the time and output size are the same as for {{TypeSerializer}}.
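
To illustrate the caching idea only, here is a minimal sketch that is not taken from the patch. It assumes the per-value cost comes from resolving a codec for a column type on every call, and it memoizes that lookup in a map. The names {{CodecCacheSketch}}, {{Codec}}, {{expensiveLookup}} and {{codecFor}} are hypothetical stand-ins, not Cassandra or driver APIs, and column types are represented as plain strings for simplicity.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: memoize the column-type -> codec lookup so it runs once per type.
class CodecCacheSketch {
    /** Hypothetical stand-in for a codec that serializes values of one column type. */
    interface Codec {
        byte[] serialize(Object value);
    }

    /** Counts how often the (assumed expensive) lookup actually runs. */
    static final AtomicInteger lookups = new AtomicInteger();

    /** One cached codec per column type; safe for concurrent callers. */
    private static final Map<String, Codec> CACHE = new ConcurrentHashMap<>();

    /** Stand-in for the slow conversion from a column spec to a codec. */
    static Codec expensiveLookup(String columnType) {
        lookups.incrementAndGet();
        return value -> String.valueOf(value).getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the cached codec, performing the lookup only on the first request per type. */
    static Codec codecFor(String columnType) {
        return CACHE.computeIfAbsent(columnType, CodecCacheSketch::expensiveLookup);
    }

    public static void main(String[] args) {
        String[] columnTypes = {"int", "text", "int"}; // k, v1, v2 from the reported schema
        Object[] row = {1, "test1", 24};
        long bytesWritten = 0;
        for (int i = 0; i < 1000; i++) {
            for (int c = 0; c < row.length; c++) {
                bytesWritten += codecFor(columnTypes[c]).serialize(row[c]).length;
            }
        }
        System.out.println("lookups performed: " + lookups.get());
        System.out.println("bytes serialized: " + bytesWritten);
    }
}
```

With a cache like this, the number of lookups depends on the number of distinct column types rather than on the number of values written, which is the effect described above.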

||trunk||
|[patch|https://github.com/stef1927/cassandra/commits/11646]|
|[testall|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11646-testall/]|
|[dtest|http://cassci.datastax.com/view/Dev/view/stef1927/job/stef1927-11646-dtest/]|

CI pending. 

Note: we should probably remove the unit test before committing, since it takes about 25 seconds
to run. I've left it there for reviewing or further debugging.


> SSTableWriter output discrepancy
> --------------------------------
>
>                 Key: CASSANDRA-11646
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11646
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: T Jake Luciani
>            Assignee: Stefania
>             Fix For: 3.6
>
>
> Since CASSANDRA-10624 there is a non-trivial difference in the size of the output in CQLSSTableWriter.
> I've written the following code:
> {code}
>         String KS = "cql_keyspace";
>         String TABLE = "table1";
>         File tempdir = Files.createTempDir();
>         File dataDir = new File(tempdir.getAbsolutePath() + File.separator + KS + File.separator + TABLE);
>         assert dataDir.mkdirs();
>         String schema = "CREATE TABLE cql_keyspace.table1 ("
>                         + "  k int PRIMARY KEY,"
>                         + "  v1 text,"
>                         + "  v2 int"
>                         + ");";// with compression = {};";
>         String insert = "INSERT INTO cql_keyspace.table1 (k, v1, v2) VALUES (?, ?, ?)";
>         CQLSSTableWriter writer = CQLSSTableWriter.builder()
>                                                   .sorted()
>                                                   .inDirectory(dataDir)
>                                                   .forTable(schema)
>                                                   .using(insert).build();
>         for (int i = 0; i < 10000000; i++)
>             writer.addRow(i, "test1", 24);
>         writer.close();
> {code}
> Pre CASSANDRA-10624 the data file is ~63MB. Post CASSANDRA-10624 it is ~69MB.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
