cassandra-commits mailing list archives

From "Christophe (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8675) COPY TO/FROM broken for newline characters
Date Tue, 11 Apr 2017 11:36:41 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15964228#comment-15964228 ]

Christophe commented on CASSANDRA-8675:
---------------------------------------

This COPY FROM issue should be considered a real bug.

If someone runs COPY TO followed by COPY FROM, it is reasonable to expect that the data loaded
exactly matches the data extracted. But because of this issue, that's not the case
when the original data contains strings with escaped characters.
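
To make the round-trip expectation concrete, here is a minimal sketch using Python's standard csv module. It is not cqlsh's COPY implementation; the sample rows are modeled on the copytest table from the issue, and the helper names export_csv and import_csv are hypothetical. It only shows what "loaded data matches extracted data" means for fields containing newlines, quotes, and backslashes.

```python
import csv
import io

# Sample rows modeled on the issue's copytest table (id, t).
rows = [
    (1, "This has a newline\ncharacter"),
    (2, 'This has a quote " character'),
    (3, "This has a fake tab \\t character (typed backslash, t)"),
]

def export_csv(records):
    """Write records to CSV text; fields with newlines or quotes get quoted."""
    buf = io.StringIO(newline="")  # newline="" avoids newline translation
    writer = csv.writer(buf, quoting=csv.QUOTE_MINIMAL)
    writer.writerows(records)
    return buf.getvalue()

def import_csv(text):
    """Read CSV text back into (int, str) tuples."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return [(int(row[0]), row[1]) for row in reader]

exported = export_csv(rows)
round_tripped = import_csv(exported)

# The property a COPY TO / COPY FROM pair is expected to satisfy.
assert round_tripped == rows
```

The sketch relies on quoting rather than backslash escapes to protect embedded newlines, which is one possible design and not necessarily the one cqlsh uses.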

> COPY TO/FROM broken for newline characters
> ------------------------------------------
>
>                 Key: CASSANDRA-8675
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
> Ubuntu 14.04 64-bit
>            Reporter: Lex Lythius
>              Labels: cqlsh
>             Fix For: 2.1.3
>
>         Attachments: copytest.csv
>
>
> Exporting/importing does not preserve contents when texts containing newline (and possibly other) characters are involved:
> {code:sql}
> cqlsh:test> create table if not exists copytest (id int primary key, t text);
> cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline
>         ... character');
> cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " character');
> cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t character (typed backslash, t)');
> cqlsh:test> select * from copytest;
>  id | t
> ----+---------------------------------------------------------
>   1 |                           This has a newline\ncharacter
>   2 |                            This has a quote " character
>   3 | This has a fake tab \t character (entered slash-t text)
> (3 rows)
> cqlsh:test> copy copytest to '/tmp/copytest.csv';
> 3 rows exported in 0.034 seconds.
> cqlsh:test> copy copytest from '/tmp/copytest.csv';
> 3 rows imported in 0.005 seconds.
> cqlsh:test> select * from copytest;
>  id | t
> ----+-------------------------------------------------------
>   1 |                          This has a newlinencharacter
>   2 |                          This has a quote " character
>   3 | This has a fake tab \t character (typed backslash, t)
> (3 rows)
> {code}
> I tried replacing \n in the CSV file with \\n, which just expands to \n in the table; and with an actual newline character, which fails with an error since it prematurely terminates the record.
> It seems backslashes are only used to take the following character as a literal.
> Until this is fixed, what would be the best way to refactor an old table with a new, incompatible structure while maintaining its content and name, since we can't rename tables?
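
The behavior described in the issue (a backslash only makes the following character literal) can be illustrated with a toy model. This is an assumption-based sketch, not cqlsh's actual parser, and unescape_literal is a hypothetical name. It reproduces both observations from the report: \n in the CSV file collapses to a plain n, and \\n comes back as a literal backslash followed by n.

```python
def unescape_literal(field):
    """Drop each backslash and keep the character after it unchanged."""
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == "\\":
            # Take the next character literally; a trailing backslash is kept.
            out.append(next(chars, "\\"))
        else:
            out.append(ch)
    return "".join(out)

# Backslash-n in the file loses its backslash instead of becoming a newline.
print(unescape_literal(r"This has a newline\ncharacter"))
# → This has a newlinencharacter

# A doubled backslash yields a single literal backslash followed by n.
print(unescape_literal(r"This has a newline\\ncharacter"))
# → This has a newline\ncharacter
```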



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
