cassandra-commits mailing list archives

From "Milan Votava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-8675) COPY TO/FROM broken for newline characters
Date Fri, 09 Feb 2018 08:00:06 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358054#comment-16358054 ]

Milan Votava commented on CASSANDRA-8675:
-----------------------------------------

I have edited my post from Dec 9 to reflect all the changes we have made to copyutil.py (unicode_controlchars_re).
With this "patch" we are able to export and import our tables as one would expect: the imported
data are identical to the original ('\n' is shown in purple in both the original and the imported data).
Please check that you patched your copyutil.py as described and that you are using the proper COPY options.

> COPY TO/FROM broken for newline characters
> ------------------------------------------
>
>                 Key: CASSANDRA-8675
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8675
>             Project: Cassandra
>          Issue Type: Bug
>         Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native protocol v3]
> Ubuntu 14.04 64-bit
>            Reporter: Lex Lythius
>            Priority: Major
>              Labels: cqlsh
>             Fix For: 2.1.3
>
>         Attachments: copytest.csv
>
>
> Exporting/importing does not preserve contents when texts containing newline (and possibly other) characters are involved:
> {code:sql}
> cqlsh:test> create table if not exists copytest (id int primary key, t text);
> cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline
>         ... character');
> cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " character');
> cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t character (typed backslash, t)');
> cqlsh:test> select * from copytest;
>  id | t
> ----+---------------------------------------------------------
>   1 |                           This has a newline\ncharacter
>   2 |                            This has a quote " character
>   3 | This has a fake tab \t character (entered slash-t text)
> (3 rows)
> cqlsh:test> copy copytest to '/tmp/copytest.csv';
> 3 rows exported in 0.034 seconds.
> cqlsh:test> copy copytest from '/tmp/copytest.csv';
> 3 rows imported in 0.005 seconds.
> cqlsh:test> select * from copytest;
>  id | t
> ----+-------------------------------------------------------
>   1 |                          This has a newlinencharacter
>   2 |                          This has a quote " character
>   3 | This has a fake tab \t character (typed backslash, t)
> (3 rows)
> {code}
> I tried replacing \n in the CSV file with \\n, which just expands to \n in the table; and with an actual newline character, which fails with an error since it prematurely terminates the record.
> It seems backslashes are only used to take the following character as a literal.
> Until this is fixed, what would be the best way to refactor an old table with a new, incompatible structure while maintaining its content and name, since we can't rename tables?
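
The behaviour described in the quoted report can be reproduced outside cqlsh with a small Python sketch. It is not cqlsh's actual code: `unescape_literal` and `csv_round_trip` are hypothetical helpers written only for illustration. The first assumes the rule stated in the report (a backslash only makes the following character a literal), and the second shows, for contrast, that a quoted CSV field written and read with Python's standard `csv` module keeps an embedded newline intact.

```python
import csv
import io


def unescape_literal(field, escape="\\"):
    """Hypothetical reader rule: drop each escape char, keep the next char as-is."""
    out = []
    chars = iter(field)
    for ch in chars:
        if ch == escape:
            # The following character is taken literally; a trailing escape is dropped.
            out.append(next(chars, ""))
        else:
            out.append(ch)
    return "".join(out)


def csv_round_trip(rows):
    """Write rows with every field quoted, then read them back."""
    buf = io.StringIO()
    csv.writer(buf, quoting=csv.QUOTE_ALL).writerows(rows)
    buf.seek(0)
    return list(csv.reader(buf))


original = "This has a newline\ncharacter"

# Assumption: the export writes the newline as the two characters backslash + n.
exported = original.replace("\n", "\\n")

print(unescape_literal(exported))   # This has a newlinencharacter
print(csv_round_trip([["1", original]]) == [["1", original]])   # True
```

Under the same assumed rule, a doubled backslash followed by `n` in the file comes back as a backslash followed by `n`, which matches the report's observation that `\\n` "just expands to `\n`".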



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

