cassandra-commits mailing list archives

From "Stefania (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CASSANDRA-11030) utf-8 characters incorrectly displayed/inserted on cqlsh on Windows
Date Tue, 02 Feb 2016 03:11:39 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-11030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15127558#comment-15127558 ]

Stefania edited comment on CASSANDRA-11030 at 2/2/16 3:10 AM:
--------------------------------------------------------------

You are correct, it finally works. I think I initially inserted the data by copying and pasting
into a git bash terminal (launched via ConEmu), the only one where I could paste a Unicode character.
For this terminal, however, the default encoding was cp1252, since I only worked out today how to
change it to cp65001. So even if I inserted the data with --encoding=UTF-8, it would probably have
caused problems. From other terminals (command prompt, PowerShell) I could not paste the
character into cqlsh, and trying to insert something like u'\uXXXX' gave a syntax error.
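
To illustrate why a cp1252 terminal combined with --encoding=UTF-8 is likely to cause trouble, here is a minimal Python 3 sketch of the byte-level mismatch. It is not cqlsh code; the variable names are made up for the illustration, and it only assumes that the terminal delivers bytes in its own code page.

```python
# Illustration only: what happens to 'não' when the bytes produced by the
# terminal and the encoding used to decode them disagree.
text = "não"

# A terminal on code page 1252 would hand over cp1252 bytes.
cp1252_bytes = text.encode("cp1252")   # 'ã' is the single byte 0xE3
# A terminal on code page 65001 would hand over UTF-8 bytes.
utf8_bytes = text.encode("utf-8")      # 'ã' is the two bytes 0xC3 0xA3

# cp1252 bytes decoded as UTF-8: 0xE3 is not valid here, so it is replaced.
mismatch = cp1252_bytes.decode("utf-8", errors="replace")

# UTF-8 bytes decoded as cp1252: no error, but each byte becomes its own character.
mojibake = utf8_bytes.decode("cp1252")

# Matching encodings round-trip cleanly.
round_trip = utf8_bytes.decode("utf-8")

print(repr(mismatch), repr(mojibake), repr(round_trip))
```

Which symptom actually shows up on screen (a question mark, a square, or garbled letters) depends on the direction of the mismatch and on the terminal font.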


The following works, however (unicode.cql is encoded in UTF-8):

{code}
chcp 65001
C:\Users\stefania\git\cstar\cassandra>type unicode.cql
INSERT INTO test.test (val) VALUES ('não');
C:\Users\stefania\git\cstar\cassandra>bin\cqlsh.bat --encoding=UTF-8 --file=unicode.cql
C:\Users\stefania\git\cstar\cassandra>bin\cqlsh.bat --encoding=UTF-8
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.2.5-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
Use HELP for help.
cqlsh> select * from test.test;

 val
-----
 não
{code}

The source command also works *provided the encoding specified via the command line is the
same as the file encoding*; otherwise we get a missing-character glyph (a square).
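
As a rough illustration of why the command-line encoding has to match the file encoding, the following Python 3 sketch reads a UTF-8 script once with the matching encoding and once with cp1252. The helper {{read_cql}} is hypothetical and is not how cqlsh's SOURCE command is implemented; it only shows the effect of the mismatch.

```python
import os
import tempfile


def read_cql(path, encoding):
    """Hypothetical helper: read a CQL script using the given encoding."""
    with open(path, "r", encoding=encoding) as handle:
        return handle.read()


statement = "INSERT INTO test.test (val) VALUES ('não');"

# Write the script as UTF-8 bytes, like unicode.cql in the session above.
with tempfile.NamedTemporaryFile("wb", suffix=".cql", delete=False) as tmp:
    tmp.write(statement.encode("utf-8"))
    script_path = tmp.name

try:
    matching = read_cql(script_path, "utf-8")     # same encoding as the file
    mismatched = read_cql(script_path, "cp1252")  # different encoding
finally:
    os.remove(script_path)

print(matching == statement)    # the statement survives intact
print(mismatched == statement)  # the non-ASCII character was corrupted
```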

Inserting the character directly from git bash also works now, but only because I changed its code
page to 65001; otherwise it causes the original problem.

You are probably right about changing the default encoding; I'm +1 on changing it to 'utf-8'
if you want. Also, shouldn't {{do_source}} use the same encoding as the file encoding? I think
we should also stress that whichever terminal people use on Windows, it should have
the same encoding as the one used by cqlsh.
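
One way such a terminal/cqlsh consistency check could look is sketched below in Python 3. This is only an assumption about a possible approach, not existing cqlsh behaviour: {{same_encoding}} and {{terminal_matches}} are hypothetical names, and a fake cp1252 stream stands in for a real Windows terminal.

```python
import codecs
import io
import sys


def same_encoding(first, second):
    """Return True if both encoding names resolve to the same Python codec."""
    try:
        return codecs.lookup(first).name == codecs.lookup(second).name
    except LookupError:
        # Unknown encoding name: treat it as a mismatch.
        return False


def terminal_matches(requested, stream=sys.stdout):
    """Hypothetical check: does the stream's encoding match the requested one?"""
    terminal = getattr(stream, "encoding", None)
    if terminal is None:
        return False
    return same_encoding(terminal, requested)


# Stand-in for a terminal whose code page is 1252.
fake_terminal = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")

if not terminal_matches("utf-8", fake_terminal):
    print("Warning: terminal encoding differs from the requested encoding")
```

Comparing the names returned by {{codecs.lookup}} rather than the raw strings means that spellings such as "UTF-8" and "utf8" are treated as the same encoding.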

We can commit this ticket as is and open a new ticket re. the default encoding, or change it here;
up to you.


> utf-8 characters incorrectly displayed/inserted on cqlsh on Windows
> -------------------------------------------------------------------
>
>                 Key: CASSANDRA-11030
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-11030
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Paulo Motta
>            Assignee: Paulo Motta
>            Priority: Minor
>              Labels: cqlsh, windows
>
> {noformat}
> C:\Users\Paulo\Repositories\cassandra [2.2-10948 +6 ~1 -0 !]> .\bin\cqlsh.bat --encoding utf-8
> Connected to test at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 2.2.4-SNAPSHOT | CQL spec 3.3.1 | Native protocol v4]
> Use HELP for help.
> cqlsh> INSERT INTO bla.test (bla ) VALUES  ('não') ;
> cqlsh> select * from bla.test;
>  bla
> -----
>  n?o
> (1 rows)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
