cassandra-commits mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Aleksey Yeschenko (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CASSANDRA-7018) cqlsh failing to handle utf8 decode errors in certain cases
Date Fri, 18 Apr 2014 22:44:16 GMT

    [ https://issues.apache.org/jira/browse/CASSANDRA-7018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13974587#comment-13974587
] 

Aleksey Yeschenko commented on CASSANDRA-7018:
----------------------------------------------

+1

> cqlsh failing to handle utf8 decode errors in certain cases
> -----------------------------------------------------------
>
>                 Key: CASSANDRA-7018
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7018
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Tools
>            Reporter: Colin B.
>            Assignee: Mikhail Stepura
>            Priority: Minor
>              Labels: cqlsh
>             Fix For: 1.2.17, 2.0.8, 2.1 beta2
>
>         Attachments: cassandra-1.2-7018.patch
>
>
> Under certain circumstances when a row to be returned in cqlsh contains a utf8 decoding
error, no results will be printed. It seems that certain types of where clauses cause the
decoding to fail.
> Preparation:
> {code}
> [cqlsh 3.1.8 | Cassandra 1.2.16 | CQL spec 3.0.0 | Thrift protocol 19.36.2]
> ...
> cqlsh:ks> CREATE TABLE test_utf8 ( a text PRIMARY KEY, b text );
> cqlsh:ks> INSERT INTO test_utf8 (a, b) VALUES (blobAsText(0x3031f6393130), '0110');
> cqlsh:ks> select * from test_utf8;
>  a           | b
> -------------+------
>  '01\xf6910' | 0110
> Failed to decode value '01\xf6910' (for column 'a') as text: 'utf8' codec can't decode
byte 0xf6 in position 2: invalid start byte
> cqlsh:ks>
> {code}
> Actual Results:
> {code}
> cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);
> value '01\xf6910' (in col 'a') can't be deserialized as text: 'utf8' codec can't decode
byte 0xf6 in position 2: invalid start byte
> cqlsh:ks>
> {code}
> Expected Results:
> {code}
> cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);
>  a           | b
> -------------+------
>  '01\xf6910' | 0110
> Failed to decode value '01\xf6910' (for column 'a') as text: 'utf8' codec can't decode
byte 0xf6 in position 2: invalid start byte
> cqlsh:ks>
> {code}
> Traceback with cqlsh --debug:
> {code}
> cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);
> Traceback (most recent call last):
>   File "bin/cqlsh", line 942, in onecmd
>     self.handle_statement(st, statementtext)
>   File "bin/cqlsh", line 982, in handle_statement
>     return self.handle_parse_error(cmdword, tokens, parsed, srcstr)
>   File "bin/cqlsh", line 991, in handle_parse_error
>     return self.perform_statement(cqlruleset.cql_extract_orig(tokens, srcstr))
>   File "bin/cqlsh", line 1031, in perform_statement
>     with_default_limit=with_default_limit)
>   File "bin/cqlsh", line 1059, in perform_statement_untraced
>     self.print_result(self.cursor, with_default_limit)
>   File "bin/cqlsh", line 1111, in print_result
>     self.print_static_result(cursor)
>   File "bin/cqlsh", line 1141, in print_static_result
>     formatted_values = [map(self.myformat_value, self.decode_row(cursor, row), cursor.column_types)
for row in cursor.result]
>   File "bin/cqlsh", line 590, in decode_row
>     values.append(cursor.decoder.decode_value(val, vtype, nameinfo[0]))
>   File "bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/decoders.py", line 54, in
decode_value
>     vtype.cql_parameterized_type())
>   File "bin/../lib/cql-internal-only-1.4.1.zip/cql-1.4.1/cql/decoders.py", line 33, in
value_decode_error
>     % (valuebytes, namebytes, expectedtype, err))
> ProgrammingError: value '01\xf6910' (in col 'a') can't be deserialized as text: 'utf8'
codec can't decode byte 0xf6 in position 2: invalid start byte
> cqlsh:ks>
> {code}
> Problematic statements:
> {code}
> cqlsh:ks> select * from test_utf8 where token(a) > 0;
> cqlsh:ks> select * from test_utf8 where a in (blobAsText(0x3031f6393130));
> cqlsh:ks> select * from test_utf8 where a = blobAsText(0x3031f6393130);
> {code}
> Statements that are ok:
> {code}
> cqlsh:ks> select * from test_utf8 where token(a) < token('qwer');
> cqlsh:ks> select * from test_utf8;
> {code}



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Mime
View raw message