impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthew Jacobs (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5799: Kudu DML can crash if schema has changed
Date Wed, 16 Aug 2017 20:42:02 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5799: Kudu DML can crash if schema has changed
......................................................................


Patch Set 1:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7688/1//COMMIT_MSG
Commit Message:

PS1, Line 14: its
it's


PS1, Line 18: DML
DML is really a SQL concept, technically this should be 'WriteOp'


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS1, Line 144: output_expr_evals_.size()
We can't assume this is true. E.g. see the comment in Send().  We actually don't necessarily
have the full Table descriptor in the BE (sorry I didn't think of that when we first started
talking about this JIRA), so we could either (a) ship that from the FE to the BE or (b) just
check the cols that are referenced. I think the latter makes sense, and that you can do something
like this:

      // output_expr_evals_ only contains the columns that the op
      // applies to, i.e. columns explicitly mentioned in the query, and
      // referenced_columns is then used to map to actual column positions.
      int col = kudu_table_sink_.referenced_columns.empty() ?
          j : kudu_table_sink_.referenced_columns[j];


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

PS1, Line 37: DataType = 
why can't you just have
using kudu::client::...::DataType; as we do above?


http://gerrit.cloudera.org:8080/#/c/7688/1/be/src/exec/kudu-util.h
File be/src/exec/kudu-util.h:

Line 81: ColumnType KuduDataTypeToColumnType(kudu::client::KuduColumnSchema::DataType type);
comment


http://gerrit.cloudera.org:8080/#/c/7688/1/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

Line 399:   def test_concurrent_schema_change(self, cursor, unique_database):
is this test reproducing a crash before your change?


Line 410:         client.execute("insert into %s values (0, 0), (1, 1)" % table_name)
I was thinking we could do these operations in a loop to get more opportunities to hit issues.


PS1, Line 416:       try:
             :         client = self.create_impala_client()
             :         client.execute("alter table %s drop column col1" % table_name)
             :         client.execute("alter table %s add columns (col1 string)" % table_name)
             :       except Exception as e:
             :         threading.current_thread().error = e
this you can do on the main thread while the insert thread is inserting


PS1, Line 432: 'Kudu error(s) reported'
this is the error in the case we sent a write op to Kudu that was w/ the old schema? Is there
any more of the error message we can match? This looks very generic.


-- 
To view, visit http://gerrit.cloudera.org:8080/7688
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9fd6bf164310df0041144f75f5ee722665e9f587
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message