impala-reviews mailing list archives

From "David Knupp (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4365: Enabling end-to-end tests on a remote cluster
Date Mon, 07 Nov 2016 22:50:24 GMT
David Knupp has posted comments on this change.

Change subject: IMPALA-4365: Enabling end-to-end tests on a remote cluster
......................................................................


Patch Set 13:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4769/13//COMMIT_MSG
Commit Message:

Line 51: still problems to work out with the remote data load script itself.
> Did you try loading data on a remote cluster and running tests on in with t
Yes, many times. I should update this sentence to be more clear.

This is mainly a reference to the several "clean up" changes that Harrison suggested earlier,
for which JIRAs have been opened. We can address those when there's time. More pressing than
cleaning up all those things is the fact that we need to have this checked in, in order to
validate Impala running against a remote CDH 5.10 cluster, and time is getting short. We have
less than two weeks now.

There were some other actual problems that were mysterious to me initially. E.g., Kudu-related
failures started appearing once recent Kudu changes were submitted -- until I realized that
this issue was breaking things: 

https://jira.cloudera.com/browse/OPSAPS-37322

But after tweaking the cluster, data loading works, and tests run -- though many tests may
need to be tweaked to work remotely.


http://gerrit.cloudera.org:8080/#/c/4769/13/bin/remote_data_load.py
File bin/remote_data_load.py:

Line 534:         sys.exit(1)
> In general, I think it's a bad practice to call sys.exit inside functions. 
OK, I'll move this. I'd seen this pattern used in other scripts here (e.g., load-data.py,
which we use for local data loading), so I didn't know it was a frowned-upon practice.
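
For illustration only, here is a minimal sketch of what moving the exit call out of a
function could look like. It is not the code in remote_data_load.py: the names
DataLoadError, load_data and main are hypothetical, and the failure condition is a
stand-in. The idea is that the function raises an exception, and a single top-level
entry point turns that into an exit status.

```python
import sys


class DataLoadError(Exception):
    """Hypothetical error type raised when a load step fails."""


def load_data(succeed):
    # Hypothetical stand-in for a load step. It raises on failure instead of
    # calling sys.exit(1), so callers (and tests) can decide how to react.
    if not succeed:
        raise DataLoadError("data load failed")
    return "loaded"


def main(succeed=True):
    # The only place that maps failures to an exit status.
    try:
        load_data(succeed)
    except DataLoadError as e:
        sys.stderr.write("ERROR: %s\n" % e)
        return 1
    return 0

# At the bottom of a script, the entry point would then be:
#   if __name__ == "__main__":
#       sys.exit(main())
```

With this shape, load_data can be imported and exercised from a test without terminating
the interpreter, and the exit code is decided in one place.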


http://gerrit.cloudera.org:8080/#/c/4769/13/testdata/datasets/functional/schema_constraints.csv
File testdata/datasets/functional/schema_constraints.csv:

PS13, Line 120: Wide tables fail due to the SERDEPROPERTIES limits
> is this a new issue? Is it specific to remote data loading?
For our mini-cluster, we work around this problem here:

https://github.com/apache/incubator-impala/blob/master/bin/create-test-configuration.sh#L99

However, create-test-configuration.sh is part of our local build process. It doesn't get called
when CDH is deployed to a remote cluster. Besides, that script assumes that the metastore
database will always be postgres, which is not the case when testing against a remote cluster.

Before the change to this file, I had been using another hand-rolled script to configure the
property separately after deployment. With this, I can drop that step.

I've also tested the local data load after this change, and it's unaffected.


-- 
To view, visit http://gerrit.cloudera.org:8080/4769
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1f443a1728a1d28168090c6f54e82dec2cb073e9
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <hs7@hotmail.com>
Gerrit-Reviewer: Martin Grund <grundprinzip@gmail.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-HasComments: Yes
