impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Alex Behm (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4047: DO NOT SUBMIT Highlight all occurrences of CDH/cdh
Date Wed, 31 Aug 2016 22:26:36 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-4047: DO NOT SUBMIT Highlight all occurrences of CDH/cdh
......................................................................


Patch Set 1:

(15 comments)

Had a pass to get an understanding of the various issues. Most changes look ok.

For those issues that we cannot resolve today (e.g., cdh dependencies, cdh-specific scripts
for starting up cluster), I think we should file JIRAs to track them, but not fix them.

http://gerrit.cloudera.org:8080/#/c/4187/1/.gitignore
File .gitignore:

Line 44: xxx-*-hdfs-data/
any idea what this is used for? remove?


http://gerrit.cloudera.org:8080/#/c/4187/1/be/src/exec/hbase-table-scanner.cc
File be/src/exec/hbase-table-scanner.cc:

Line 170
> We can replace this with "Assume an older HBase version"
I think we can rephrase with the upstream version 0.95.2


http://gerrit.cloudera.org:8080/#/c/4187/1/be/src/exec/parquet-metadata-utils.h
File be/src/exec/parquet-metadata-utils.h:

Line 64
> We could actually use xxx here, or 1.2-foo5 or similar
Imo this information is useful because we will actually find such a string in data files.


http://gerrit.cloudera.org:8080/#/c/4187/1/be/src/exec/parquet-version-test.cc
File be/src/exec/parquet-version-test.cc:

Line 49
> all these could be kept since they are arbitrary strings here, however we c
Imo leaving the cdh here is fine because these are versions strings in actual Parquet files.


http://gerrit.cloudera.org:8080/#/c/4187/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 3453
> Do we want to keep this?
I'm generally in favor of filing the upstream JIRA. Yes, the link is somewhat broken but we
can add a comment to the CDH JIRA that mentions the upstream JIRA, so it's at least kind of
searchable.

Before you go ahead and do this, let's collect a list of these issues to see how many there
are and of what nature.


http://gerrit.cloudera.org:8080/#/c/4187/1/be/src/scheduling/simple-scheduler.h
File be/src/scheduling/simple-scheduler.h:

Line 59
> Should be replace with the version of Impala that will allow a break of com
How about removing "CDH6" and replacing it with (incompatible change)


PS1, Line 359: 
> example hostname, can be kept.
sounds good


http://gerrit.cloudera.org:8080/#/c/4187/1/be/src/util/debug-util-test.cc
File be/src/util/debug-util-test.cc:

Line 64
> I don't know what to do here. It's a valid test for a feature we support. H
Afaik it should be fine to leave this. Having tests for CM integration seems valid to me.
Maybe it's better to mention the CM version instead of CDH version though.


http://gerrit.cloudera.org:8080/#/c/4187/1/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

> For this script I don't know what the plan forward is. Will it be reworked?
Imo we should leave it for the time being. We are still depending on CDH versions and not
upstream versions of our dependencies. Once we change that, we should reconsider changing
these params.


http://gerrit.cloudera.org:8080/#/c/4187/1/bin/build_thirdparty.sh
File bin/build_thirdparty.sh:

Line 46
I think we can remove this entire file from asf/master since there is no thirdparty to build
anymore.

Can you confirm with Michael Ho?


http://gerrit.cloudera.org:8080/#/c/4187/1/bin/dump_breakpad_symbols.py
File bin/dump_breakpad_symbols.py:

PS1, Line 41: 
> These are example filenames and the files are publicly available. I would k
Agreed.


http://gerrit.cloudera.org:8080/#/c/4187/1/bin/load-data.py
File bin/load-data.py:

PS1, Line 88: 
> Another issue without an upstream one. What to do?
Collect them all into a spreadsheet so we can get an overview and decide on a resolution policy.


http://gerrit.cloudera.org:8080/#/c/4187/1/bin/start-impala-cluster.py
File bin/start-impala-cluster.py:

Line 220:   relative_fs_cfg_path = 'xxx%s/node-%d/etc/hadoop/conf/fair-scheduler.xml' %\
This is kind of annoying. Maybe we should rework our scripts to not even have this directory.
It stems from a time where we were in between cdh4 and cdh5.


http://gerrit.cloudera.org:8080/#/c/4187/1/ext-data-source/api/pom.xml
File ext-data-source/api/pom.xml:

Line 38
these will need to stay for the time being


http://gerrit.cloudera.org:8080/#/c/4187/1/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

Line 266
> looks like a reference to an external dependency of sorts, I'd keep it
agree


-- 
To view, visit http://gerrit.cloudera.org:8080/4187
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Icb37e2ef0cd9fa0e581d359c5dd3db7812b7b2c8
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message