impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Taras Bobrovytsky (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-2904: Support INSERT and LOAD DATA on S3 and between filesystems
Date Tue, 22 Mar 2016 02:24:32 GMT
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-2904: Support INSERT and LOAD DATA on S3 and between filesystems
......................................................................


Patch Set 6:

(19 comments)

http://gerrit.cloudera.org:8080/#/c/2574/6/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

Line 118:     if IS_S3:
Much shorter and clearer:
cls.filesystem_client = S3Client(S3_BUCKET_NAME) if IS_S3 else cls.hdfs_client


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/query_test/test_insert_behaviour.py
File tests/query_test/test_insert_behaviour.py:

Line 110:     ls = self.filesystem_client.ls(part_dir)
I don't think we need to define the ls variable here, just assert directly


Line 358:       does_exist = self.filesystem_client.exists(path)
how about:
assert self.filesystem_client.exists(path) == should_exist


Line 398:     assert new_ls == ls
assert ls == new_ls == 1


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/util/filesystem_base.py
File tests/util/filesystem_base.py:

Line 25: file_data
is file_data a string?


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/util/hdfs_util.py
File tests/util/hdfs_util.py:

Line 59:   def make_dir(self, path, permission=755):
Do we need to override this? Is the only purpose to set the default permission to 755? Is
this the default for the parent class?


Line 102:   def ls(self, path):
Can you add a more detailed comment? What is the difference between this method and list_dir?
Maybe describe the use case for this because it's not obvious to me what the purpose of this
method is.


http://gerrit.cloudera.org:8080/#/c/2574/6/tests/util/s3_util.py
File tests/util/s3_util.py:

Line 15: # S3 access utilities
This comment does not really add anything.


Line 22:   @classmethod
Add an empty line above this line.


Line 30:     if not overwrite:
if not override and self.exists(filename): return False


Line 33: response
What's the point of this variable, it just disappears? I think we should return true only
if the operation succeeded


Line 37: othav
add space


Line 57: =
path += '/'


Line 69: both
This should be called files_and_dirs


Line 70:     for d in dirs:
How about something like:
files_and_dirs.extend([d.split('/')[-2] for d in dirs])


Line 75:       key = tmp[-1]
how about simply:
key = f.split('/')[-1]

This avoids creating the unnecessary tmp variable.


Line 83:     return contents is not None
how about return response.get('Contents') is not None
to avoid creating an extra variable


Line 90:       keys = self.list_keys(path)
objects = [{'Key': k} for k in self.list_keys(path)] if recursive else path


Line 92:         objects.append({'Key':k})
Do we have to add path to the list as well in the recursive case?


-- 
To view, visit http://gerrit.cloudera.org:8080/2574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message