impala-dev mailing list archives

From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-1878: Support INSERT and LOAD DATA on S3 and between filesystems
Date Tue, 26 Apr 2016 20:18:12 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-1878: Support INSERT and LOAD DATA on S3 and between filesystems
......................................................................


Patch Set 20:

(7 comments)

I've added placeholder comments marking the changes I made for the IMPALA-3245 fix, so that
they're easy to review.

http://gerrit.cloudera.org:8080/#/c/2574/20/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 299:   output_partition->block_size = block_size;
Fix for IMPALA-3245.
Previously PARQUET_FILE_SIZE was ignored because GetFileBlockSize() always returned 32MB (the
default 'block' size in S3) for S3 files.

We now store the block size in the output partition so that GetFileBlockSize() returns the
correct size, which is either the value set by the user or the default from core-site.xml.
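
For reviewers, here is a rough, self-contained sketch of the idea and not the actual patch.
The OutputPartition struct, InitPartition(), ChooseBlockSize() and the constant below are
hypothetical stand-ins made up for illustration; only the block_size field and the
GetFileBlockSize() name come from the change. It assumes a user value of 0 means "not set".

```cpp
#include <cstdint>

// Hypothetical stand-in for the sink's per-partition state. Only the
// block_size field mirrors the field added in hdfs-table-sink.h.
struct OutputPartition {
  uint64_t block_size = 0;
};

// 32MB, the default 'block' size reported for S3 files (per the comment above).
constexpr uint64_t kS3DefaultBlockSize = 32ULL * 1024 * 1024;

// Assumption: a user value of 0 means "not set", so fall back to the
// filesystem default (which would come from core-site.xml).
uint64_t ChooseBlockSize(uint64_t user_file_size, uint64_t fs_default) {
  return user_file_size > 0 ? user_file_size : fs_default;
}

// Hypothetical helper: record the chosen size when the partition is set up,
// analogous to 'output_partition->block_size = block_size;'.
void InitPartition(OutputPartition* output_partition, uint64_t user_file_size,
                   uint64_t fs_default) {
  output_partition->block_size = ChooseBlockSize(user_file_size, fs_default);
}

// Return the stored size instead of asking the filesystem, analogous to
// '*size = output_partition->block_size;'.
void GetFileBlockSize(const OutputPartition* output_partition, uint64_t* size) {
  *size = output_partition->block_size;
}
```

With this shape, a partition initialized with a user value of 256MB reports 256MB, and one
initialized with 0 reports the 32MB default, regardless of what the filesystem itself says.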


Line 652:     *size = output_partition->block_size;
Added as part of fix for IMPALA-3245.


http://gerrit.cloudera.org:8080/#/c/2574/20/be/src/exec/hdfs-table-sink.h
File be/src/exec/hdfs-table-sink.h:

Line 89:   uint64_t block_size;
Added as part of fix for IMPALA-3245.


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

Line 136:     sizes = self.filesystem_client.get_all_file_sizes(DIR)
Change made as part of fix for IMPALA-3245.


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/filesystem_base.py
File tests/util/filesystem_base.py:

Line 58:   def get_all_file_sizes(self, path):
Added for IMPALA-3245.


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/hdfs_util.py
File tests/util/hdfs_util.py:

Line 79:   def get_all_file_sizes(self, path):
Added for IMPALA-3245.


http://gerrit.cloudera.org:8080/#/c/2574/20/tests/util/s3_util.py
File tests/util/s3_util.py:

Line 72:   def get_all_file_sizes(self, path):
Added for IMPALA-3245.


-- 
To view, visit http://gerrit.cloudera.org:8080/2574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Gerrit-PatchSet: 20
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-HasComments: Yes