impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-1878: Support INSERT and LOAD DATA on S3 and between filesystems
Date Wed, 06 Apr 2016 00:18:53 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-1878: Support INSERT and LOAD DATA on S3 and between filesystems
......................................................................


Patch Set 10:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/2574/10/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 839: There is no directory structure in S3, so "inheriting" permissions is not
        :           // possible.
> S3 has per-object permissions (in the form of ACLs see http://docs.aws.amaz
That is possible, but since there is no directory structure, we don't have anywhere to inherit
permissions from.

An alternative way of doing it would be to save the partition's default permissions in a hidden
file or something similar (just basically someplace to hold the state), and every time we
insert, we "inherit" permissions from this file.
Or maybe just stat an existing object and copy over those permissions (this could have other
complications though).

I'll leave this as a TODO for now and address it in a following patch.


Line 875: stringstream ss;
        :           ss << "Error(s) deleting partition directories. First error (of
"
        :              << partition_create_ops.errors().size() << ") was: " <<
err.second;
> While you are here, could you clean up he use of stringstream in favour of 
Done


http://gerrit.cloudera.org:8080/#/c/2574/10/be/src/util/hdfs-bulk-ops.cc
File be/src/util/hdfs-bulk-ops.cc:

Line 111: .c_str()
> not really your change, but I think you can get rid of .c_str() in all the 
Done


Line 128:     string error_msg = connection_status.ok() ? GetStrErrMsg() :
> put the line break here, and put the whole ternary statement on the next li
Done


Line 156:   if (connection_cache != NULL) connection_cache_ = connection_cache;
> You should do this even if connection_cache is NULL - the caller might be t
Done


http://gerrit.cloudera.org:8080/#/c/2574/10/be/src/util/hdfs-bulk-ops.h
File be/src/util/hdfs-bulk-ops.h:

Line 94:   /// Constructs a new operation set.
> comment is now redundant, can be removed.
Done


Line 109:   /// If abort_on_error is true, execution will finish after the first error seen.
> update comment to explain connection_cache
Done


Line 121:     return connection_cache_;
> one line
Done


Line 130:   Promise<bool> promise_;
> Can you replace this with a CountingBarrier? I think that will be more natu
I think the reason a promise was used here instead of a counting barrier was because a CountingBarrier
cannot return an error. And we need the promise to notify us of an error if it does happen.

I can modify CountingBarrier to maintain its current behavior and optionally return an error
if necessary. Should I do that instead?


http://gerrit.cloudera.org:8080/#/c/2574/10/be/src/util/hdfs-util.cc
File be/src/util/hdfs-util.cc:

Line 77: // TODO: 's3a' is the smallest scheme that our filesystem supports, so we compare
the
       :   // first 6 characters. Change if we support a filesystem with a smaller scheme.
> Comment is out-of-date
Done


Line 85:   // HDFS path.
> this isn't a valid assumption after Anuj's change.
Once Anuj pushes his change, I'll rebase and submit the final version. (Or he can rebase if
I get this patch in first)


http://gerrit.cloudera.org:8080/#/c/2574/10/be/src/util/hdfs-util.h
File be/src/util/hdfs-util.h:

Line 45: IsDfsPath
> I think this should be IsHdfsPath(...), because other filesystems technical
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94e15ad67752dce21c9b7c1dced6e114905a942d
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tbobrovytsky@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message