impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-2523: Make HdfsTableSink aware of clustered input
Date Mon, 31 Oct 2016 16:33:48 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2523: Make HdfsTableSink aware of clustered input
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4863/5/be/src/exec/hdfs-table-sink.cc
File be/src/exec/hdfs-table-sink.cc:

Line 277:       RETURN_IF_ERROR(FinalizePartitionFile(state, output_partition));
Can avoid a level of nesting with if (!new_file) break;


Line 294:     GetHashTblKey(dynamic_partition_key_expr_ctxs_, &key);
Oh man, I didn't notice this 'current_row_' thing earlier. It looks like for some reason the
original author of the code didn't want to pass an extra argument into functions so instead
did it implicitly via 'current_row_'

Can you refactor it so that it's passed as an actual argument?


Line 302:         RETURN_IF_ERROR(WriteRowsOfPartition(state, batch, current_partition_pair));
I think we should also remove the entry in partition_keys_to_output_partitions_ here so that
we use O(1) memory.


http://gerrit.cloudera.org:8080/#/c/4863/4/tests/query_test/test_insert_behaviour.py
File tests/query_test/test_insert_behaviour.py:

Line 476: 
> Done. It takes 30 seconds on my machine. Too much for core?
Maybe a little much given core already takes too long - I think it's useful coverage but we're
also probably pretty unlikely to break it. Seems fine for exhaustive.


-- 
To view, visit http://gerrit.cloudera.org:8080/4863
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibeda0bdabbfe44c8ac95bf7c982a75649e1b82d0
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message