impala-reviews mailing list archives

From "Alex Behm (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-6070: Parallel data load.
Date Thu, 19 Oct 2017 00:07:42 GMT
Alex Behm has posted comments on this change.

Change subject: IMPALA-6070: Parallel data load.

Patch Set 1:


Changes like these tend to be slow and painful to test, so I'm in favor of not parallelizing
additional things in this patch. Additional steps can be improved later.
Commit Message:
PS1, Line 33: 
What testing did you do? Does the data load still run on a non-beefy local machine?
File testdata/bin/
PS1, Line 75:   HADOOP_HEAPSIZE="1024" hive --service hiveserver2 > ${LOGDIR}/hive-server2.out 2>&1 &
> This looks like it will also increase HADOOP_HEAPSIZE when not doing a para
I'd prefer to keep this change. Our Hive server tends to OOM pretty easily when doing anything
non-trivial with Hive on our mini cluster.
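
A minimal sketch of what keeping the change amounts to, not the actual script: the larger heap applies to HiveServer2 whether or not the load runs in parallel. The names `hs2_heap_mb`, `start_hs2`, and `HS2_HEAP_MB` are hypothetical and introduced only for illustration; `HADOOP_HEAPSIZE`, `LOGDIR`, and the `hive --service hiveserver2` invocation are taken from the quoted line.

```shell
# Hypothetical helper: heap size (in MB) to give HiveServer2.
# Defaults to 1024, the value in the quoted line; HS2_HEAP_MB is an
# assumed override and is not part of the patch.
hs2_heap_mb() {
  echo "${HS2_HEAP_MB:-1024}"
}

# Hypothetical wrapper around the quoted line. The heap setting is applied
# unconditionally, so serial data loads get the larger heap as well.
start_hs2() {
  HADOOP_HEAPSIZE="$(hs2_heap_mb)" hive --service hiveserver2 \
    > "${LOGDIR}/hive-server2.out" 2>&1 &
}
```

Routing the value through one helper would keep the default in a single place if a later patch wanted a different heap size for parallel loads.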


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I836c4e1586f229621c102c4f4ba22ce7224ab9ac
Gerrit-Change-Number: 8320
Gerrit-PatchSet: 1
Gerrit-Owner: Philip Zeyliger <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: Joe McDonnell <>
Gerrit-Reviewer: Philip Zeyliger <>
Gerrit-Reviewer: Zach Amsden <>
Gerrit-Comment-Date: Thu, 19 Oct 2017 00:07:42 +0000
Gerrit-HasComments: Yes
