impala-reviews mailing list archives

From "Matthew Jacobs (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4787: Optimize APPX_MEDIAN() memory usage
Date Tue, 21 Feb 2017 22:50:18 GMT
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage

Patch Set 6:


I'm still not convinced the new paths are necessary, e.g. in the case I mentioned previously
where you need to merge and the inputs are of different sizes.
File be/src/exprs/

PS6, Line 918: const static int INIT_CAPACITY = 16;
             : const static int MAX_NUM_SAMPLES = NUM_BUCKETS * NUM_SAMPLES_PER_BUCKET;
Please add a comment. It's not clear why all of these are grouped together anymore. The first
two are only relevant to histograms. These two are about capacity for anything using ResSampling.
MAX_NUM_SAMPLES should probably also be named as a capacity now, e.g. MAX_CAPACITY.
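
A rough sketch of the kind of grouping and commenting being asked for, not the actual patch. The values of NUM_BUCKETS and NUM_SAMPLES_PER_BUCKET are placeholders (they are not shown in this message), and the comment wording is only a guess at the intent:

```cpp
#include <cassert>

// Histogram-only parameters.
// NOTE: both values are hypothetical placeholders for illustration.
const static int NUM_BUCKETS = 100;
const static int NUM_SAMPLES_PER_BUCKET = 200;

// Capacity limits for anything that uses reservoir sampling.
// INIT_CAPACITY: number of sample slots allocated up front.
const static int INIT_CAPACITY = 16;
// MAX_CAPACITY: upper bound on sample slots (named MAX_NUM_SAMPLES in PS6).
const static int MAX_CAPACITY = NUM_BUCKETS * NUM_SAMPLES_PER_BUCKET;

// The initial allocation must never exceed the upper bound.
static_assert(INIT_CAPACITY <= MAX_CAPACITY,
              "initial capacity must not exceed max capacity");
```

Splitting the constants into two commented groups makes it visible which ones are tied to histograms and which ones bound the sample buffer.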

PS6, Line 958: ReservoirSampleState
After thinking more about the casing comments, I've come to the conclusion that trying to
make this look like a std::vector isn't even the best interface.

I'd prefer if the methods were named non-std-vector-like names, e.g. 


PS6, Line 1285:   nth_element(src->begin(), mid_point, src->end(), SampleValLess<T>);
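
For context, a minimal sketch of what the quoted line does: std::nth_element places the middle element where it would be in sorted order without sorting the whole range. The Sample struct and the ApproxMedian helper are hypothetical stand-ins (the real types are not shown in this message), and SampleValLess is assumed to compare by value only:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical stand-in for the sample type used by the reservoir.
template <typename T>
struct Sample {
  T val;       // sampled value
  double key;  // random key used for reservoir selection (unused here)
};

// Orders samples by value; assumed to mirror the role of SampleValLess<T> in PS6.
template <typename T>
bool SampleValLess(const Sample<T>& a, const Sample<T>& b) {
  return a.val < b.val;
}

// Returns the value at index size/2 of the sorted order.
// Takes the vector by value, so the caller's samples are not reordered.
template <typename T>
T ApproxMedian(std::vector<Sample<T>> samples) {
  assert(!samples.empty());
  auto mid_point = samples.begin() + samples.size() / 2;
  // Partial reorder: everything before mid_point compares <= *mid_point,
  // everything after compares >= *mid_point. Average cost is linear.
  std::nth_element(samples.begin(), mid_point, samples.end(), SampleValLess<T>);
  return mid_point->val;
}
```

Because only the element at mid_point is guaranteed to be in its final position, this is cheaper than a full sort when only the median is needed.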


Gerrit-MessageType: comment
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <>
Gerrit-Reviewer: Alex Behm <>
Gerrit-Reviewer: Jim Apple <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-Reviewer: Taras Bobrovytsky <>
Gerrit-HasComments: Yes
