impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3610: Account for memory used by filters in the coordinator
Date Sat, 27 Aug 2016 00:48:43 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3610: Account for memory used by filters in the coordinator
......................................................................


Patch Set 2:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/4066/2//COMMIT_MSG
Commit Message:

Line 25: the output broadcast structure without copying.
> please add this explanation/invariant (which is very helpful for understand
Added explanation to the FilterState->bloom_filter member.


http://gerrit.cloudera.org:8080/#/c/4066/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 431:   filter_mem_tracker_.reset(new MemTracker(-1, -1, "Runtime Filter (Coordinator)",
> move into c'tor, this is setup that can be done prior to execution.
The query_mem_tracker() is set only by L425 based on whether there exists a runtime state
or not. So it needs to be done here.


Line 1965:   unordered_set<int32_t> target_fragment_instance_idxs;
> move to where it's used
If it's moved inside the curly braces, it will lose scope in L2051.


Line 1979:     // Check if the filter has already been sent, which could happen in three cases:
if
> turn into bullet points, easier to read
Done


Line 1995:     --state->pending_count;
> turn this block into a member fn for state (ApplyUpdate?), that'll make it 
Done


Line 2006:           // Discard as one missing update means a correct filter cannot be produced.
> discard,
Done


Line 2007:           state->disabled = true;
> move into discard
'disabled' is just used to keep track of the cases where a filter is not used due to OOM.
A filter will be discarded when it is complete, but that doesn't mean that it hit OOM.


Line 2013:             // TODO: Implement BloomFilter::Or(const ThriftBloomFilter&)
> if we already have problems with memory spikes, shouldn't this be addressed
Done


Line 2026:     for (const auto& target: state->targets) {
> remove auto unless it's an stl iterator
Done


Line 2035:       // move the payload from the request rather than copy it and take double
the memory
> provide that information for the data structure in the .h file
Done


Line 2040:       swap(rpc_params->bloom_filter, *filter);
> why not do this for FilterState in the block above, rather than just the rp
Done.
We cannot however, do a swap() for the first oncoming non-broadcast filter because the state->bloom_filter
will be NULL.


Line 2051:   for (const auto& target_idx: target_fragment_instance_idxs) {
> don't use auto here, this isn't an iterator
Done


Line 2054:     auto cb = [rpc_params, addr = fragment_inst->impalad_address(),
> bind was more compact; please revert
Done


http://gerrit.cloudera.org:8080/#/c/4066/2/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 195:   void Done();
> TearDown? done is very generic.
Done


Line 399:     std::unique_ptr<BloomFilter> bloom_filter;
> is this why you're including the filter header?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4066
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message