impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3610: Account for memory used by filters in the coordinator
Date Fri, 22 Jul 2016 06:47:17 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-3610: Account for memory used by filters in the coordinator
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/3618/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS2, Line 2110:           state->bloom_filter->Or(BloomFilter(params.bloom_filter));
              :           filter_mem_tracker_->Release(heap_space);
              :         }
              :         if (--state->pending_count > 0) return;
              :       }
              :     }
              : 
> Are you worried that this is an expensive loop? Otherwise I think it's bett
Yes, I was worried that it could be an expensive loop in certain cases. But if we don't see
it being a potential bottleneck, it's obviously better to leave accessing this shared state
under a lock.


http://gerrit.cloudera.org:8080/#/c/3618/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS3, Line 2107: state->bloom_filter.reset(new BloomFilter(params.bloom_filter));
Thinking about this more, why do we even need to do this for a broadcast join filter? We don't
modify the filter here at all, so why go through the trouble of trying to reserve memory and
create a BloomFilter object, when all we're going to do is convert it back to a thrift struct
and publish it?

Couldn't we just publish the same thrift structure that came in?
Sure, this will leave state->bloom_filter as NULL for that filter ID, but no one ever tries
to use it.


PS3, Line 2124: if (target.is_local) continue;
Is this a bug? What if it's a partitioned join filter and it's local to one producer but not
to another? This wouldn't send the filter to the intended recipients then.

Looking at the following code that sets 'is_local':
https://github.com/cloudera/Impala/blob/92fbe7bcd79d68e3ec12d767b2c4f079d5c1ab54/fe/src/main/java/com/cloudera/impala/planner/RuntimeFilterGenerator.java#L367-L374

For a partitioned join filter, if the last target in that loop has isLocal as 'true', then
this filter wouldn't be sent to any target.


-- 
To view, visit http://gerrit.cloudera.org:8080/3618
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3c52c8a1c2e79ef370c77bf264885fc859678d1b
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message