impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4678: move query MemTracker into QueryState
Date Thu, 12 Jan 2017 23:12:22 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4678: move query MemTracker into QueryState
......................................................................


Patch Set 4:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 56: #include "runtime/fragment-instance-state.h"
> there's no requirement or expectation that these are alphabetical
clang-format did this for me. The google style guide also says to do it:

https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes


Line 466:   query_state_.Reset(ExecEnv::GetInstance()->query_exec_mgr()->GetOrCreateQueryState(
> there's not need for 'get' semantics here
That's true - from this callsite it always goes down the "Create" code path. We need the generality
for other fragments though and it didn't seem worth having two separate implementations.


Line 485:     if (coord_instance_ == nullptr) {
> leave comment that this is a startup failure scenario
Done - didn't mean to remove that.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 43: #include "scheduling/query-schedule.h"
> did you move these?
clang-format sorts includes - I added query-state.h so this happened when I ran clang-format
on the diff. 

This is consistent with the Google style guide: https://google.github.io/styleguide/cppguide.html#Names_and_Order_of_Includes


Line 295:   QueryState::ScopedRef query_state_;
> i'm not sure this is a good idea. we don't want this to be managed by d'tor
Done


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

Line 193:   scoped_ptr<RuntimeState> runtime_state_;
> why?
We need to call the RuntimeState() constructor after setting up the ExecEnv, because it hooks
up the RuntimeState's MemTracker to things in the ExecEnv.

Previously this wasn't an issue because InitMemTrackers() was split out from the RuntimeState
constructor.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/mem-tracker.h
File be/src/runtime/mem-tracker.h:

Line 420:   /// Construct a MemTracker object for query 'id'. The query limits are determined
based
> this is not a lookup function, make it a static function in MemTracker itse
If I rewrite it as a static function, it would end up being a static function with MemTrackerRegistry*
as its first argument (because it needs to obtain the pool MemTracker from there).  Seemed
cleaner just to have it as a member function.

This also has the advantage that one class is responsible for setting up the top two levels
of the hierarchy correctly, rather than having responsibility split between two classes.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

Line 63:   QueryState* GetOrCreateQueryState(
> are the 'get' semantics really needed?
We need it to behave in this way, since outside of the coordinator we don't know which fragment
instance will start first. I could rename it CreateQueryState() - it looks like in the codebase
we use both naming conventions for this kind of function.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 69:   query_mem_tracker_->UnregisterFromParent();
> we want to get away from doing anything meaningful in the d'tors.
It's also called TearDown() in Coordinator, to add another option. I called it ReleaseResources()
to be consistent with RuntimeState.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 75:     /// Releases the previous query state (if it was non-NULL) and replaces it with
> is this really needed, rather than just creating a new one?
I ended up removing this, since I removed the use case in the coordinator.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS4, Line 380: this
> remove
Done


Line 389:   void InitMemTrackers();
> move above comment
This was leftover from an earlier iteration -it's not needed.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/runtime/test-env.h
File be/src/runtime/test-env.h:

Line 48:       const TQueryOptions* query_options, QueryState** query_state,
> superfluous; you can get the qs from the runtimestate.
Done


Line 53:   void TearDownStates();
> or simply TearDown()?
I wanted the name to reflect that it didn't tear down TestEnv - just the things owned by TestEnv
that were created since the last TearDownStates() call.

Renamed to TearDownQueries() - that may be clearer about what it's meant to simulate.


Line 83:   std::vector<std::unique_ptr<FragmentInstanceState>> fragment_instance_states_;
> these should be owned by the querystates
I changed this to just stick them in the QueryState object pool instead. We don't need this
vector for anything.


Line 86:   std::vector<std::unique_ptr<RuntimeState>> runtime_states_;
> these should be owned by the instancestates
I also added this to the QueryState object pool. Simulating the normal ownership of FragmentInstanceState->PlanFragmentExecutor->RuntimeState
would be pretty invasive.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/service/fe-support.cc
File be/src/service/fe-support.cc:

Line 100:   RuntimeState state(query_ctx, ExecEnv::GetInstance(), "fe-eval-exprs");
> does the pool name signify anything? if not, leave at ""?
It shows up in "Memory limit exceeded" output - I just gave it a name so it would be easier
to figure out where the memory was coming from.

Added a comment to briefly explain.


http://gerrit.cloudera.org:8080/#/c/5630/4/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

Line 1055:     DCHECK(coord_->query_state()->query_mem_tracker() != NULL);
> you could also add a Coordinator::query_mem_tracker() convenience function
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5630
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Id6b46652932b5638993623e98d1f0d60d8380ba0
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message