impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] PREVIEW: IMPALA-2550 Introduce query-wide execution context.
Date Fri, 12 Aug 2016 20:26:19 GMT
Sailesh Mukil has posted comments on this change.

Change subject: PREVIEW: IMPALA-2550 Introduce query-wide execution context.
......................................................................


Patch Set 5:

(30 comments)

http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 418: // TODO: Should we create a QueryExecState for the coordinator fragment?
> yes, otherwise you wind up with special-case logic to handle the paths wher
I thought the coordinator does not have a QEM (or previously a FragmentMgr). Because we never
create a FragmentExecState for the coordinator fragment and just start off with a PlanFragmentExecutor
directly. And the coordinator code ends up doing what a FragmentExecState would have done
for a fragment otherwise (like Prepare(), Exec(), Close(), etc.)


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/query-exec-state.cc
File be/src/runtime/query-exec-state.cc:

PS4, Line 71:   lock_guard<SpinLock> l(fragment_exec_state_map_lock_);
> This is confusing: presumably this exists to ensure that *this has a lifeti
That makes sense. I previously wanted only a single interface to increase/decrease the ref
counter, but I see the problem with it now.
I've made FindOrInsertQES() increase the ref count.


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/query-exec-state.h
File be/src/runtime/query-exec-state.h:

Line 1: // Licensed to the Apache Software Foundation (ASF) under one
> Fix license headers here and elsewhere.
Done


Line 36: /// Backend execution state of a query, which is shared between all fragment instances
> No need to talk about what owners of this class might do with it.
Done


PS4, Line 41: or f
> nit: use nullptr now.
Done


PS4, Line 84: 
> Why shared?
That was a mistake. Changed it to a raw pointer.


PS4, Line 101: rence
> std
As we spoke, I'll make sure that the decided on patch compiles with 'boost' and change it
to 'std' in the end (as changing to 'std' causes some compilation issues I don't really understand
now).


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 274:   /// A pointer to the query wide execution state which holds all the shared state
between
> comment what this is.
Done


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/service/client-request-exec-state.cc
File be/src/service/client-request-exec-state.cc:

Line 14: 
> add new line before this one
Done


Line 570:   wait_thread_.reset(new Thread("query-exec-state", "wait-thread",
> long line
Done


Line 677: 
> long line
Done


Line 829: }
> long line
Done. Just FYI, there's quite some more to fix in this file, but haven't done it yet as it's
not relevant to the core patch. Will do it once the header is decided on.


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/service/query-exec-mgr.cc
File be/src/service/query-exec-mgr.cc:

PS4, Line 37: 
> avoid bind, prefer lambdas
I've made a note of this, I will change it once I can get past all the compilation issues
after the header has been decided on.


PS4, Line 48:             << " fragment instance#="
            :              << exec_params.fragment_instance_ctx.instance_state_idx;
            : 
            :   // Remote fragments must always have a sink. Remove when IMPALA-2905 is resolved.
            :   DCHECK(exec_params.fragment_ctx.fragment.__isset.output_sink);
            : 
            :   // Create or retrieve the QueryExecState for this query id.
            :   QueryExecState* query_exec_state = FindOrInsertQueryExecState(exec_params.query_ctx);
            :   DCHECK(query_exec_state != NULL);
            : 
            :   // This is a no-op if another fragment of the same query already called Init().
            :   query_exec_state->Init();
            : 
> this logic looks like it should live inside RegisterFragmentInstance
Done


PS4, Line 65: if (!register_status.ok()) {
            :     ReleaseQueryExecState(query_exec_state);
            :     return register_status;
            :   }
            :   return Status::OK();
            : }
> Why not change the interface to the following:
I thought of that but GetQES() is used wherever we want to get the QES for a query.

So, this could result in a case where the QES is destroyed in a backend (after that backend
completes its execution for that query), and then a late CancelPlanFragment() arrives and
calls GetQES() which will end up setting up the QES for that already completed query (w.r.t
that backend) again.

I've made a change so that FindOrInsertQES() also increases the ref count, and calls ReleaseQES()
on a failure.


PS4, Line 103: lock_guard<SpinLock> l(query_exec_sta
> ++
Done


PS4, Line 107:     unique_ptr<QueryExecState> query_exec_state(new QueryExecState(this,
query_ctx));
             :     i = query_exec_state_map_.insert(
             :         make_pair(query_id, std::move(query_exec_state)));
             :   }
             :   ++(i->second->num_current_references_);
             :   return i->second.get();
             : }
             : 
> A QES will never be destroyed unless at least one reference is ever taken t
I've made a change such that QEM->FindOrInsertQES() and QEM->GetQES() both increase
the ref count, and I also make sure to call ReleaseQES() on a failure in QEM->RegisterFragmentInstanceWithQuery().

So there is no case where a QES can be alive and dangling without impending destruction.


PS4, Line 133:    
> remove std::
Done


PS4, Line 137: scoped_qes.PublishFilter(dst_instance
> nit: you're doing the lookup twice. insert returns a pair <iterator, bool> 
Done


PS4, Line 142: 
> The right thing to do is to augment the rpc params with the query ID, so yo
Done


http://gerrit.cloudera.org:8080/#/c/3817/4/be/src/service/query-exec-mgr.h
File be/src/service/query-exec-mgr.h:

Line 1: // Licensed to the Apache Software Foundation (ASF) under one
> Replace header with ASF one.
Done


PS4, Line 29: space imp
> fragment
Done


PS4, Line 40: /// offer itself up for destruction when it is no longer being used and the
            : /// destroy_thread_ will destroy it.
            : ///
> Can you do that in this patch? (Already noticed one instance below). Gettin
Done. I changed CancelPlanFragment() and PublishFilter().


PS4, Line 61: 
            :   /// Return a pointer to the QueryExecState if
> This shouldn't have RPC-style signature (since this isn't a proxy class): r
Done


PS4, Line 80: eId& fragment_instance_id);
> When do we know the fragment instance id, but not the query id?
As we spoke everything coming in to a backend specifies only the fragment_id, i.e. TCancelPlanFragmentParams,
TPublishFilterParams, etc.

I've changed their thrift structures to include the query_id and gotten rid of this function
and the <query_id, fragment_id> map.

So now every lookup is from query_id->QES as you suggested.


Line 91:   /// The caller should always check if obj->is_valid_ref() is 'true' before using
it.
> Comment?
Done


PS4, Line 94:  only through the APIs provi
> This should take a query ID. You should try to get rid of the fragment->que
Done. (As mentioned in the comment above)


PS4, Line 98:       function signatures for every function the QES wants to expose.
> What prevents this from returning NULL? You need some mechanism to ensure t
I've added a comment above explaining the guarantees of this class.

Currently, if NULL is returned, there's nothing that can be done. It's the callers responsibility
to check (ScopedQueryExecStateRef.query_exec_state() != NULL) before using it.


PS4, Line 117: y_exe
> std
As we spoke, I'll make sure that the decided on patch compiles with 'boost' and change it
to 'std' in the end (as changing to 'std' causes some compilation issues I don't really understand
now).


PS4, Line 130: e_map_
> not just a single thread then.
This ThreadPool is initialized with one thread, so it's still just one thread.
I followed how it's done in data-stream-sender:
https://github.com/apache/incubator-impala/blob/master/be/src/runtime/data-stream-sender.cc#L79-L80

This way it's easy to use the Offer() API.

Please let me know if you think there's a better way to do it.


-- 
To view, visit http://gerrit.cloudera.org:8080/3817
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I892091d6401acb2ea91ccb1623af54c6f9635e6c
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message