impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcel Kornacker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4014: Introduce query-wide execution state.
Date Mon, 07 Nov 2016 03:19:54 GMT
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4014: Introduce query-wide execution state.
......................................................................


Patch Set 2:

(27 comments)

Regarding tests: I ran a bunch of queries locally, and basic cases work, but I seem to get
some hbase-related crashes.

However, I don't think that the fixes will change the new functionality here in a meaningful
way, and given the size of this patch I'd like to get the review rolling now.

http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 27: 
> Didn't we want to move away from implicit destruction? Why is it ok to rely
we do, and it's not okay. i just didn't want to change too much in one go. updated commit
msg.


Line 31: class TPlanFragmentInstanceCtx;
> No ReportProfile() in here. Wouldn't the reporting be controlled by the Que
i'll remove this from the headers for now. i haven't spent much time looking at this code.


PS1, Line 32: 
            : class TUniqueId;
> Possibly a good time to retire that functionality which I don't think is ev
Done


PS1, Line 43: this fragment instance and closes all data streams.
            : ///
> Why can't we do that in this patch (given it's header only...)?
see above. i think it's too much to bite off in this set of changes.

updated commit msg.


Line 52:  public:
> What if Prepare() fails?
left todo, better addressed in the actual implementation.


PS1, Line 56: /// Frees up all resources allocated in Exec().
> When would we want to get a mutable query_state() from its FIS? That circum
we might pass the fis* into some function, and that function might need access the qs*. we
could always pass around qs* and fis*, but this is more convenient.


PS1, Line 69: 
            :   QueryState* query_state() { return query_state_; }
> I do find this API weird, always have. Why doesn't this object start its ow
see updated commit msg.


Line 94:   /// if set to anything other than OK, execution has terminated w/ an error
> Given my comment below, should this be Exec() or similar?
Done


Line 100:   /// Update 'exec_status_' w/ 'status', if the former is not already an error.
> What's this call for? Don't all fragment instances end in a sink? I know th
see updated commit msg.


Line 120
> When is this token acquired exactly?
see updated commit msg.


http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

PS1, Line 24: 
            : #include "common/status.h"
> I don't think this is relevant - we shouldn't comment on what might call th
Done


PS1, Line 33: TExecPlanFragmentParams;
            : class TUniqueId;
> Not clear from this who owns that reference, and who is therefore responsib
let me know if this is better.


PS1, Line 43: ass QueryExecMgr {
            :  public:
> Why is this here, and not accessible through GetQueryState(query_id)->Cance
Done


PS1, Line 55: StartFInstanc
> Again, don't see that this needs to be in this class.
Done


http://gerrit.cloudera.org:8080/#/c/4418/1/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

Line 24: 
> This first paragraph could use some cleanup, e.g., grammar of the first sen
moved a fragment of the sentence.

there is a comparison/contrasting with fis because the classes are related.


Line 30: namespace impala {
> This sounds much like a shared_ptr, i.e., implicit management of the lifeti
this is indeed like a shared_ptr in the sense that it's refcounted. the difference is that
tear-down doesn't happen implicitly in the d'tor.

an rpc that arrives after the refcount goes to 0 is a no-op. if that would be an error (ie,
if that were truly a race), we need to prevent that from happening in some other way, such
as overhauling our rpc protocol.

in your example, data arriving after the final query result has been computed to me sounds
like a no-op.


PS1, Line 31: 
> too specific: clients of this class must obtain a reference.
i don't know what you mean by 'clients of this class'. from a correctness perspective it is
important that each thread that accesses query state of any kind has a reference to the qs
for at least the duration of that access.


PS1, Line 41: / Any thread 
> comment on usage pattern, which is going to be common. Particularly that ca
changed to ScopedRef, expanded comment


Line 41: /// Any thread that executes on behalf of a query, and accesses any of its state,
> ScopedQueryState? This makes it clear that the access is scoped, and we alr
then we have declarations like 'QueryState::ScopedQueryState qs(...);', which is pretty awkward.


PS1, Line 44: cMgr::Get-/ReleaseQueryStat
> shouldn't there be corresponding changes to ExecEnv then? Or are you leavin
yes, leaving out changes to existing classes for now.


PS1, Line 49: 
> I don't think we need this, but presumably it's redundant here if the const
Done


PS1, Line 60: 
> nit picking here, but it might be helpful to distinguish the query lifetime
i'm not sure it makes sense to spend too much time polishing these headers for code that hasn't
been written yet.

the purpose of this object pool is really for query-duration objects. for instance-duration
objects we need an object pool in the fis, i'd think.


PS1, Line 65: 
> Would just say it releases all resources owned by this class ("accessible t
i actually meant exactly that (example: something sitting in RuntimeState). but "release resources"
is more general.


Line 66:     /// may return nullptr
> When is this legal to call with respect to the reference count? Will this c
the point of reference counting is to have all accesses be legal unless the refcount is 0
(in which case you can't access the qs anymore). in other words, it's always legal to call
this.


PS1, Line 73:   DISA
> Under what conditions would this return an error?
clarified.


Line 76:   /// a shared pool for all objects that have query lifetime
> Isn't a filter really published to a plan node contained in all relevant fr
good point, they are indeed shared across the entire query, as is the filter bank.


Line 78: 
> We'll deal with local filter aggregation separately correct?
yes, that requires new logic/new classes (an instance of which will live in the qs).


-- 
To view, visit http://gerrit.cloudera.org:8080/4418
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I962ae6b7cb7dc0d07fbb8f70317aeb01d88d400b
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Henry Robinson <henry@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message