impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Sailesh Mukil (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-4856: Port ImpalaInternalService to KRPC
Date Tue, 18 Apr 2017 06:26:37 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4856: Port ImpalaInternalService to KRPC

Patch Set 3:


Had a first look.
File be/src/runtime/

PS3, Line 327: VLOG_QUERY << "DataStreamMgr maintenance tasks complete. Took: "
             :                << PrettyPrinter::Print(MonotonicMillis() - start, TUnit::TIME_MS);
Do you think it's worth printing what maintenance tasks were done in this iteration? And maybe
don't print anything if no work was done. Might help with debugging. Feel free to ignore if
you disagree.

Also, wondering if VLOG_QUERY is the right log level to print this. Wouldn't this cause a
lot of spam?
File be/src/runtime/data-stream-mgr.h:

PS3, Line 135: idea
I would rather use the word 'assumption' here.

I think there needs to be some comment clearly distinguishing between this timeout and FLAGS_datastream_sender_timeout_ms.
File be/src/runtime/

PS3, Line 129: pending_senders_
Maybe this could be future work, but I foresee a need to cap this at some number (which could
be a function of the number of nodes) after which it would be beneficial to just fail the

PS3, Line 137: SpinLock
Isn't this a rather large amount of work to have under a spinlock?
Also, we would expect this lock to be more than fairly contended.

PS3, Line 191: // num_remaining_senders_ could be 0 because an AddBatch() can arrive *after*
             :     // EndDataStream() RPC for the same sender, due to asynchrony on the sender
side (the
             :     // sender gets closed or cancelled, but doesn't wait for the oustanding
             :     // to complete before trying to close the channel).
If this is the only case where num_remaining_senders_ can be 0, then is there any point in
responding at all?
File be/src/runtime/data-stream-recvr.h:

PS3, Line 174: good_bytes_received_counter_
bytes_accepted_counter_ ?
File be/src/service/

Line 80:     ExecEnv::GetInstance()->stream_mgr()->AddData(finst_id, move(payload));
This is a tad bit confusing. Why is there not a necessity to have a RespondSuccess() for this
I do see that there is a payload.context->RespondSuccess() inside AddData(), but that's
only on the error case.

Also, ideally, for the sake of consistency, I would prefer that the error status be returned
here and then see that the RPC is responded to from here. Unless you see a good reason for
it not to be that way.

PS3, Line 99: // TODO: Check return
What he said.

Line 118:     context->GetInboundSidecar(filter.header.directory_sidecar_idx(), &;
Same here, check return status.

Line 155:   context->RespondSuccess();
No need to return status.SetTStatus(&return_val) ?
I see that we check it in ReportStatusCb().

This changes behavior a bit. It looks like currently, if a ReportExecStatus() RPC fails, we
fail the query. That won't happen if the sidecar deserialization fails.
File be/src/service/

PS3, Line 1925: move
Include what you use? <utility>
File be/src/service/impala-server.h:

PS3, Line 123: Shutdown
Who calls this?
File be/src/service/

PS3, Line 79: exec_env->Init();
Need to check returned status and fail if necessary.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I95229290566a8ccffd80ed2d74c1c57cf1479238
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Henry Robinson <>
Gerrit-Reviewer: Anonymous Coward #168
Gerrit-Reviewer: Henry Robinson <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-HasComments: Yes

View raw message