impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Marcel Kornacker (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-2979: Fix scheduling on remote hosts
Date Sun, 17 Jul 2016 01:11:06 GMT
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2979: Fix scheduling on remote hosts

Patch Set 25:


the header file looks good structurally
File be/src/scheduling/simple-scheduler.h:

Line 380:       const TResourceBrokerReservationResponse& reservation, Coordinator* coord);
> Done. Being a class member method it will also affect member variables of t
the comments needs to describe the externally visible behavior, which includes side effects.
as part of that, you should point out which class variables get updated (unless the answer
is 'most of them').
File be/src/scheduling/simple-scheduler.h:

Line 91:   /// List of backends.
no need to repeat/paraphrase type name

Line 112:     /// Construct maps from list of backends.

Line 137:   /// Internal structure to track scan range assignments for each backend and prioritize
"for a backend". remove comment about prioritization, this structure doesn't do the prioritization.

Line 141:     /// The number of bytes assigned to a backend host. Part of the key of
backend or backend host?

Line 142:     /// AddressableAssignmentHeap. This gets initialized when elements are inserted
don't comment on other data structures here

Line 164:   /// will make the top() element of the heap be the backend with the least number
"lowest number"

Line 174:   /// update their key. For each plan node we create a new heap, so they are not
the last sentence should go into the structure that contains an object of this class

Line 199:   /// member in AssignmentCtx.
move into AssignmentCtx? is it used outside of that?

Line 222:     /// data_locations:     List of candidate replica IP addresses, from which the
redundant w/ earlier part of comment

Line 224:     /// break_ties_by_rank: Whether to consider the random rank during backend selection.
same here

Line 229:     /// those will be preferred. Otherwise one is selected from the assignment heap.
describe the externally observable behavior (what are the characteristics of the selected
backend), not the implementation ("from assignment heap")

Line 233:     /// pointer. This assumes that a returned backend will also be assigned to.
The caller
"increment the internal pointer" isn't meaningful

Line 242:         const TScanRangeLocations& scan_range_locations,
role of this param?

Line 252:     /// Used to lookup hostnames to IP addresses and IP addresses to backend.
use verb, not noun

Line 291:   /// The scheduler's backend maps. When receiving changes to the backend configuration
replace use of term 'backend maps' throughout

Line 292:   /// from the statestore we will make a copy of those maps, apply the updates to
the copy
this is fairly central to how this class functions, and should be pointed out in the class

Line 424:   /// assigned work is picked, thus aiming to distribute the work as even as possible.
as evenly

Line 426:   /// Finally scan ranges are considered, which do not have an impalad backend running
remove comma

Line 446:   ///   amount of assigned work), then the first one will be picked deterministically.
only true for disk replicas, no?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I044f83806fcde820fcb38047cf6b8e780d803858
Gerrit-PatchSet: 25
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-Reviewer: anujphadke <>
Gerrit-HasComments: Yes

View raw message