impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Volker (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-2979: Fix scheduling on remote hosts
Date Thu, 21 Jul 2016 14:28:05 GMT
Lars Volker has posted comments on this change.

Change subject: IMPALA-2979: Fix scheduling on remote hosts

Patch Set 27:


Thanks for the review, please see PS28.
File be/src/scheduling/

Line 439:               backend_candidates.push_back(backend_ip);
> could you add a todo to that effect to the .h?
We had one there already, I changed it to be more specific.

Line 439:               backend_candidates.push_back(backend_ip);
> yes, let's reevaluate the scheduling strategy  based on more extensive test

Line 440:             } else if (memory_distance == min_distance) {
> ?
You're right, done.

Line 509:   // (the root fragment doesn't have a destination)
> is that not true?
It is, I thought it doesn't help much to comment on this here. I added it back.

PS24, Line 1089:   total_assignments_->
> why does moving the seed generator into the main scheduler class cause a pr
It won't cause a problem to move the seed generator there, as long as the seeds are generated
deterministically (which is what rand() does). Eventually we might want to factor out and
inject all global dependencies of the scheduler so it becomes more testable. Should we move
the call to rand() outside and pass a seed (int) here? Or pass in a functor to generate the
random numbers for shuffle? Or keep it as is for now? I'm good with either, none of them will
make a difference wrt testability.

Line 1090:     if (!remote_read) total_local_assignments_->Increment(1);
> let's do it as a follow-on (but let's do that right after this gets in).
Added a class level todo.
File be/src/scheduling/

Line 916:   BackendList* be_descs = &backend_map_[be_desc.ip_address];
> i'm not a fan of using [] for reading, because that has the potential also 

Line 1100:   // Explicitly set the optional fields.
> why?
The comment didn't make sense, I removed it.
File be/src/scheduling/simple-scheduler.h:

Line 174:   /// AssignmentHeap it can be used to look up heap elements by their IP address
> i guess it's good to leave here.

Line 327: 
> is this the local backend? local_backend_descriptor_?
File be/src/scheduling/simple-scheduler.h:

Line 140:   /// Internal structure to track scan range assignments for a backend. This struct
> ah, so this is really for a backend host, not just backend.

Line 141:   /// used as the heap element in AddressableAssignmentHeap.
> "heap element in and maintained by "? (or is this updated by anyone else?)

Line 148:     int random_rank;
> you can also make this const int and then provide a c'tor that sets the ran
Good idea, aggregate initialization makes this possible without a c'tor even.

Line 247:     /// member in AssignmentCtx.
> streamline comment

Line 428:   /// Finally scan ranges are considered which do not have an impalad backend running
> sorry, i think it's actually "finally, ..."

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I044f83806fcde820fcb38047cf6b8e780d803858
Gerrit-PatchSet: 27
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Lars Volker <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Marcel Kornacker <>
Gerrit-Reviewer: Matthew Jacobs <>
Gerrit-Reviewer: Mostafa Mokhtar <>
Gerrit-Reviewer: Sailesh Mukil <>
Gerrit-Reviewer: anujphadke <>
Gerrit-HasComments: Yes

View raw message