impala-reviews mailing list archives

From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5750: Catch exceptions from boost thread creation
Date Thu, 31 Aug 2017 23:20:58 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5750: Catch exceptions from boost thread creation
......................................................................


Patch Set 7:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/exec/hdfs-scan-node.cc
File be/src/exec/hdfs-scan-node.cc:

PS7, Line 371: COUNTER_ADD(num_scanner_threads_started_counter_, 1);
The problem with moving this here is that when the node is under heavy stress, there may be a
window where the scanner thread has already started running but this thread does not get
scheduled for a while, so the counter briefly lags behind the real number of started threads.

Not that it's a huge problem, but I just wanted to point it out.
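
One possible way to close that window, as a rough sketch and not something taken from the patch:
count the thread before creating it and undo the count if creation throws. The names below
(num_threads_started, StartCountedThread) are hypothetical stand-ins, and std::thread is used in
place of Impala's Thread class.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <system_error>
#include <thread>

// Hypothetical stand-in for num_scanner_threads_started_counter_.
std::atomic<int> num_threads_started{0};

// Counts the thread before creating it, so the new thread never runs uncounted.
// If creation fails, the count is rolled back and false is returned.
bool StartCountedThread(const std::function<void()>& body, std::thread* out) {
  num_threads_started.fetch_add(1);
  try {
    *out = std::thread(body);  // Throws std::system_error if the thread cannot start.
  } catch (const std::system_error&) {
    num_threads_started.fetch_sub(1);
    return false;
  }
  return true;
}
```

The trade-off is the mirror image of the one above: a reader could briefly see a count for a
thread that then fails to start.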


http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/exec/kudu-scan-node.cc
File be/src/exec/kudu-scan-node.cc:

PS7, Line 171: ++num_active_scanners_;
This can cause quite a few races. It can race with L220, L244 and L245.
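
As a minimal sketch of one way to avoid this, assuming those lines read or modify
num_active_scanners_ from the scanner threads: do every access under the same lock. ScannerTracker
and RunScanners are hypothetical names for illustration, not the actual KuduScanNode code.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the scan node's bookkeeping.
class ScannerTracker {
 public:
  // Called by the thread that is about to spawn a scanner.
  void OnScannerStarting() {
    std::lock_guard<std::mutex> l(lock_);
    ++num_active_scanners_;
  }

  // Called by a scanner thread as it exits.
  // Returns true if no scanners were active after this one finished.
  bool OnScannerDone() {
    std::lock_guard<std::mutex> l(lock_);
    return --num_active_scanners_ == 0;
  }

  int num_active() {
    std::lock_guard<std::mutex> l(lock_);
    return num_active_scanners_;
  }

 private:
  std::mutex lock_;
  int num_active_scanners_ = 0;
};

// Spawns n scanner threads; each one is counted before it starts and
// removes itself from the count when it finishes.
void RunScanners(ScannerTracker* tracker, int n) {
  std::vector<std::thread> threads;
  for (int i = 0; i < n; ++i) {
    tracker->OnScannerStarting();
    threads.emplace_back([tracker] { tracker->OnScannerDone(); });
  }
  for (std::thread& t : threads) t.join();
}
```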


http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS7, Line 335:     // Fragment instance successfully started
             :     // update fis_map_
             :     fis_map_.emplace(fis->instance_id(), fis);
             :     // update fragment_map_
             :     vector<FragmentInstanceState*>& fis_list = fragment_map_[instance_ctx.fragment_idx];
             :     fis_list.push_back(fis);
Is it safe to update the map with the Fragment instance state after already starting the fragment
instance?

I tried going through some scenarios, and they all checked out fine since critical RPCs like
Cancel() are protected by 'instances_prepared_promise_', but I'm not sure if I'm missing some
other failure case.


http://gerrit.cloudera.org:8080/#/c/7730/7/be/src/util/thread.cc
File be/src/util/thread.cc:

PS7, Line 303: rand()
Not to be pedantic, but calling rand() without seeding the PRNG first will generate the same
series of numbers on a particular node across different runs. That makes the failure
injections fairly deterministic if we're running the same queries over these test runs.

Something closer to actual pseudo-randomness would require something like the following:
https://github.com/apache/incubator-impala/blob/master/be/src/rpc/authentication.cc#L493

But getting a new random device every time could be expensive.
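
One cheaper option, sketched here under the assumption that standard <random> is acceptable in
this file: seed a thread-local engine once from std::random_device and reuse it on every call.
ShouldInjectFailure is a hypothetical name, not a function from the patch.

```cpp
#include <cassert>
#include <random>

// Returns true with roughly the given probability, which must be in [0, 1].
bool ShouldInjectFailure(double probability) {
  // The engine is constructed and seeded once per thread, on first use.
  // Later calls reuse it, so std::random_device is not touched again.
  thread_local std::mt19937 gen{std::random_device{}()};
  std::bernoulli_distribution dist(probability);
  return dist(gen);
}
```

Each thread gets its own engine, so no locking is needed, and the sequence differs from run to
run wherever std::random_device is backed by a non-deterministic source.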

Do you know of a better but cheap way to do this?


-- 
To view, visit http://gerrit.cloudera.org:8080/7730
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I15a2f278dc71892b7fec09593f81b1a57ab725c0
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
