impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3385: Fix crashes on accessing error_log
Date Fri, 22 Apr 2016 01:59:03 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3385: Fix crashes on accessing error_log
......................................................................


Patch Set 5:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/2829/5/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

Line 223: }
Returning a copy is kind of surprising. How about making the interface mirror GetUnreportedErrors().
I.e. replace this with a:

void GetErrors(ErrorLogMap* errors);

That way, it's clear to readers of the coordinator code that a copy is happening.  Also, we
should just get rid of the ErrorLogIsEmpty() method -- merging an empty map is fine.


Line 242:   // errors multiple times.
This comment doesn't make sense, as far as I see.  What goes wrong if we just do error_log_.clear()?
 I think we can just clear the error_log_ map. It will affect ErrorCount() but that seems
okay (and current semantics of ErrorCount() are weird anyway -- general errors decrease but
non-general do not given the current code).


http://gerrit.cloudera.org:8080/#/c/2829/5/be/src/util/error-util.cc
File be/src/util/error-util.cc:

Line 135:     if (v.second.count == 0) continue;
please also test that you don't break ErrorLog functionality.  For example, looking at AppendError(),
we don't maintain the count for GENERAL errorcode, so it looks like this change would break
the GENERAL case (we'll always skip it).


Line 140:     } else {
DCHECK_GT(v.second.messages, 0);
DCHECK(!v.second.messages.empty());


Line 142:       if (v.second.count < 2) {
if (v.second.count == 1) {


Line 159:     if (v.second.count == 0) continue;
remove - this doesn't work for GENERAL. see above.


Line 161:     if (v.first == TErrorCode::GENERAL) {
DCHECK_EQ(v.second.count, 0);


Line 167:         (*left)[v.first].count += v.second.count;
this is still broken if 'left' was cleared.

I think what we should do is maintain the invariant:
for GENERAL: count == 0, messages.size() > 0
otherwise: count > 0, messages.size() == 1

and then add DCHECKs to these routines to check those invariants (as noted by other comments),
and also fix GetUnreportedErrors() to just clear the map.


Line 168:       } else {
DCHECK_GT(v.second.count, 0);
DCHECK(!v.second.messages.empty());


Line 181:     if (it != map->end()) {
DCHECK(!it->second.messages.empty());


Line 193:       errors.find(TErrorCode::GENERAL)->second.messages.size() - 1 : 0;
this code looks wrong -- messages.size() can be 0, so general_errors will underflow. Anyway,
this will be fixed if we just clear the map in GetUnreporterErrors() so that the invariants
above are held.


-- 
To view, visit http://gerrit.cloudera.org:8080/2829
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I3a7e3d22e26147ada780aae5aed1f2e25a515afc
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Huaisi Xu <hxu@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Huaisi Xu <hxu@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message