impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Lars Volker (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5890: Abort queries if scanner hits IO errors
Date Mon, 11 Sep 2017 21:46:07 GMT
Lars Volker has posted comments on this change.

Change subject: IMPALA-5890: Abort queries if scanner hits IO errors
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/base-sequence-scanner.cc
File be/src/exec/base-sequence-scanner.cc:

PS5, Line 177:     if (status.IsCancelled() || status.IsMemLimitExceeded()) return status;
             : 
             :     // Log error from file format parsing.
             :     state_->LogError(ErrorMsg(TErrorCode::SEQUENCE_SCANNER_PARSE_ERROR,
             :         stream_->filename(), stream_->file_offset(),
             :         (stream_->eof() ? "(EOF)" : "")));
             : 
             :     // Make sure errors specified in the status are logged as well
             :     state_->LogError(status.msg());
             : 
             :     // If abort on error then return, otherwise try to recover.
             :     if (state_->abort_on_error()) return status;
             : 
             :     // Abort scan range for I/O related errors
             :     if (status.code() == TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR) {
             :       eos_ = true;
             :       return Status::OK();
             :     }
> it does seem like this could be written in terms of RuntimeState::LogOrRetu
It looks to me as if the query will be aborted if return a non-OK status here. ProcessSplit()
will return that status and it will be handled in hdfs-scan-node.cc:441. This will call SetDoneInternal()
and the scanners will stop. Setting eos_ here will make stop ProcessSplit() but make it return
Status::OK().

    if (!status.ok()) {
      unique_lock<mutex> l(lock_);
      // If there was already an error, the main thread will do the cleanup
      if (!status_.ok()) break;

      if (status.IsCancelled() && done_) {
        // Scan node initiated scanner thread cancellation.  No need to do anything.
        break;
      }
      // Set status_ before calling SetDone() (which shuts down the RowBatchQueue),
      // to ensure that GetNextInternal() notices the error status.
      status_ = status;
      SetDoneInternal();
      break;
    }

That being said, I changed the code to abort the query.


http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.cc
File be/src/exec/scanner-context.cc:

PS5, Line 194: Internal "
> would be best to keep "Internal error" on the same line so that it's more e
Done


Line 284:       io_buffer_bytes_left_ = 0;
> Maybe add a short comment explaining why this is necessary?
Done, thx for the suggestion.


Line 371:   Status wrapped_status(TErrorCode::SCANNER_CONTEXT_WRAPPED_IO_ERROR, status.msg().msg());
> Maybe reference IMPALA-4697 for context on why we're doing this instead of 
Removed the code altogether.


http://gerrit.cloudera.org:8080/#/c/8011/5/be/src/exec/scanner-context.h
File be/src/exec/scanner-context.h:

Line 277:     Status WrapIoError(const Status& status);
> Is there a simple thing we can start with for this patch (rather than intro
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/8011
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I44dc95184c241fbcdbdbebad54339530680d3509
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message