impala-reviews mailing list archives

From "Philip Zeyliger (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5940. Avoid log spew by using Status::Expected.
Date Tue, 19 Sep 2017 18:16:50 GMT
Philip Zeyliger has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/8100

Change subject: IMPALA-5940. Avoid log spew by using Status::Expected.
......................................................................

IMPALA-5940. Avoid log spew by using Status::Expected.

In IMPALA-5926, we fixed a case where closing the session triggered a
stack trace in the logs, which hurt performance for short-running
queries. Looking at log files from several active clusters, I identified a
few other cases where we could clean up log spew with the same (trivial)
approach: returning the error via Status::Expected.

In practice, the messages affected here report that codegen is not
supported for the given file format or data type. Because codegen
happens at every execution node, these messages are very common in the
log files.

The snippet I used to identify these was:

  find . -type f -name '*IMPALAD*.gz' | xargs gzcat \
    | awk '/^I/ { if (x) { print x; } x = "" }
           /status.cc/ { x = " "; }
           { if (x) { x = x $0 } }
           END { if (x) { print x } }' \
    | sed -e 's/0x[0-9a-fx]* //g' \
    | sed -e 's/[0-9a-f]\{16\}:[0-9a-f]*/QUERYID/g' \
    | tr -s '\t' ' ' | tr '0-9' 'N' \
    | sort | uniq -c | sort -n | tee output.txt
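
For anyone who wants to prototype the same idea outside the shell, here is a rough Python sketch of what the pipeline above appears to do: join each status.cc line with its continuation lines, mask addresses, query ids and digits, then count identical patterns. The function names and the sample log lines are made up for illustration and are not part of this change.

```python
import re
from collections import Counter


def group_status_records(lines):
    """Join each status.cc line with the continuation lines that follow it."""
    records, current = [], None
    for line in lines:
        if line.startswith("I"):      # a new INFO line ends any open record
            if current is not None:
                records.append(current)
            current = None
        if "status.cc" in line:       # awk uses a regex here; a literal match is assumed to suffice
            current = " "
        if current is not None:
            current += line
    if current is not None:           # flush the final record
        records.append(current)
    return records


def normalize(record):
    """Mask the parts of a record that vary between occurrences."""
    record = re.sub(r"0x[0-9a-fx]* ", "", record)                  # hex addresses
    record = re.sub(r"[0-9a-f]{16}:[0-9a-f]*", "QUERYID", record)  # query ids
    record = re.sub(r"[ \t]+", " ", record)                        # like tr -s '\t' ' '
    return re.sub(r"[0-9]", "N", record)                           # remaining digits


def count_patterns(lines):
    """Count how often each normalized status.cc record occurs."""
    return Counter(normalize(r) for r in group_status_records(lines))


# Hypothetical log lines, invented for this example.
SAMPLE_LINES = [
    "I0919 18:16:50.000001 1234 status.cc:55] Avro codegen not supported",
    "    @           0x83c7a9 impala::Status::Status()",
    "I0919 18:16:50.000002 1234 coordinator.cc:10] something else",
    "I0919 18:16:51.000003 5678 status.cc:55] Avro codegen not supported",
    "    @           0x83d111 impala::Status::Status()",
]

if __name__ == "__main__":
    for pattern, count in count_patterns(SAMPLE_LINES).most_common():
        print(count, pattern)
```

Sorting by count, as `most_common()` does, puts the noisiest message patterns first, which is the view that matters when deciding which statuses to treat as expected.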

I also analyzed some logs using SQL, against a pre-processed logs table:

  with v as (
    select regexp_replace(
        regexp_replace(
          translate(substr(message, 42), "\n\t", "  "),
          "[a-zA-Z0-9.-]*[.][a-zA-Z0-9-]*:[0-9]*",
          "<host>"),
        "@.*$", "@@@...") as m
    from logs_table where `class`="status.cc")
  select m, count(*) from v group by 1 order by 2 desc limit 100
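
The normalization in the query above can be tried on individual messages with a small Python sketch. It assumes that `substr(message, 42)` is dropping a fixed-width prefix of 41 characters; the function name and sample messages are hypothetical.

```python
import re
from collections import Counter

# Same pattern as the inner regexp_replace: something that looks like host:port.
HOST_PORT = re.compile(r"[a-zA-Z0-9.-]*[.][a-zA-Z0-9-]*:[0-9]*")


def normalize_message(message):
    """Mirror the nested SQL expression, innermost call first."""
    text = message[41:]                                  # substr(message, 42) is 1-based
    text = text.translate(str.maketrans("\n\t", "  "))   # translate(..., "\n\t", "  ")
    text = HOST_PORT.sub("<host>", text)                 # mask host:port pairs
    return re.sub(r"@.*$", "@@@...", text)               # truncate at the stack trace


def top_messages(messages, limit=100):
    """Equivalent of: group by 1 order by 2 desc limit 100."""
    return Counter(normalize_message(m) for m in messages).most_common(limit)


if __name__ == "__main__":
    prefix = "x" * 41  # stand-in for the assumed fixed-width prefix
    messages = [
        prefix + "Failed to reach host-1.example.com:22000\n\t@ 0x83c7a9 frame",
        prefix + "Failed to reach host-2.example.com:22000\n\t@ 0x83d111 frame",
    ]
    for text, count in top_messages(messages):
        print(count, text)
```

Because the host and the stack trace are masked before grouping, messages that differ only in those details collapse into one row.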

Testing:
* Automated tests.

Change-Id: I38088482377a1c3e794a9c8178ef83f29957a330
---
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/runtime/tuple.cc
M be/src/util/tuple-row-compare.cc
5 files changed, 7 insertions(+), 7 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/8100/1
-- 
To view, visit http://gerrit.cloudera.org:8080/8100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I38088482377a1c3e794a9c8178ef83f29957a330
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Philip Zeyliger <philip@cloudera.com>
