impala-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bikramjeet Vig (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (IMPALA-5275) Avoid printing Status stack trace on hot paths
Date Mon, 24 Jul 2017 19:10:00 GMT

     [ https://issues.apache.org/jira/browse/IMPALA-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]

Bikramjeet Vig resolved IMPALA-5275.
------------------------------------
    Resolution: Fixed


IMPALA-5275: Avoid printing status stack trace on hot paths

Currently, creation of a Status object (non-OK and non-EXPECTED)
prints the stack trace to the log. Fetching the stack trace takes
a large chunk of CPU time close to 130ms and results in a significant
perf hit when encountered on hot paths.
Five such hot paths were identified and the following changes were
made to fix it:

1. In ImpalaServer::GetExecSummary(), create Status() without holding
the query_log_lock_.
2, 3 and 4. In impala::DeserializeThriftMsg<>(),
PartitionedAggregationNode::CodegenUpdateTuple() and
HdfsScanner::CodegenWriteCompleteTuple, use Status::Expected where
appropriate.
5. In Status::MemLimitExceeded(), create Status object without
printing stacktrace

Change-Id: Ief083f558fba587381aa7fe8f99da279da02f1f2
Reviewed-on: http://gerrit.cloudera.org:8080/7449
Reviewed-by: Matthew Jacobs <mj@cloudera.com>
Tested-by: Impala Public Jenkins

> Avoid printing Status stack trace on hot paths
> ----------------------------------------------
>
>                 Key: IMPALA-5275
>                 URL: https://issues.apache.org/jira/browse/IMPALA-5275
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Mostafa Mokhtar
>            Assignee: Bikramjeet Vig
>            Priority: Critical
>              Labels: ramp-up
>         Attachments: cancellation_printing_expensive_stack_impalad-stacks-full.out.zip
>
>
> h3. Summary from [~mjacobs]:
> There are 3 known issues to fix ASAP:
> 1) In ImpalaServer::GetExecSummary(), do not create Status() while holding the lock.
> 2) Change error status in impala::DeserializeThriftMsg<>() to use Status::Expected
> 3) PartitionedAggregationNode::CodegenUpdateTuple() handling for unsupported types (e.g.
char) should use Status::Expected
> Then separately (less time critical) we should investigate how to improve our stack walking
as these issues may come up again in the future. I filed IMPALA-5651.
> h3. More detailed initial report for #2 above, from Mostafa:
> When reading Parquet data attempts to find the Parquet header prints a stack trace to
the log even if the error was found successfully.
> Also printing the stack trace without logging which Parquet file was being read makes
the log message useless. 
> {code}
> Status::Status(const string& error_msg)
>   : msg_(new ErrorMsg(TErrorCode::GENERAL, error_msg)) {
>   VLOG(1) << msg_->msg() << "\n" << GetStackTrace();
> }
> {code}
> Vtune data showed that capturing the stack trace is very expensive and spends 130ms CPU
per invocation.
> {code}
> This is the expensive call stack btw 
> CPU Time
> 1 of 8: 77.9% (0.106s of 0.136s)
> vmlinux ! KSTK_ESP - [unknown source file]
> vmlinux ! vm_is_stack + 0xa1 - [unknown source file]
> vmlinux ! show_map_vma + 0x268 - [unknown source file]
> vmlinux ! show_map + 0x20 - [unknown source file]
> vmlinux ! show_pid_map + 0x12 - [unknown source file]
> vmlinux ! seq_read + 0x24f - [unknown source file]
> vmlinux ! vfs_read + 0x9d - [unknown source file]
> vmlinux ! SyS_read + 0x7e - [unknown source file]
> vmlinux ! tracesys + 0xdc - [unknown source file]
> libpthread-2.17.so ! __read + 0x2c - [unknown source file]
> impalad ! google::OpenObjectFileContainingPcAndGetStartAddress + 0x2cf - [unknown source
file]
> impalad ! google::SymbolizeAndDemangle + 0x488 - [unknown source file]
> impalad ! google::DumpStackTrace.constprop.9 + 0x6e - [unknown source file]
> impalad ! impala::GetStackTrace + 0x1e - debug-util.cc:321
> impalad ! impala::Status::Status + 0x68 - status.cc:114
> impalad ! impala::DeserializeThriftMsg<parquet::PageHeader> + 0x2ce - thrift-util.h:128
> impalad ! impala::BaseScalarColumnReader::ReadPageHeader + 0x182 - parquet-column-readers.cc:787
> impalad ! impala::BaseScalarColumnReader::ReadDataPage + 0x9b - parquet-column-readers.cc:966
> impalad ! impala::BaseScalarColumnReader::NextPage + 0x6e - parquet-column-readers.cc:1093
> impalad ! ReadValueBatch<false> + 0x196 - parquet-column-readers.cc:306
> impalad ! impala::ScalarColumnReader<long, (bool)1>::ReadNonRepeatedValueBatch
+ 0x60 - parquet-column-readers.cc:253
> impalad ! impala::HdfsParquetScanner::AssembleRows + 0x313 - hdfs-parquet-scanner.cc:950
> impalad ! impala::HdfsParquetScanner::GetNextInternal + 0x277 - hdfs-parquet-scanner.cc:482
> impalad ! impala::HdfsParquetScanner::ProcessSplit + 0x73 - hdfs-parquet-scanner.cc:409
> impalad ! impala::HdfsScanNode::ProcessSplit + 0x395 - hdfs-scan-node.cc:526
> impalad ! impala::HdfsScanNode::ScannerThread + 0xbb8 - hdfs-scan-node.cc:417
> impalad ! boost::function0<void>::operator() + 0x1a - function_template.hpp:767
> {code}
> What gets printed to the log
> {code}
> I0501 14:13:24.154422 21797 status.cc:119] couldn't deserialize thrift msg:
> No more data to read.
>    @           0x838a89  impala::Status::Status()
>    @           0xc955af  impala::DeserializeThriftMsg<>()
>    @           0xc8c973  impala::BaseScalarColumnReader::ReadPageHeader()
>    @           0xc8e0fc  impala::BaseScalarColumnReader::ReadDataPage()
>    @           0xc8ee6f  impala::BaseScalarColumnReader::NextPage()
>    @           0xc9d5a8  impala::ScalarColumnReader<>::ReadNonRepeatedValueBatch()
>    @           0xc72764  impala::HdfsParquetScanner::AssembleRows()
>    @           0xc77238  impala::HdfsParquetScanner::GetNextInternal()
>    @           0xc71e74  impala::HdfsParquetScanner::ProcessSplit()
>    @           0xc4eab6  impala::HdfsScanNode::ProcessSplit()
>    @           0xc50719  impala::HdfsScanNode::ScannerThread()
>    @           0xbdd739  impala::Thread::SuperviseThread()
>    @           0xbde184  boost::detail::thread_data<>::run()
>    @           0xe4f3ba  (unknown)
>    @       0x3797407aa1  (unknown)
>    @       0x37970e893d  (unknown)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message