impala-reviews mailing list archives

From "Impala Public Jenkins (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4539: fix bug when scratch batch references I/O buffers
Date Thu, 08 Dec 2016 04:38:31 GMT
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4539: fix bug when scratch batch references I/O buffers
......................................................................


IMPALA-4539: fix bug when scratch batch references I/O buffers

The bug occurs when scanning uncompressed plain-encoded Parquet
string data.

Testing:
I could manually reproduce it reliably with ASAN and
--disable_mem_pools=true using the following steps:

  # The repro is more predictable when there is no row batch queue.
  set mt_dop=1;
  # Write uncompressed files; the bug occurs on uncompressed data.
  set compression_codec=none;
  # A column of unique strings should switch to plain encoding.
  create table big_uuids stored as parquet as select uuid() from tpch_20_parquet.lineitem;
  # The repro requires that some rows are filtered out, so that we end
  # up in a state where the output batch is full before all rows are
  # copied from the scratch batch
  select ndv(_c0) from big_uuids where substring(_c0, 1, 2) != 'a1' limit 10;

After the fix it no longer reproduces.

I do not yet have a practical test case that triggers the bug on a
normal ASAN setup. I will continue to try to create one.
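
The repro conditions above can be illustrated with a toy model. This is a
sketch in Python, not Impala code: IoBuffer, scan and
release_when_output_full are hypothetical names, and the assumption that
the I/O buffer is given up as soon as the output batch fills is one
plausible reading of the steps above, not something taken from the patch.
The model copies values into the output batch for simplicity.

```python
class IoBuffer:
    """Toy stand-in for an I/O buffer that scratch-batch rows point into."""

    def __init__(self, values):
        self.values = list(values)
        self.released = False

    def read(self, idx):
        # A real use-after-free is silent; the toy raises so it is visible.
        if self.released:
            raise RuntimeError("read from released I/O buffer")
        return self.values[idx]


def scan(buffers, predicate, batch_size, release_when_output_full):
    """Copy rows passing `predicate` from scratch batches into output batches.

    Each scratch batch holds indexes into its buffer (references, not
    copies). The output batch persists across scratch batches, so with
    filtering it can fill up in the middle of a scratch batch.
    """
    out, batches = [], []
    for buf in buffers:
        scratch = list(range(len(buf.values)))
        pos = 0
        while pos < len(scratch):
            value = buf.read(scratch[pos])
            pos += 1
            if predicate(value):
                out.append(value)
            if len(out) == batch_size:
                batches.append(out)
                out = []
                if release_when_output_full:
                    # Assumed hazard: rows may remain in the scratch batch.
                    buf.released = True
        # Safe point: the scratch batch has been fully consumed.
        buf.released = True
    if out:
        batches.append(out)
    return batches


def make_buffers():
    return [IoBuffer(["a1", "b2", "c3", "d4"]),
            IoBuffer(["e5", "f6", "g7", "h8"])]
```

In this model, scan(make_buffers(), lambda v: v != "a1", 4, True) raises
on the second buffer, because the output batch fills after one row of the
second scratch batch. With a predicate that keeps every row, the output
batch only fills at scratch batch boundaries and nothing fails, which
matches the note above that some rows must be filtered out.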

Change-Id: Ic27e7251e0f633cb694b506f6eb62beed6e66ad9
Reviewed-on: http://gerrit.cloudera.org:8080/5406
Reviewed-by: Tim Armstrong <tarmstrong@cloudera.com>
Tested-by: Impala Public Jenkins
---
M be/src/exec/hdfs-parquet-scanner.cc
1 file changed, 10 insertions(+), 8 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/5406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic27e7251e0f633cb694b506f6eb62beed6e66ad9
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
