impala-dev mailing list archives

From "Skye Wanderman-Milne (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3659: handle more error cases in ReadWriteUtil::ReadZLong()
Date Fri, 03 Jun 2016 20:53:28 GMT
Skye Wanderman-Milne has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/3305

Change subject: IMPALA-3659: handle more error cases in ReadWriteUtil::ReadZLong()
......................................................................

IMPALA-3659: handle more error cases in ReadWriteUtil::ReadZLong()

This patch modifies ReadZLong() to check for truncated encoded ints and
overly long encoded ints. It also adds new unit test cases.
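
For readers unfamiliar with the two error cases, here is a minimal
sketch of a zigzag varint decoder that reports both. It is not the code
in this patch: DecodeZLong, ZStatus, ZResult and kMaxZLongBytes are
hypothetical names chosen for illustration, and the 10-byte limit is
simply ceil(64 / 7) for a 64-bit value.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical status and result types for this sketch (not Impala's API).
enum class ZStatus { kOk, kTruncated, kTooLong };

struct ZResult {
  ZStatus status;
  int64_t value;    // meaningful only when status == kOk
  size_t consumed;  // bytes read, meaningful only when status == kOk
};

// A 64-bit value needs at most ceil(64 / 7) = 10 varint bytes.
constexpr size_t kMaxZLongBytes = 10;

// Decodes one zigzag-encoded varint from buf[0, len).
// Note: this sketch does not reject extra high bits set in a 10th byte.
ZResult DecodeZLong(const uint8_t* buf, size_t len) {
  uint64_t zz = 0;
  for (size_t i = 0; i < kMaxZLongBytes; ++i) {
    // Ran out of input before seeing a byte without the continuation bit.
    if (i >= len) return {ZStatus::kTruncated, 0, 0};
    const uint8_t b = buf[i];
    zz |= static_cast<uint64_t>(b & 0x7F) << (7 * i);
    if ((b & 0x80) == 0) {
      // Undo zigzag: even values map to non-negative, odd to negative.
      const int64_t v =
          static_cast<int64_t>(zz >> 1) ^ -static_cast<int64_t>(zz & 1);
      return {ZStatus::kOk, v, i + 1};
    }
  }
  // Ten bytes all had the continuation bit set: the encoding is too long.
  return {ZStatus::kTooLong, 0, 0};
}
```

In this sketch the per-byte bounds check sits inside the decode loop,
so it runs once for every encoded byte.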

I ran a local benchmark using the following query:
  set num_scanner_threads=1;
  select max(i) from default.avro_ints_big;

where avro_ints_big is an Avro table with a single int column
containing ~90MM values. With this patch plus the previous IMPALA-3441
patch, total query time increases from 1.6s on trunk to 1.9s (19%
increase), and MaterializeTupleTime increases from 975ms to 1275ms
(31% increase).

Change-Id: I436391a40285d9a6bcef2d112d256d0aa734b888
---
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner-test.cc
M be/src/exec/read-write-util.cc
M be/src/exec/read-write-util.h
M be/src/exec/zigzag-test.cc
M common/thrift/generate_error_codes.py
6 files changed, 148 insertions(+), 50 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/05/3305/1
-- 
To view, visit http://gerrit.cloudera.org:8080/3305
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I436391a40285d9a6bcef2d112d256d0aa734b888
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Skye Wanderman-Milne <skye@cloudera.com>
