impala-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Bharath Vissapragada (Code Review)" <ger...@cloudera.org>
Subject [Impala-CR](cdh5-trunk) IMPALA-3680: Cleanup the scan range state after failed hdfs cache reads
Date Thu, 30 Jun 2016 17:17:58 GMT
Bharath Vissapragada has submitted this change and it was merged.

Change subject: IMPALA-3680: Cleanup the scan range state after failed hdfs cache reads
......................................................................


IMPALA-3680: Cleanup the scan range state after failed hdfs cache reads

Currently we don't reset the file read offset if ZCR fails. Due to
this, when we switch to the normal read path, we hit the eosr of
the scan-range even before reading the expected data length. If both
the ReadFromCache() and ReadRange() calls fail without reading any
data, we end up creating a whole list of scan-ranges, each with size
1KB (DEFAULT_READ_PAST_SIZE) assuming we are reading past the scan
range. This gives a huge performance hit. This patch just calls
ScanRange::Close() after the failed cache reads to clean up the
file system state so that the re-reads start from beginning of
the scan range.

This was hit as a part of debugging IMPALA-3679, where the queries
on 1gb cached data were running ~20x slower compared to non-cached
runs.

Change-Id: I0a9ea19dd8571b01d2cd5b87da1c259219f6297a
Reviewed-on: http://gerrit.cloudera.org:8080/3313
Reviewed-by: Michael Brown <mikeb@cloudera.com>
Tested-by: Bharath Vissapragada <bharathv@cloudera.com>
---
M be/src/runtime/disk-io-mgr-scan-range.cc
M testdata/cluster/node_templates/common/etc/hadoop/conf/hdfs-site.xml.tmpl
M tests/query_test/test_hdfs_caching.py
3 files changed, 83 insertions(+), 5 deletions(-)

Approvals:
  Michael Brown: Looks good to me, approved
  Bharath Vissapragada: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/3313
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I0a9ea19dd8571b01d2cd5b87da1c259219f6297a
Gerrit-PatchSet: 10
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Bharath Vissapragada <bharathv@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Michael Brown <mikeb@cloudera.com>

Mime
View raw message