impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Date Fri, 03 Feb 2017 19:20:06 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5811/5//COMMIT_MSG
Commit Message:

Line 25:   RowIdx and larger pages with offsets that don't fit in 16 bits.
> how much is the memory density reduced?  in the worse case, does this use 2
The impact is mixed. 

In general, density is hurt if tuples are mostly non-null. I think the worst case is an 8x
reduction, when the row has no materialised slots and all memory is therefore devoted to the
bitstring.

My feeling is that adding an extra byte per row is acceptable, even if it amounts to an 8x
regression in the worst case.

In other cases, where tuples are mostly null, memory density can improve by more than 2x. E.g.
if you have a very wide nullable tuple, with many nulls, then the previous encoding is very
inefficient because it runs out of space in the fixed-size bit vector before it runs out of
space in the page.
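
The arithmetic behind both cases can be sketched with a toy model. This is not the
BufferedTupleStream code: the function names and both layouts are assumptions made for
illustration. The old layout is assumed to be a bit vector with one bit per tuple per row,
sized up front as if every tuple were non-null; the new layout is assumed to be a per-row
bitstring rounded up to whole bytes.

```python
import math


def old_rows_per_page(page_bytes: int, num_tuples: int, full_row_bytes: int) -> int:
    """Rows per page under the ASSUMED old layout.

    The bit vector holds one null bit per tuple per row and is sized for rows in
    which every tuple is non-null, so it caps the row count regardless of how
    many tuples are actually null.
    """
    bits_per_row = 8 * full_row_bytes + num_tuples
    return (8 * page_bytes) // bits_per_row


def new_rows_per_page(page_bytes: int, num_tuples: int, actual_row_bytes: int) -> int:
    """Rows per page under the ASSUMED new layout.

    Each row carries its own null bitstring, rounded up to whole bytes, followed
    by only the bytes of its non-null tuples.
    """
    bytes_per_row = math.ceil(num_tuples / 8) + actual_row_bytes
    return page_bytes // bytes_per_row


# Worst case: one nullable tuple with no materialised slots.
# The old layout spends 1 bit per row, the new one 1 byte per row.
worst_old = old_rows_per_page(8192, 1, 0)
worst_new = new_rows_per_page(8192, 1, 0)

# Sparse case: one wide (1024-byte) nullable tuple that is null in about 3 of 4
# rows, so a row averages 256 bytes of tuple data (hypothetical numbers).
sparse_old = old_rows_per_page(65536, 1, 1024)
sparse_new = new_rows_per_page(65536, 1, 256)
```

Under these assumptions the worst case holds 65536 rows per page before and 8192 after, an 8x
reduction. The sparse case holds 63 rows before, capped by the bit vector, and 255 after,
roughly a 4x improvement.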


-- 
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes
