impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-6137: fix text scanner split delim mem mgmt
Date Wed, 01 Nov 2017 21:10:37 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8438 )

Change subject: IMPALA-6137: fix text scanner split delim mem mgmt
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8438/2/be/src/exec/hdfs-text-scanner.cc@717
PS2, Line 717: byte_buffer_last_byte_
> given that this doesn't happen when readsize is 0, is it possible to use a 
The byte_buffer_read_size_ check above prevents using a stale value from a previous byte buffer.

There's a related scenario where I can't convince myself that it is or isn't possible: if
the last FillByteBuffer() call filled zero bytes but the call before that had a split delimiter
at the end of the buffer, we could potentially ignore the split delimiter, even though it
was the last thing we read from the range. We could fix that by tracking whether *any* bytes
were read - if so, it means that byte_buffer_last_byte_ is valid and is the last byte read.

What do you think? Should I try to more explicitly prevent that scenario? I don't think this
change makes it more or less likely to hit.



-- 
To view, visit http://gerrit.cloudera.org:8080/8438
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iddbb5cf6acc8f0b0e0b4c205c334f21e03d06f1c
Gerrit-Change-Number: 8438
Gerrit-PatchSet: 2
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: anujphadke <aphadke@cloudera.com>
Gerrit-Comment-Date: Wed, 01 Nov 2017 21:10:37 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message