impala-reviews mailing list archives

From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5307: Part 4: copy out uncompressed text and seq
Date Tue, 07 Nov 2017 03:19:16 GMT
Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/8172 )

Change subject: IMPALA-5307: Part 4: copy out uncompressed text and seq
......................................................................


Patch Set 12:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc
File be/src/exec/hdfs-scanner-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner-ir.cc@61
PS11, Line 61: return 0;
Is there a reason why we don't treat this as a parse error and return -1?


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.h@347
PS11, Line 347:   /// Returns -1 if parsing should be aborted due to parse errors.
It may help to also add "Returns 0 if copying strings into 'pool' failed".
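
For illustration, the return-value contract under discussion could be sketched as below. This is a standalone toy, not the Impala code: WriteTuplesSketch, FakePool and Row are hypothetical names, and the real function's semantics are whatever the header comment ends up documenting.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a memory pool with a fixed byte budget.
struct FakePool {
  std::size_t bytes_left;
  // Returns false if the string does not fit in the remaining budget.
  bool CopyString(const std::string& s, std::vector<std::string>* out) {
    if (s.size() > bytes_left) return false;
    bytes_left -= s.size();
    out->push_back(s);
    return true;
  }
};

// Hypothetical parsed row; 'parse_error' marks a row that failed to parse.
struct Row {
  std::string value;
  bool parse_error;
};

// Returns the number of rows written.
// Returns -1 if parsing should be aborted due to parse errors.
// Returns 0 if copying strings into 'pool' failed.
int WriteTuplesSketch(const std::vector<Row>& rows, FakePool* pool,
                      std::vector<std::string>* out) {
  int written = 0;
  for (const Row& row : rows) {
    if (row.parse_error) return -1;
    if (!pool->CopyString(row.value, out)) return 0;
    ++written;
  }
  return written;
}
```

In this toy, 0 is also what an empty input returns, so a caller cannot tell "nothing to write" from "copy failed" unless the comment spells the cases out.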


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-scanner.cc@313
PS11, Line 313: TextConverter::CodegenWriteSlot
> I did think about doing that (and prototyped a similar approach with Avro),
Good point. Thanks for adding the comment.


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc
File be/src/exec/hdfs-sequence-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-sequence-scanner.cc@a282
PS11, Line 282: 
It may be trivial, but does it make sense to add DCHECK_EQ(row_batch->num_tuples_per_row(),
1) somewhere? That seems to be the assumption WriteAlignedTuples() makes when updating
the tuple_row_mem pointer.
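
A minimal standalone sketch of the kind of check meant here, assuming a simplified batch layout. RowBatchSketch, Tuple and WriteRowsSketch are hypothetical, and assert stands in for DCHECK_EQ:

```cpp
#include <cassert>
#include <vector>

struct Tuple { int value; };

// Hypothetical stand-in for a row batch: 'tuple_ptrs' holds
// num_rows * num_tuples_per_row pointers, laid out row by row.
struct RowBatchSketch {
  int num_tuples_per_row;
  std::vector<Tuple*> tuple_ptrs;
};

// Writes one tuple pointer per row and advances by a single pointer each
// time. That stride is only correct when every row holds exactly one tuple,
// so the assumption is checked up front.
int WriteRowsSketch(RowBatchSketch* batch, Tuple* tuples, int num_rows) {
  assert(batch->num_tuples_per_row == 1);
  Tuple** tuple_row_mem = batch->tuple_ptrs.data();
  for (int i = 0; i < num_rows; ++i) {
    *tuple_row_mem = &tuples[i];
    ++tuple_row_mem;  // Stride of one pointer, i.e. one tuple per row.
  }
  return num_rows;
}
```

With more than one tuple per row, the single-pointer stride would write into the slots of the wrong row, which is why making the assumption explicit seems worthwhile.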


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc
File be/src/exec/hdfs-text-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@849
PS11, Line 849: tuples_returned
There is a subtle behavior here if Tuple::CopyStrings() fails in WriteAlignedTuples(). In
that case tuples_returned is 0, but we still continue to line 865 below and still deduct
num_fields at line 857. The effect is as if num_tuples * scan_node_->materialized_slots().size()
slots were ignored. Is there a reason why we should continue in this case? Please
also see the comments on the return value of WriteAlignedTuples().
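
One possible shape for the caller, as a sketch only. It assumes that a return value of 0 can only mean that nothing was committed, which the real code may not guarantee. HandleBatchSketch, BatchAction and the parameter names are hypothetical:

```cpp
// Hypothetical outcome of handling one batch of parsed tuples.
enum class BatchAction { kContinue, kStop };

// 'tuples_returned' follows the convention discussed in this review:
// -1 means abort due to parse errors; 0 is ASSUMED here to mean that nothing
// was committed (for example, a failed string copy).
// 'num_fields_remaining' is only reduced when tuples were committed.
BatchAction HandleBatchSketch(int tuples_returned, int num_tuples,
                              int slots_per_tuple, int* num_fields_remaining) {
  if (tuples_returned <= 0) return BatchAction::kStop;
  *num_fields_remaining -= num_tuples * slots_per_tuple;
  return BatchAction::kContinue;
}
```

If 0 can also be returned when no failure occurred, this early exit would need a separate failure signal; the sketch does not address that case.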


http://gerrit.cloudera.org:8080/#/c/8172/11/be/src/exec/hdfs-text-scanner.cc@857
PS11, Line 857: num_tuples
Why is this not tuples_returned here? Is there any guarantee that max_added_tuples >= num_tuples?



-- 
To view, visit http://gerrit.cloudera.org:8080/8172
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I304fd002b61bfedf41c8b1405cd7eb7b492bb941
Gerrit-Change-Number: 8172
Gerrit-PatchSet: 12
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Tue, 07 Nov 2017 03:19:16 +0000
Gerrit-HasComments: Yes
