impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5307: Part 2: copy out strings in uncompressed Avro
Date Mon, 30 Oct 2017 22:03:05 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8146 )

Change subject: IMPALA-5307: Part 2: copy out strings in uncompressed Avro
......................................................................


Patch Set 15:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner-ir.cc
File be/src/exec/hdfs-avro-scanner-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner-ir.cc@a213
PS15, Line 213: 
> If I understand it correctly, this if branch was dead code before this chan
Yeah, I missed cleaning it up in an earlier commit. I mention this in my admittedly gigantic commit message.


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner-ir.cc@51
PS15, Line 51: !tuple->CopyStrings("HdfsAvroScanner::DecodeAvroData()",
             :               state_, string_slot_offsets_.data(), string_slot_offsets_.size(), pool,
             :               &parse_status_))
> nit: tuple->CopyStrings(...) == nullptr
It returns a bool though, since it doesn't reallocate the tuple itself.
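
To illustrate the reply above: a minimal sketch, not Impala's actual code, of a copy routine that repoints string slots in place. Because the slots themselves are never reallocated, there is no new pointer to hand back, and a bool is enough to signal success or failure. `StringRef`, `ToyPool`, and `CopyStringsInPlace` are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a string slot: a pointer plus a length.
struct StringRef {
  const char* ptr;
  std::size_t len;
};

// Toy arena with a byte budget so that allocation failure can be simulated.
class ToyPool {
 public:
  explicit ToyPool(std::size_t limit) : limit_(limit) {}

  // Returns nullptr when the request would exceed the budget.
  char* TryAllocate(std::size_t n) {
    if (n > limit_ - used_) return nullptr;
    used_ += n;
    chunks_.push_back(std::make_unique<char[]>(n));
    return chunks_.back().get();
  }

 private:
  std::size_t limit_;
  std::size_t used_ = 0;
  std::vector<std::unique_ptr<char[]>> chunks_;
};

// Copies each string into 'pool' and repoints the slot at the copy.
// The slots stay where they are, so the caller only needs success/failure.
bool CopyStringsInPlace(StringRef* slots, std::size_t num_slots, ToyPool* pool,
                        std::string* err) {
  for (std::size_t i = 0; i < num_slots; ++i) {
    if (slots[i].len == 0) continue;
    char* copy = pool->TryAllocate(slots[i].len);
    if (copy == nullptr) {
      *err = "could not allocate memory for string copy";
      return false;
    }
    std::memcpy(copy, slots[i].ptr, slots[i].len);
    slots[i].ptr = copy;
  }
  return true;
}
```

With this shape, a caller writes `if (!CopyStringsInPlace(...))` rather than comparing the result against `nullptr`.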


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.h
File be/src/exec/hdfs-avro-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.h@133
PS15, Line 133: //
> nit: /// to be consistent.
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.cc
File be/src/exec/hdfs-avro-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-avro-scanner.cc@1066
PS15, Line 1066: HdfsScanNodeBase* node
> Should this be const HdfsScanNodeBase* ?
Done. This required propagating the const qualifier to a few more places.
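
As a generic illustration of why such a change tends to spread (a sketch with hypothetical names, not the actual Impala classes): once a function takes a pointer to const, every member function it calls through that pointer must itself be const-qualified.

```cpp
#include <cassert>

// Hypothetical stand-in for a scan node; not the real HdfsScanNodeBase.
class ToyScanNode {
 public:
  explicit ToyScanNode(int num_slots) : num_slots_(num_slots) {}

  // Without the trailing 'const', this accessor could not be called through
  // a 'const ToyScanNode*', so the qualifier has to be added here as well.
  int num_slots() const { return num_slots_; }

 private:
  int num_slots_;
};

// Takes a pointer to const: the function promises not to modify the node.
int CountSlots(const ToyScanNode* node) { return node->num_slots(); }
```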


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.h
File be/src/exec/hdfs-scanner.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.h@406
PS15, Line 406:   /// Codegen function to replace InitTuple(). The codegen'd version of InitTuple() is
              :   /// stored in 'init_tuple_fn' if codegen was successful.
> May help to also state the codegen'd version of the function has some const
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/exec/hdfs-scanner.cc@535
PS15, Line 535:       
> nit: indent 4
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/descriptors.h@93
PS15, Line 93: if
> is
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/descriptors.h@109
PS15, Line 109: llvm::Constant* ToIR(LlvmCodeGen* codegen) const;
> Comment: This needs to be updated should the layout of this struct change.
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple-ir.cc
File be/src/runtime/tuple-ir.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple-ir.cc@28
PS15, Line 28: for (int i = 0; i < num_string_slots; ++i) {
> Not sure if it will help but have you tried #pragma unroll hint here to see
I tried a couple of queries but didn't see a noticeable difference in perf.
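
For reference, a minimal sketch of what such a hint looks like on a loop of the same shape. The function and its `lens` parameter are hypothetical and are not the code under review. The pragma is guarded because `#pragma unroll` is a Clang extension; it is only a request to the optimizer, and when the trip count is already a compile-time constant the compiler may unroll the loop without it, which would be consistent with seeing no measurable difference.

```cpp
#include <cassert>
#include <cstddef>

// Sums per-slot string lengths. 'lens' and 'num_string_slots' are
// illustrative parameters chosen for this example only.
std::size_t TotalStringBytes(const std::size_t* lens, int num_string_slots) {
  std::size_t total = 0;
#if defined(__clang__)
#pragma unroll
#endif
  for (int i = 0; i < num_string_slots; ++i) {
    total += lens[i];
  }
  return total;
}
```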


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.h
File be/src/runtime/tuple.h:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.h@47
PS15, Line 47: /// Generate an LLVM Constant containing the offset values of this SlotOffsets instance.
> Please also comment that this function needs to be updated if the layout of
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.h@204
PS15, Line 204: //
> nit: ///
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc
File be/src/runtime/tuple.cc:

http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@404
PS15, Line 404: materialize_strings_fn
> nit: Using the name copy_strings_fn will be more consistent.
Thanks for catching this, I missed this one place.


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@435
PS15, Line 435: Constant*
> Not your change but I feel it's generally less confusing to include llvm:: 
I removed the "using namespace llvm" in this file and added llvm:: to the appropriate places.


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@435
PS15, Line 435: slot_offset_constants
> nit: 'slot_offset_ir_constants' may make it easier to follow.
Done


http://gerrit.cloudera.org:8080/#/c/8146/15/be/src/runtime/tuple.cc@436
PS15, Line 436:   for (SlotDescriptor* slot_desc : desc.string_slots()) {
              :     SlotOffsets offsets = {slot_desc->null_indicator_offset(), slot_desc->tuple_offset()};
              :     slot_offset_constants.push_back(offsets.ToIR(codegen));
              :   }
              : 
              :   Constant* constant_slot_offsets = codegen->ConstantsToGVArrayPtr(
              :       slot_offsets_type, slot_offset_constants, "slot_offsets");
              :   Constant* num_string_slots =
              :       ConstantInt::get(codegen->int_type(), desc.string_slots().size());
> I think it may be helpful to add a comment on what line 435 - 444 is trying
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8146
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If1fc78790d778c874f5aafa5958c3c045a88d233
Gerrit-Change-Number: 8146
Gerrit-PatchSet: 15
Gerrit-Owner: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarshall@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Mon, 30 Oct 2017 22:03:05 +0000
Gerrit-HasComments: Yes
