impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-2281: Replace FNV with FastHash in exchange nodes
Date Sat, 04 Nov 2017 00:28:16 GMT
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/8417 )

Change subject: IMPALA-2281: Replace FNV with FastHash in exchange nodes
......................................................................


Patch Set 2:

(12 comments)

Did a first pass over it.

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py
File be/src/codegen/gen_ir_descriptions.py:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/codegen/gen_ir_descriptions.py@99
PS2, Line 99:   ["HASH_FAST", "IrFastHash"],
Do we use this? I don't see any places where it's used currently.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc
File be/src/runtime/data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@a468
PS2, Line 468: 
Lol, weird.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/data-stream-sender.cc@474
PS2, Line 474: partition_expr_evals_[j
Use 'eval' directly.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc
File be/src/runtime/raw-value.cc:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/runtime/raw-value.cc@220
PS2, Line 220:       return HashUtil::FastHash64(v, static_cast<size_t>(type.GetByteSize()),
seed);
I think this might be slightly slower - with the previous approach I think we get a specialised
version of the hash function for that input length for each data type, whereas here we have
to go through another switch in GetByteSize().


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h
File be/src/util/hash-util.h:

http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@235
PS2, Line 235:   /* The MIT License
I think this should go at the top of the file beneath the apache license. Then we can just
say that the FastHash64 implementation came with that license. E.g.

/* 
  FastHash64 implementation derived from MIT-licensed code written by Zilong Tan

  The MIT License
  ...
*/


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@263
PS2, Line 263: size_t
use int64_t - we generally use signed integers for lengths.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@266
PS2, Line 266: (const uint64_t *)
We use c-style casts - i.e. reinterpret_cast<>


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@267
PS2, Line 267: pos
I believe the C++ standard doesn't allow pointer arithmetic on void - we should convert to
uint8_t for doing that.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@278
PS2, Line 278:  (const unsigned char*
See above comments.


http://gerrit.cloudera.org:8080/#/c/8417/2/be/src/util/hash-util.h@282
PS2, Line 282: (uint64_t)
should use static_cast<uint64_t>()


http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test
File testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test:

http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/functional-query/queries/QueryTest/nested-types-runtime.test@145
PS2, Line 145: 8,'k1',-1,'k1',1
Why does this make a difference when we have VERIFY_IS_EQUAL_SORTED above?


http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test
File testdata/workloads/tpcds/queries/tpcds-q77a.test:

http://gerrit.cloudera.org:8080/#/c/8417/2/testdata/workloads/tpcds/queries/tpcds-q77a.test@125
PS2, Line 125: 'catalog channel',NULL,538912.55,2050279.74,-1383554.73
I'll file a JIRA to make this test deterministically pass: IMPALA-6155. In the meantime, can
you change this to

   ---- RESULTS: VERIFY_IS_EQUAL_SORTED

That will make it ignore the order of results.



-- 
To view, visit http://gerrit.cloudera.org:8080/8417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I778317d982dcdb94173a369a65b39f32b4f7ded2
Gerrit-Change-Number: 8417
Gerrit-PatchSet: 2
Gerrit-Owner: Tianyi Wang <twang@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Sat, 04 Nov 2017 00:28:16 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message