impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jim Apple (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-889: Add support for an ISO-SQL compliant trim() function.
Date Tue, 04 Oct 2016 03:46:44 GMT
Jim Apple has posted comments on this change.

Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function.
......................................................................


Patch Set 6:

(32 comments)

http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG
Commit Message:

Line 10:   specified direction(s) of a STRING value.
> Greetings, Jim.
Then change the name of the new function away from btrim to something else.


http://gerrit.cloudera.org:8080/#/c/4474/6//COMMIT_MSG
Commit Message:

PS6, Line 14: Syntax #3 confirms the standard SQL syntax (Core SQL feature ID
All of them are part of the standard, as is simply TRIM(string_to_be_trimmed).

http://savage.net.au/SQL/sql-92.bnf.html


PS6, Line 18: |
just use regular prose here: Valid options are "leading", "trailing", and "both"


PS6, Line 22: empty argument
What does "empty argument" mean in this case?


PS6, Line 22: NULL
How is a NULL passed? Is it literally just "trim(NULL from 'foo')"?


PS6, Line 23: returns
not just "returns", "TRIM returns". Here and below.


PS6, Line 25: "trim_char_set": Case-sensitive characters to be removed. This
Please separate paragraphs by a blank line.


PS6, Line 29: " "
Is the standard space or all whitespace, including tabs, newlines, etc.?


PS6, Line 32: Blank
How is a "blank" argument different than an "empty argument" above?


PS6, Line 33: ote: For Syntax #1, since no "characters" is specified, it trims
            : " "(space) by default. For Syntax #2, since no "where" is specified,
            : the option both is implied by default.
Please split this up and put it above in the text on each option.


PS6, Line 44: @&@&@&@
This is hard to read. Can you stick to alphanumeric characters in these examples, please?
The smaller the input that illustrates the point, the clearer it is.

This applies to tests, as well.


http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

Line 1977:   TestStringValue("trim(leading FROM '     &@&127+  &@   ')", "&@&127+
 &@   ");
Please try to have each test case test one thing different from every other test case. What
a case is testing should be clear by looking at it. The smaller the test case is, the better.

As an example, would this test case be better as "(trim leading FROM ' a b ')"?


http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

Line 788: static inline bitset<256>* TrimInternalPrepare(FunctionContext* ctx,
Please use unnamed namespaces; see https://google.github.io/styleguide/cppguide.html


PS6, Line 805: bitset
const


PS6, Line 806: &
Out parameters, when used, are not references. https://google.github.io/styleguide/cppguide.html#Reference_Arguments


PS6, Line 806: begin
prefer to return, not pass out parameters.


PS6, Line 807: test
Please use [], not test - we don't need the bounds checking.


PS6, Line 807: static_cast<int>
What type is str.ptr[begin]? What type does test take? Do you need this cast? Why or why not?


PS6, Line 813: int &
begin is not modified.


PS6, Line 822: bitset
const


PS6, Line 824: int32_t
Don't use int32_t and int interchangeably without a specific reason


PS6, Line 835: str
The original functions made a new string val, rather than returning this one. Do you know
why?


Line 848:     option.ptr[i] = std::tolower(option.ptr[i]);
Though awkward, please insert the required casts here.


PS6, Line 851: TrimOption
This should be done at parsing/analysis time, for efficiency reasons


PS6, Line 855: INVALID
LEADING = 1, TRAILING = 2, BOTH = LEADING | TRAILING, INVALID = ~BOTH. The comparison on line
886 gets simplified.


PS6, Line 858: opt_
https://google.github.io/styleguide/cppguide.html#Variable_Names


PS6, Line 861: ( 
Please use clang-format to format your code.


PS6, Line 861: 7
try to avoid repeating literals, like 7. Maybe set static constexpr char LEADING[] = "leading",
then check if option.len == sizeof(LEADING)-1? See how that looks.


PS6, Line 863: both
bad copy


PS6, Line 866: both
bad copy


http://gerrit.cloudera.org:8080/#/c/4474/1/be/src/exprs/string-functions.h
File be/src/exprs/string-functions.h:

PS1, Line 123: Base6
And while you're changing the name, no need to call this BTrim.


http://gerrit.cloudera.org:8080/#/c/4474/1/common/function-registry/impala_functions.py
File common/function-registry/impala_functions.py:

PS1, Line 478: ezza 
This is the string you should change. Change it to something else, like sql92trim, then make
that invisible.


-- 
To view, visit http://gerrit.cloudera.org:8080/4474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Youwei Wang <youwei.a.wang@intel.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Youwei Wang <youwei.a.wang@intel.com>
Gerrit-HasComments: Yes

Mime
View raw message