impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Michael Ho (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-4729: Implement REPLACE()
Date Tue, 31 Jan 2017 17:13:13 GMT
Michael Ho has posted comments on this change.

Change subject: IMPALA-4729: Implement REPLACE()
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5776/5/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

PS5, Line 249:  while (match_pos + needle.len <= haystack.len) {
             :       // Copy in original string
             :       const int bytes = match_pos - consumed;
             :       memcpy(ptr, &haystack.ptr[consumed], bytes);
             :       DCHECK_LE(ptr - result.ptr + bytes, buffer_space);
             :       ptr += bytes;
             : 
             :       // Copy in replacement - always safe since we always leave room for one
more replace
             :       DCHECK_LE(ptr - result.ptr + replace.len, buffer_space);
             :       memcpy(ptr, replace.ptr, replace.len);
             :       ptr += replace.len;
             : 
             :       // Don't want to re-match within already replaced pattern
             :       match_pos += needle.len;
             :       consumed = match_pos;
             : 
             :       // If we had an enlarging pattern, we may need more space
             :       if (UNLIKELY(check_space)) {
             :         const int bytes_produced = ptr - result.ptr;
             :         const int bytes_remaining = haystack.len - consumed;
             :         if (UNLIKELY(bytes_produced + bytes_remaining + replace.len > buffer_space))
{
             :           // Ran out of space, double the size.  See the note above regarding
the choice
             :           // of a power of two sized buffer.
             :           buffer_space <<= 1;
             :           DCHECK((buffer_space & (buffer_space - 1)) == 0);
             :           const auto ofs = ptr - result.ptr;
             :           if (UNLIKELY(!result.Resize(context, buffer_space))) {
             :             return StringVal::null();
             :           }
             :           ptr = result.ptr + ofs;
             :         }
             :       }
             : 
             :       StringValue haystack_substring = haystack.Substring(match_pos);
             :       int match_pos_in_substring = search.Search(&haystack_substring);
             :       if (match_pos_in_substring < 0) break;
             :       match_pos += match_pos_in_substring;
             :   }
> Is this more performant than simply using string::append() or string += ope
Please ignore my previous comment. We need to keep the memory allocations in a freepool. 
Ideally, we should use StringBuffer if possible but that interface works with MemPool only.
May be we can consider adapting its interface to take a Resize() function pointer instead
so it can be shared by both FreePool and MemPool.


-- 
To view, visit http://gerrit.cloudera.org:8080/5776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1780a7d8fee6d0db9dad148217fb6eb10f773329
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <zamsden@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.behm@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Michael Ho <kwho@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Zach Amsden <zamsden@cloudera.com>
Gerrit-HasComments: Yes

Mime
View raw message