impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Zoltan Borok-Nagy (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3703: Store query context in thread-local variables
Date Tue, 28 Nov 2017 21:31:50 GMT
Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/8621
)

Change subject: IMPALA-3703: Store query context in thread-local variables
......................................................................


Patch Set 4:

(9 comments)

Thanks for the comments.
I'll continue working on this after further discussions.

http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h
File be/src/common/thread-info.h:

http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@40
PS4, Line 40: class InfoTable {
> Is this really a table? It looks more like some string buffer, or map. Mayb
Yeah, it does sound like a more reasonable name.


http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@56
PS4, Line 56: AddExtraInfo
> Will we always have (key,value) pairs? Since space is scarce here, I'd sugg
The idea was to store some description of the thread that is easy to read from a debugger.
Maybe we could store lines separated by newline characters, giving more freedom to the users.


http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@74
PS4, Line 74: instance_id
> nit: missing _
oops, will fix it in next commit.


http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@87
PS4, Line 87:  const char* GetExtraInfo() const { return text_; }
            :   const char* GetQueryId() const { return query_id_; }
            :   const char* GetInstanceId() const { return instance_id_; }
            :   const char* GetThreadName() const { return thread_name_; 
> where are these used?
Currently nowhere. I thought they will be useful somewhere. Maybe for prefixing log messages
with thread name or instance id.


http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@103
PS4, Line 103:   static constexpr int64_t MAX_TEXT_SIZE = 4096;
> Is there a reason why you chose this size? If Breakpad estimates that the o
I wanted to choose a size that will be large enough for the purpose, so I chose the linux
page size.
But, it might be too large then, I can reduce it, or I can eliminate this free text info feature
and only the special-purpose members would remain.


http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@104
PS4, Line 104:   static constexpr int64_t TUNIQUE_ID_STRING_SIZE = 64;
> unique id's are usually printed into 33 characters. Is there a reason you c
I wanted it to be power of two, and also thought that it might be getting bigger in the future.
Note that query_id_ and instance_id_ are strings only because it is easier to read during
a debug session, and also if we want to prefix the log messages with the instance id later,
then maybe it's good that we have it as a string already.
But maybe storing them as TUniqueIds would also be feasible.


http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@110
PS4, Line 110:   char query_id_[TUNIQUE_ID_STRING_SIZE] = {};
> The fragment instance ID is actually the query id plus an index (see Impala
I didn't know that, OK, will get rid of it in the next commit.


http://gerrit.cloudera.org:8080/#/c/8621/4/be/src/common/thread-info.h@112
PS4, Line 112:   char thread_name_[THREAD_NAME_SIZE] = {};
> We already have a copy of the thread name in the same scope as the thread-l
If you think of the name_copy variable, note that it is a std::string. It only contains the
thread name on the stack if it is small enough and Small String Optimization is used. Otherwise,
it won't appear in the minidump.


http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/8621/2/be/src/runtime/query-state.cc@378
PS2, Line 378:   thread_info_table->SetQueryId(fis->query_id());
> Ultimately we're interested in what query a thread is working on and it wou
OK, I agree.
I think this parent-relation can work for thread pools also. The parent-child hierarchy would
represent the task-tree, not the thread creation tree (which is not that interesting if threads
are created in a pool).



-- 
To view, visit http://gerrit.cloudera.org:8080/8621
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I566f7f1db5117c498e86e0bd05b33bdcff624609
Gerrit-Change-Number: 8621
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Lars Volker <lv@cloudera.com>
Gerrit-Reviewer: Philip Zeyliger <philip@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <boroknagyz@cloudera.com>
Gerrit-Comment-Date: Tue, 28 Nov 2017 21:31:50 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message