impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "John Russell (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-3200: [DOCS] Document user-facing aspects of new buffer pool
Date Thu, 05 Oct 2017 00:27:39 GMT
John Russell has posted comments on this change. ( http://gerrit.cloudera.org:8080/8003 )

Change subject: IMPALA-3200: [DOCS] Document user-facing aspects of new buffer pool
......................................................................


Patch Set 4:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml
File docs/topics/impala_buffer_pool_limit.xml:

http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@39
PS4, Line 39:       Defines the size in bytes of an internal buffer for allocating memory
during queries.
> First sentence isn't accurate. Something like this might be better: "Define
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@53
PS4, Line 53: MiB
MB. I may have mixed up MB / KB and MiB / KiB in a few places throughout. MB / KB = powers
of 2, MiB / KiB = powers of 10.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@53
PS4, Line 53: his default is represented by a value
            :       of 0.
> The default is actually that the query option is unset. 0 means a limit of 
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_buffer_pool_limit.xml@69
PS4, Line 69: set buffer_pool_limit=8GB;
> It can also be expressed as a percentage of mem_limit (e.g. 80%), which may
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml
File docs/topics/impala_default_spillable_buffer_size.xml:

http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@44
PS4, Line 44:     <p conref="../shared/impala_common.xml#common/type_integer"/>
> Do we document somewhere the supported byte suffixes (e.g. B, KB, MB, GB pl
Done. I'll reuse wording to that effect from the MEM_LIMIT page. I trust that of these new
query options, only BUFFER_LIMIT accepts a percentage, the others are purely sizes in bytes,
correct?


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@49
PS4, Line 49:     <p>
> I think it's confusing to say that the default varies, since the query opti
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@50
PS4, Line 50:       <codeph>65536</codeph> to <codeph>2097152</codeph>,
depending on the
> Feel free to ignore, but might be easier to read if they were human-readabl
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@59
PS4, Line 59:       consider increasing the <codeph>DEFAULT_SPILLABLE_BUFFER_SIZE</codeph>
setting.
> Would it be helpful to elaborate here? E.g. "Larger buffer sizes will resul
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_default_spillable_buffer_size.xml@61
PS4, Line 61:     </p>
> It's implied but would it make sense to state the inverse. I.e. decreasing 
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml
File docs/topics/impala_max_row_size.xml:

http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@47
PS4, Line 47: 524288
> 512K to be human-readable?
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@56
PS4, Line 56: accomodate
> sp: accommodate
That and 'separate' vs. 'seperate' are my kryptonite.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@57
PS4, Line 57:       the total bytes stored in the largest row.
> It's left ambiguous whether queries processing rows larger than MAX_ROW_SIZ
Good point. In my testing I tried to come close to the boundary with a 530 K row size and
was surprised that the query succeeded.

Oops, I see this ties in to the initial description of the option purpose, which I left blank
by mistake. I'll fill that in above.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@82
PS4, Line 82: 500 KB string and a 30 KB
These are actually KiB. I didn't mean to actually cut it so close (530,000 vs. 524,288 limit).
I couldn't make this slightly-too-large row cause a failure.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@95
PS4, Line 95: KiB
Agh, KiB again where it should be KB.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_max_row_size.xml@109
PS4, Line 109: Increase the max_row_size query option (currently 512.00 KB) to process larger
rows.
Wrap these very long error message lines.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_min_spillable_buffer_size.xml
File docs/topics/impala_min_spillable_buffer_size.xml:

http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_min_spillable_buffer_size.xml@38
PS4, Line 38:       <indexterm audience="hidden">MIN_SPILLABLE_BUFFER_SIZE query option</indexterm>
> I think most of the comments I made on DEFAULT_SPILLABLE_BUFFER_SIZE have a
Done. Please doublecheck whether the wording I lifted from the DEFAULT_ page also makes sense
here. E.g. is it even possible to lower the MIN_ setting to less than 64 KB, or could you
only ever increase it?


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml
File docs/topics/impala_scalability.xml:

http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@425
PS4, Line 425: allocated
> it's reserved at the beginning of the query, not allocated. The distinction
Done


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@560
PS4, Line 560:         The infrastructure of the spilling feature affects the way the affected
SQL operators, such as
> A lot of the discussion in this paragraph and the one below is describing t
OK, I will summarize the new situation and then comment out the old discussion (in case we
need to rescue any details from it later.)


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@615
PS4, Line 615: >BlockMgr.BytesWritten
> This still needs updating - IMPALA-5656 - should I keep an eye out for a di
Nah, there's only a single reference to this name so I'll fix IMPALA-5656 as part of this
gerrit.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@701
PS4, Line 701:         <b>Testing performance implications of spilling to disk:</b>
> This section is stale too. The scenario still seems valid but the profile o
OK, I'll skip over the output piece and see if users ask for more details to be added back.


http://gerrit.cloudera.org:8080/#/c/8003/4/docs/topics/impala_scalability.xml@745
PS4, Line 745: Do not specify a memory limit lower than about 300 MB, because with such a
             :         low limit, queries could fail to start for other reasons. Now try the
memory-intensive query again.
> This is one of the things that was fixed :)
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/8003
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49323f8ffbff3e195058e88762eedbb1fcb1bc0e
Gerrit-Change-Number: 8003
Gerrit-PatchSet: 4
Gerrit-Owner: John Russell <jrussell@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: John Russell <jrussell@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokhtar@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Oct 2017 00:27:39 +0000
Gerrit-HasComments: Yes

Mime
  • Unnamed multipart/alternative (inline, 8-Bit, 0 bytes)
View raw message