impala-reviews mailing list archives

From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3202: refactor scratch file management into TmpFileMgr
Date Mon, 07 Nov 2016 19:57:55 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202: refactor scratch file management into TmpFileMgr

Patch Set 1:

File be/src/runtime/

PS1, Line 308: EXPECT_TRUE
> ASSERT_TRUE? Otherwise we will hit a NPE in DeleteBlocks().
Makes sense. This pattern of always using EXPECT is unfortunately everywhere in the unit tests.
Mostly I think ASSERT makes the most sense. I just did a search-and-replace on this file to
change it everywhere in this file where I could (it's only allowed in functions that return
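
A rough sketch of the difference being discussed, for illustration only. It does not use gtest itself: the SKETCH_* macros, Block, and AllocateBlock are hypothetical stand-ins that mimic the non-fatal and fatal behaviour. In gtest, the fatal ASSERT_* macros return from the current function on failure, which is why they can only be used in functions that return void.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for gtest's macros, for illustration only.
std::vector<std::string> g_failures;  // failed conditions, in order

// Non-fatal: record the failure and keep executing the test body.
#define SKETCH_EXPECT_TRUE(cond) \
  do { if (!(cond)) g_failures.push_back("EXPECT: " #cond); } while (0)

// Fatal: record the failure and return from the enclosing function.
// The bare `return;` is why this only compiles in functions returning void.
#define SKETCH_ASSERT_TRUE(cond) \
  do { if (!(cond)) { g_failures.push_back("ASSERT: " #cond); return; } } while (0)

struct Block { int id; };

// Hypothetical allocator that fails and hands back a null pointer.
Block* AllocateBlock() { return nullptr; }

// With EXPECT, execution continues past the failed check, so the
// dereference has to be guarded by hand to avoid a crash.
void TestWithExpect(bool* reached_use) {
  Block* block = AllocateBlock();
  SKETCH_EXPECT_TRUE(block != nullptr);
  *reached_use = true;  // still runs after the failed check
  if (block != nullptr) block->id = 1;
}

// With ASSERT, the test function returns at the failed check, so the
// code below never sees the null pointer.
void TestWithAssert(bool* reached_use) {
  Block* block = AllocateBlock();
  SKETCH_ASSERT_TRUE(block != nullptr);
  *reached_use = true;  // not reached when block is null
  block->id = 1;
}
```

Calling both functions with a flag initialised to false leaves the flag set after TestWithExpect and unset after TestWithAssert, with one recorded failure each.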

File be/src/runtime/

Line 246:       ->Init(state->io_mgr(), tmp_file_mgr, profile, parent, mem_limit, scratch_limit);
> This line break looks odd. If it was done by clang-format I'd keep it, but 
I agree that it looks weird. clang-format did it.

File be/src/runtime/

Line 282:   DCHECK_EQ(tmp_files_.size(), 0);
> nit: DCHECK(tmp_files_.empty());

PS1, Line 290: by ignoring the return status of
             :     // NewFile().
> we don't seem to actually ignore it, the comment looks wrong.
True, made the comment more accurate.

PS1, Line 296: spilling
> "No scratch directories..."? We seem to use 'tmp', 'scratch', and 'spilling
Good point, users have been confused by this before when the mixed terminology leaked into
error messages: IMPALA-3866.

We need to keep scratch as the user-facing term to match the command-line option, so I fixed
the "spilling" term here.

We could rename TmpFileMgr, etc. to ScratchFileMgr, but I'm unsure if it's worth the code
churn. What do you think?

Line 335:       scratch_space_bytes_used_counter_->Add(num_bytes);
> can we handle current_bytes_allocated_ here, too? That's also the only dire
That works out cleaner, thanks.

Line 336:       return Status::OK();
> nit: this could just be "return status".
I prefer it this way since it's more obvious that it's a "successful" return path. I don't
feel that strongly, but I find this easier to parse.
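
For reference, a minimal sketch of the two spellings under discussion. The Status class, TryAllocate, and both Allocate* functions here are hypothetical and are not Impala's actual code. The two variants behave identically, so the choice is purely about readability.

```cpp
#include <string>
#include <utility>

// Hypothetical minimal Status type; not Impala's actual class.
class Status {
 public:
  static Status OK() { return Status(); }
  explicit Status(std::string msg) : ok_(false), msg_(std::move(msg)) {}
  bool ok() const { return ok_; }
  const std::string& msg() const { return msg_; }

 private:
  Status() : ok_(true) {}
  bool ok_;
  std::string msg_;
};

// Hypothetical allocation step: fails when the request exceeds the space left.
Status TryAllocate(long num_bytes, long bytes_left) {
  if (num_bytes > bytes_left) return Status("not enough scratch space");
  return Status::OK();
}

// Variant A: the success path is spelled out with Status::OK().
Status AllocateExplicit(long num_bytes, long bytes_left, long* bytes_used) {
  Status status = TryAllocate(num_bytes, bytes_left);
  if (status.ok()) {
    *bytes_used += num_bytes;
    return Status::OK();  // reads unambiguously as the successful exit
  }
  return status;
}

// Variant B: return the local status, which is known to be OK at this point.
Status AllocateCompact(long num_bytes, long bytes_left, long* bytes_used) {
  Status status = TryAllocate(num_bytes, bytes_left);
  if (status.ok()) {
    *bytes_used += num_bytes;
    return status;  // same value as Status::OK() here
  }
  return status;
}
```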

Line 351:     err_status.MergeStatus(errs[i]);
> nit: single line

File be/src/runtime/tmp-file-mgr.h:

PS1, Line 133:  
> nit: double space

PS1, Line 136: query_id
> does this actually have to be the query_id or could we rename it to somethi
I don't think it will ever be anything other than the query id, but maybe "unique_id" makes
more sense.

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I0c56c195f3f28d520034f8c384494e566635fc62
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Lars Volker <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes
