impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dan Hecht (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3202,IMPALA-2079: rework scratch file I/O
Date Wed, 21 Dec 2016 00:16:21 GMT
Dan Hecht has posted comments on this change.

Change subject: IMPALA-3202,IMPALA-2079: rework scratch file I/O

Patch Set 20:

File be/src/runtime/

Line 633:       // The block's buffer is still in memory with the original data.
... but we couldn't get additional reservation. However, we can use 'release_block's reservation,
so reclaim the block's buffer.

or similar

Line 656: 
// FindBufferForBlock() wasn't able to find a buffer so transfer the one from 'release_block'.

Line 845:     if (block->write_handle_ != NULL) {
how could write_handle_ be NULL here?  anyway, okay with leaving this if-stmt alone for this
legacy code.
File be/src/runtime/buffered-block-mgr.h:

Line 363:   int64_t GetNumWritesOutstanding();
since the test class is a friend, can this be made private?

PS20, Line 403: pinned state
but this method doesn't actually set the pin bit on the block...
isn't this method really just undoing any side effects of an in-flight write? so maybe CancelInFlightWrite()
or CancelWrite() or RestoreBufferContents() or something?

PS20, Line 458: or
double or
File be/src/runtime/

PS20, Line 1167: with

Line 1173:                                      write_range->file_, errno, GetStrErrMsg())));
weird line break
File be/src/runtime/tmp-file-mgr-internal.h:

Line 83: 
> We could, but then we'd have to maintain a sparse map of 'disk_id' -> 'devi
is each explained somewhere? so disk_id_ is a disk-io-mgr concept, while DeviceId is an internal
to tmp-file-mgr concept to get a dense numbering of scratch disks, right? let's make sure
that's clear somewhere (maybe it is already in the main header).
File be/src/runtime/tmp-file-mgr-internal.h:

Line 41:   /// The physical file is created on the first call to AllocateSpace().
is that true?

PS20, Line 43:  

Line 93:   /// blacklisted, the corresponding device will always be blacklisted.
hmm i don't see how a device is blacklisted.
let's add some verification debug code to make sure these stay in sync.
File be/src/runtime/

Line 404:   // Don't hold 'lock_' outside AllocateSpace() to minimise contention.
not sure what this comment is telling me.

PS20, Line 499: Log the error before retrying so that the failure isn't swallowed silently.
is this referring to the old LogError() code, or the scratch_errors_?

Line 507:   handle->file_->ReportIOError(write_status.msg());
this does the file-level blacklist. But i don't see where the device-level blacklist happens...
File be/src/runtime/tmp-file-mgr.h:

Line 330:     /// so no locking is required. This is a terminal lock and should not be held
> I think it's an orthogonal issue - which is that all of error logging is do
There's nothing really about this class that makes it need to have query scope either, right?
anyway, great that we can remove it -- that's the best solution.
File be/src/runtime/tmp-file-mgr.h:

PS20, Line 40: local
delete, or say (usually local) or similar

PS20, Line 48: Temporary files are managed via a FileGroup. To write to a temporary file,
first a
             : /// FileGroup is created, then FileGroup::Write() is called to asynchronously
write a
             : /// memory buffer to disk.
This still kinda makes it sound like the user of this class can care about the file itself.
 How about saying:

FileGroups manage temporary file space across the multiple devices. To write to the temporary
file space, first a FileGroup is created, then ...

PS20, Line 78: It is used as a handle for external classes to identify devices.
if this is only used externally for tests, I think we should say something different.  it
sounds like the value of this is mostly internally to have a dense numbering of scratch devices.

PS20, Line 167: w
handle->write_in_flight_ ?

but those things happen in Handle::SetWriteComplete() so maybe reword.

Line 276:     void Cancel();
does this need to be exposed, or could all callers just use CancelWriteAndRestoreData()? This
isn't included as part of the lifecycle.

PS20, Line 300: SetWriteComplete
"Set" makes it sound like it just sets state.  Maybe just call this WriteComplete() too.
File be/src/util/buffer-ref.h:

PS20, Line 27: Ref
what does "Ref" stand for? Reference?  This might get confused with BufferHandle, which is
really the reference to a (buffer pool) buffer.  This is more like a memory range. Maybe "BufferRange"
or "BufRange" or MemRange?

And how about using this in SubAllocation? (can be another commit)
File be/src/util/disk-info.h:

Line 91:   static int num_datanode_dirs_;
dead code also?

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17
Gerrit-PatchSet: 20
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message