impala-reviews mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Tim Armstrong (Code Review)" <>
Subject [Impala-ASF-CR] IMPALA-3202,IMPALA-2298: rework scratch file I/O
Date Fri, 16 Dec 2016 17:28:49 GMT
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3202,IMPALA-2298: rework scratch file I/O

Patch Set 15:


I wasn't planning in the immediate future to expose the allocation policy, since that would
complicate the interface and not provide an immediate benefit. I don't think there's anything
that precludes us from doing it at a later point: we'd need to separate out the scratch allocator
from the I/O (E.g. FileGroup and TmpFileIoContext) and have a way to pass the scratch range
into Write(). Then probably have some kind of callback to allocate a new range if we hit an
Commit Message:

PS15, Line 7: IMPALA-2298
> this one is already marked as resolved. is it the right jira?
It was a duplicate - fixed

Line 54: the remaining data (72M) to /tmp.
> this might be a good py.test. what do you think?
I couldn't think of a way to achieve this without admin privileges to the machine, which we
can't assume we have. Users probably also wouldn't want our test suite messing around with
their fstab.
File be/src/runtime/buffered-block-mgr.h:

Line 177: 
> this could use a comment.
File be/src/runtime/tmp-file-mgr.h:

PS15, Line 45: File
> given that File is no longer exposed, make it "file". Or reword this now th

PS15, Line 48: a 
> delete or make singular

Line 51: /// data once the write finishes.
> does the read on WriteHandle block if the write wasn't complete yet, or how
Reworded to clarify how the callback fits in here.

PS15, Line 53: The files are
             : /// allocated lazily upon write.
> lazily with respect to what?  this also seems redundant with the first para

Line 56: /// WriteHandle.
> these two paragraphs kind of mix together the class abstraction and the imp
Tried to reorganise these paragraphs so they start off with abstraction then discuss implementation.

Line 57: ///
> also, reading through the code (though I haven't got through it all yet), t
That is TODO for the next patch - I didn't want to tackle recycling of variable-length scratch
ranges in this patch, which isn't needed for BufferedBlockMgr (its previous behaviour was
to always allocate max_block_size_ ranges).

Added a TODO up the top here.

Line 62:   class File;
> is this public declaration needed?
Yeah it's needed for TmpFileMgrTest unfortunately. There wasn't an easy workaround I could
find because of how gtest creates a separate class for each test.

Line 69:   typedef boost::function<void(const Status&)> WriteDoneCallback;
> how does the callback know which write was completed?
boost::function is really a function pointer plus a data payload, so whoever constructs the
function can include a pointer to whatever data they need.

I could rework as a plain function pointer with arguments (void* context, WriteHandle* handle,
const Status&) or similar - I was just copying DiskIoMgr's interface here.

Line 96:     /// data).
> this last sentence could use clarification that the rewrite is by the calle
Clarified the rewrite.

I intended the part about "the memory referenced must remain valid" to communicate the ownership
requirements. The buffer ownership isn't transferred. Also added an additional sentence to
make it explicit that callers shouldn't modify the buffer while the write is in flight.

PS15, Line 99:  
> be

PS15, Line 102: CancelAndRestoreData
> CancelWriteAndRestoreData
File be/src/util/buffer-ref.h:

Line 26: /// a separate pointer and length.
> does this go away once "everything" is switched to using BufferPool BufferH
No, since this could refer to a portion of a buffer (or really any range of memory).

To view, visit
To unsubscribe, visit

Gerrit-MessageType: comment
Gerrit-Change-Id: I8c9c587df006d2f09d72dd636adafbd295fcdc17
Gerrit-PatchSet: 15
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <>
Gerrit-Reviewer: Dan Hecht <>
Gerrit-Reviewer: Tim Armstrong <>
Gerrit-HasComments: Yes

View raw message