Mailing-List: contact reviews-help@impala.incubator.apache.org; run by ezmlm
Precedence: bulk
Message-Id: <201705192130.v4JLU5qt007521@ip-10-146-233-104.ec2.internal>
Date: Fri, 19 May 2017 21:30:05 +0000
From: "Joe McDonnell (Code Review)" <gerrit@cloudera.org>
To: impala-cr@cloudera.com, reviews@impala.incubator.apache.org
CC: Dan Hecht <dhecht@cloudera.com>, Marcel Kornacker <marcel@cloudera.com>,
        Tim Armstrong <tarmstrong@cloudera.com>
Reply-To: joemcdonnell@cloudera.com
Subject: =?UTF-8?Q?=5BImpala-ASF-CR=5D_IMPALA-4623=3A_Enable_file_handle_cache=0A?=
In-Reply-To: <gerrit.1490402401000.Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d@gerrit.cloudera.org>
References: <gerrit.1490402401000.Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d@gerrit.cloudera.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Content-Disposition: inline
User-Agent: Gerrit/2.12.7
archived-at: Fri, 19 May 2017 21:30:10 -0000

Joe McDonnell has posted comments on this change.

Change subject: IMPALA-4623: Enable file handle cache
......................................................................


Patch Set 8:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.h
File be/src/runtime/disk-io-mgr-handle-cache.h:

Line 88:   /// with a call to ReleaseFileHandle to release exclusive control.
> this should also evict handles if a more recent mtime for a file with exist
Added a TODO for this. I plan on dealing with this when doing eviction by timeout.

Here is a scenario that I might be concerned about for immediate eviction when seeing a higher mtime:

1. ScanRange #1 starts up
2. ScanRange #1 does a read, putting a file handle with mtime=5 in the cache.
3. The file is modified and now has an mtime=6.
4. ScanRange #2 starts up
5. ScanRange #2 does a read, putting a file handle with mtime=6 in the cache and removing the mtime=5 file handle.
6. ScanRange #1 wants to do another read and goes to get a file handle from the cache, but mtime=5 file handles aren't available. What happens now? It could return an error, which makes some sense. It could open a new file handle, but it would be looking at the mtime=6 file. The mtime=6 file might have different offsets, so it might be reading garbage. 

A lot of this is hypothetical, but it would be nice for a ScanRange to deal with a single version of a file. That is a property that the old code had.


Line 98:   static const uint32_t HASH_SEED = 0x87654321;
> where from?
No need for a non-zero hash seed for our purpose, so I have removed this and substituted zero in the code.


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-handle-cache.inline.h
File be/src/runtime/disk-io-mgr-handle-cache.inline.h:

Line 65:   // Keep going until find unused entry with the same mtime
> fix grammar
Done


Line 113:   // to its place in the map
> add it?
I started working on this. The templating starts to infect everything. I will keep looking into it for a followup, but I think I will leave this as a TODO for the first version.


Line 117:     if (elem->fh.get() == fh) {
> single line
Done


Line 129:   // File can be unbuffered
> where does the corresponding buffering take place?
Added information in the comment. The buffering here is readahead buffering that Hdfs does for a read. That is what we are unbuffering.


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr-scan-range.cc
File be/src/runtime/disk-io-mgr-scan-range.cc:

Line 417:       int last_read = -1;
> what does this mean? does it count anything, and if so, what?
This is the number of bytes read by either hdfsPread or hdfsRead. In case of an error, it returns -1. I changed the name to "last_bytes_read"


Line 429:             return Status(GetHdfsErrorMsg("Error seeking HDFS file: ", file_));
> also print position
Done


Line 435:         return Status(GetHdfsErrorMsg("Error reading from HDFS file: ", file_));
> leave todo that you need to retry once to deal with stale cache entries
Done


Line 549:         remote_bytes_ += stats->totalBytesRead - stats->totalLocalBytesRead;
> why can't remote_bytes_ go into reader_?
The only real obstacle is this piece of logging:

https://github.com/apache/incubator-impala/blob/master/be/src/runtime/disk-io-mgr-scan-range.cc#L341
(It is in a different location after this code change.)

The logging is printing the number of unexpected remote bytes for this scan range. If the remote bytes go the reader_, I don't have a way to separate it out.

If we don't need the logging, then the state on the ScanRange can be a boolean, and we can send remote bytes directly to reader_.


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.cc
File be/src/runtime/disk-io-mgr.cc:

Line 1300
> this comment disappeared.
Added a comment in the new location. Hdfs can do some readahead buffering, so we want to reduce this buffering when putting a file handle in the cache.


Line 278:     cached_read_options_(nullptr),
> move into .h file and remove here
Done


http://gerrit.cloudera.org:8080/#/c/6478/8/be/src/runtime/disk-io-mgr.h
File be/src/runtime/disk-io-mgr.h:

Line 498:     int64_t remote_bytes_;
> num_...
Done


http://gerrit.cloudera.org:8080/#/c/6478/8/tests/query_test/test_hdfs_fd_caching.py
File tests/query_test/test_hdfs_fd_caching.py:

Line 74:     self.execute_query_expect_success(self.client, count_query)
> this should give some expected result
Done


Line 99:     self.execute_query_expect_success(self.client, count_query)
> you should check for the expected result, depending on the modification typ
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6478
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5ff60971dd653c3b6a0e13928cfa9fc59d078d
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonnell@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <marcel@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstrong@cloudera.com>
Gerrit-HasComments: Yes