impala-reviews mailing list archives

From "Sailesh Mukil (Code Review)" <ger...@cloudera.org>
Subject [Impala-ASF-CR] IMPALA-5333: Add support for Impala to work with ADLS
Date Sat, 20 May 2017 01:01:43 GMT
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-5333: Add support for Impala to work with ADLS
......................................................................


Patch Set 3:

> I have a few high level questions about this patch. This patch
 > treats S3 and ADL the same way but after looking at the HDFS
 > classes and the ADL API I see some differences between ADL and S3
 > API. For instance, I don't see a way in ADL to recursively
 > enumerate all entries under a particular directory (something that
 > exists in S3). Our code today for S3 sort of relies on that
 > property so I am wondering if our code works as expected or if I am
 > missing something (quite possible). Also, the AdlFileSystem class
 > has functions for returning the block locations, which means that
 > we don't have to call the synthesize metadata calls as we do for
 > S3. In the long run, HDFS may expose replica locations from the ADL
 > local tiers (e.g. Cosmos), but I guess this is not relevant today.
 > Thoughts?

Very good questions, Dimitris.

- Could you point me to the exact API that does the recursive enumeration?

- It's true that we don't need to synthesize the block metadata ourselves. However, I spoke
with the ADLS connector folks, and ADLS doesn't actually expose block locations internally.
If you look at the getFileBlockLocations() code in AdlFileSystem, it synthesizes the block
metadata itself:
http://github.mtv.cloudera.com/CDH/hadoop/blob/4da5de1a7c965f63f0b798e9ec4ec72ed7249f18/hadoop-tools/hadoop-azure-datalake/src/main/java/org/apache/hadoop/fs/adl/AdlFileSystem.java#L893

So functionally it's the same. But I agree, we should call the relevant API if it is
"supported" in some form or other. I will update the patch.

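For readers following the thread, here is a minimal sketch of what "synthesizing" block
metadata can look like when a store exposes no real block locations: the file is split
into fixed-size ranges and every range is attributed to a placeholder host. This is not
the AdlFileSystem code. The class names SyntheticBlock and SyntheticBlockLocations, the
synthesize() method, and the "localhost" placeholder are all assumptions made for
illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a block location record; not Hadoop's BlockLocation.
final class SyntheticBlock {
    final String host;
    final long offset;
    final long length;

    SyntheticBlock(String host, long offset, long length) {
        this.host = host;
        this.offset = offset;
        this.length = length;
    }
}

final class SyntheticBlockLocations {
    // Splits [0, fileLength) into blockSize-sized ranges. Every range gets the
    // same placeholder host because no real replica locations are known.
    static List<SyntheticBlock> synthesize(long fileLength, long blockSize) {
        if (fileLength < 0) {
            throw new IllegalArgumentException("fileLength must be non-negative");
        }
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        List<SyntheticBlock> blocks = new ArrayList<>();
        for (long offset = 0; offset < fileLength; offset += blockSize) {
            // The last block may be shorter than blockSize.
            long length = Math.min(blockSize, fileLength - offset);
            blocks.add(new SyntheticBlock("localhost", offset, length));
        }
        return blocks;
    }

    public static void main(String[] args) {
        // A 300-byte file with 128-byte blocks yields three ranges.
        for (SyntheticBlock b : synthesize(300L, 128L)) {
            System.out.println(b.host + " offset=" + b.offset + " length=" + b.length);
        }
    }
}
```

Because every range carries the same placeholder host, a scheduler gains no locality
information from metadata like this, whichever layer produces it.
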
-- 
To view, visit http://gerrit.cloudera.org:8080/6910
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic56b9988b32a330443f24c44f9cb2c80842f7542
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-Reviewer: Attila Jeges <attilaj@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhecht@cloudera.com>
Gerrit-Reviewer: David Knupp <dknupp@cloudera.com>
Gerrit-Reviewer: Dimitris Tsirogiannis <dtsirogiannis@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sailesh@cloudera.com>
Gerrit-HasComments: No
