hadoop-hdfs-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-13616) Batch listing of multiple directories
Date Fri, 25 May 2018 16:55:00 GMT

    [ https://issues.apache.org/jira/browse/HDFS-13616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16491002#comment-16491002
] 

Andrew Wang commented on HDFS-13616:
------------------------------------

Thanks for taking a look, Xiao and Aaron!

bq. We currently FNFE on the first error. Is it possible a partition is deleted while another
thread is listing halfway for Hive/Impala? What's the expected behavior from them if so? (I'm
lacking the knowledge of this so no strong preference either way, but curious...)

This case is somewhat addressed by the unit test listSomeDoNotExist: you'll see that the
get() method throws if there was an exception, but you can still get results from the other
listing batches returned by the iterator.

If you're talking about listing a single large directory and the directory gets deleted during
the listing, then yes, this API will throw an FNFE just like the existing RemoteIterator<FileStatus>
API. Paged listings aren't atomic.
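To make those semantics concrete, here's a minimal, self-contained sketch in plain Java (the class and method names are illustrative stand-ins, not the patch's actual API): each batch carries either results or that path's error, get() rethrows the error, and the caller keeps consuming the remaining batches.

```java
import java.io.FileNotFoundException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PartialListingSketch {
    // Illustrative stand-in for a per-batch result wrapper: get() either
    // returns the batch's entries or rethrows the error for that path.
    static class PartialListing {
        final String path;
        final List<String> entries;      // listing results, if successful
        final FileNotFoundException err; // error, if this path failed

        PartialListing(String path, List<String> entries, FileNotFoundException err) {
            this.path = path;
            this.entries = entries;
            this.err = err;
        }

        List<String> get() throws FileNotFoundException {
            if (err != null) throw err;
            return entries;
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated batched listing: /a and /b exist, /missing does not.
        Iterator<PartialListing> it = Arrays.asList(
            new PartialListing("/a", Arrays.asList("a1", "a2"), null),
            new PartialListing("/missing", null, new FileNotFoundException("/missing")),
            new PartialListing("/b", Arrays.asList("b1"), null)
        ).iterator();

        int ok = 0, failed = 0;
        while (it.hasNext()) {
            PartialListing pl = it.next();
            try {
                ok += pl.get().size(); // throws only for the failed path
            } catch (FileNotFoundException e) {
                failed++;              // other batches remain consumable
            }
        }
        System.out.println("entries=" + ok + " failures=" + failed);
    }
}
```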

bq. If caller added some subdirs to srcs, should we list the subdir twice, or throw, or 'smartly'
list everything at most once?

This is addressed by the unit test listSamePaths: the path is simply listed multiple times.
I didn't see it as the role of the filesystem to coalesce these paths; semantically, I wanted
it to behave like the existing RemoteIterator<FileStatus> API called in a for loop.

Aaron, I'll hit your review comments in a new patch rev. Precommit is getting pretty close,
so I'm hoping to coalesce review comments from others before posting the next one.

bq. Why not just RemoteIterator<FileStatus>?

We need an entry point that can throw an exception for a single path without killing the entire
listing. From a client POV, it's also nice to have the same path that was passed in handed back,
since HDFS returns absolute, qualified paths. It also makes the empty directory case easier
to understand.
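A small sketch of the empty-directory point (plain Java, illustrative data only, not the patch's API): in a single flat stream of statuses, an empty directory leaves no trace, whereas batches keyed by the caller-supplied path make it an explicit empty result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EmptyDirSketch {
    public static void main(String[] args) {
        // Simulated listings of two directories, one of which is empty.
        Map<String, List<String>> perPath = new LinkedHashMap<>();
        perPath.put("/data", Arrays.asList("part-0", "part-1"));
        perPath.put("/empty", Collections.emptyList());

        // Flattened into one stream, /empty is indistinguishable from
        // never having been listed at all.
        List<String> flat = new ArrayList<>();
        perPath.values().forEach(flat::addAll);
        System.out.println("flat=" + flat);

        // Keyed per-path batches keep the caller-supplied path with each
        // result, so an empty directory shows up as an empty batch.
        perPath.forEach((p, entries) ->
            System.out.println(p + " -> " + entries.size() + " entries"));
    }
}
```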

I attached the benchmark I ran for further examination. I think you correctly answered the
use case question yourself, but to confirm: the Hive/Impala client already has a list of leaf
directories to list, so it'd require some contortions to use a recursive API like listFiles
instead. I imagine a server-side listFiles (like what S3 has) would be a nice speedup though.

> Batch listing of multiple directories
> -------------------------------------
>
>                 Key: HDFS-13616
>                 URL: https://issues.apache.org/jira/browse/HDFS-13616
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>    Affects Versions: 3.2.0
>            Reporter: Andrew Wang
>            Assignee: Andrew Wang
>            Priority: Major
>         Attachments: BenchmarkListFiles.java, HDFS-13616.001.patch, HDFS-13616.002.patch
>
>
> One of the dominant workloads for external metadata services is listing of partition
directories. This can end up being bottlenecked on RTT when partition directories contain
a small number of files. This is fairly common, since fine-grained partitioning is used for
partition pruning by the query engines.
> A batched listing API that takes multiple paths amortizes the RTT cost. Initial benchmarks
show a 10-20x improvement in metadata loading performance.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

