drill-dev mailing list archives

From Aman Sinha <amansi...@apache.org>
Subject Directory and file based partition pruning
Date Wed, 09 Sep 2015 17:54:21 GMT
Currently, partition pruning fetches every file name in the table and only
then applies the filters.  When the files are spread over several
directories and there is a filter on dirN, this is inefficient in both
elapsed time and memory usage.  This has been seen in a few use cases.

We should ideally perform the pruning in two steps: first get only the
top-level directory names and apply the directory filter, then get the file
names within the directories that survive and apply the remaining filters.
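
To make the two-step idea concrete, here is a rough sketch in Python rather
than Drill's actual Java code.  It assumes a table laid out on a local file
system with the partition directories directly under the table root; the
names prune_two_step, dir_filter and file_filter are hypothetical and do not
correspond to any existing Drill API.

```python
import os
from typing import Callable, List


def prune_two_step(table_root: str,
                   dir_filter: Callable[[str], bool],
                   file_filter: Callable[[str], bool]) -> List[str]:
    """Hypothetical sketch of two-step pruning (not Drill's implementation)."""
    # Step 1: list only the top-level directory names and apply the
    # directory filter. No file names are fetched at this point.
    with os.scandir(table_root) as entries:
        top_dirs = sorted(e.name for e in entries if e.is_dir())
    surviving = [name for name in top_dirs if dir_filter(name)]

    # Step 2: list files only inside the surviving directories and
    # apply the remaining filters to those files.
    selected: List[str] = []
    for name in surviving:
        for current, subdirs, files in os.walk(os.path.join(table_root, name)):
            subdirs.sort()  # make the traversal order deterministic
            for file_name in sorted(files):
                path = os.path.join(current, file_name)
                if file_filter(path):
                    selected.append(path)
    return selected
```

With a filter such as dir_filter=lambda d: d == "2015", directories that
fail the filter are never listed, so the time and memory spent on file names
scale with the surviving directories rather than with the whole table.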

I will create a JIRA for this enhancement but let me know your thoughts...

