drill-dev mailing list archives

From Jacques Nadeau <jacq...@dremio.com>
Subject Re: Directory and file based partition pruning
Date Thu, 10 Sep 2015 05:39:31 GMT
Makes sense.

Is there a way we can do this with lazy materializations rather than writing
complex expression tree logic? I hate having all this custom expression
tree manipulation logic.

Also, it seems like this should be N-phase rather than two-phase, where N
is the number of directory levels below the base path.
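
A rough sketch of the N-phase idea, in Python for brevity rather than
Drill's Java: list one directory level at a time, apply that level's
filter, and only descend into the survivors. The function name
prune_by_level, the depth argument, and the dir_filters mapping (level
index to predicate, mirroring dir0, dir1, ...) are hypothetical and not
Drill APIs. It also assumes data files live only at the deepest level.

```python
import os
from typing import Callable, Dict, List


def prune_by_level(
    base: str,
    depth: int,
    dir_filters: Dict[int, Callable[[str], bool]],
) -> List[str]:
    """List files under `base`, pruning directories one level at a time.

    `depth` is the number of directory levels below `base` (assumed uniform).
    `dir_filters` maps a level index (0 for dir0, 1 for dir1, ...) to a
    predicate on the directory name; levels without a filter keep everything.
    """
    survivors = [base]
    for level in range(depth):
        keep = dir_filters.get(level, lambda name: True)
        next_level = []
        for parent in survivors:
            # Only the children of surviving parents are ever listed.
            with os.scandir(parent) as entries:
                for entry in entries:
                    if entry.is_dir() and keep(entry.name):
                        next_level.append(entry.path)
        survivors = next_level

    # File names are fetched only for directories that passed every phase.
    files = []
    for leaf in survivors:
        with os.scandir(leaf) as entries:
            files.extend(e.path for e in entries if e.is_file())
    return sorted(files)
```

With filters on dir0 and dir1, a call such as
prune_by_level(base, 2, {0: lambda d: d == "2015", 1: lambda d: d == "02"})
never lists the contents of directories that fail the dir0 check.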

Thoughts?
On Sep 9, 2015 10:54 AM, "Aman Sinha" <amansinha@apache.org> wrote:

> Currently, partition pruning gets all file names in the table and then
> applies the pruning. If the files are spread out over several directories
> and there is a filter on dirN, this is inefficient in terms of both
> elapsed time and memory usage. This has been seen in a few use cases
> recently.
>
> We should ideally perform the pruning in 2 steps:  first get the top-level
> directory names only and apply the directory filter, then get the filenames
> within that directory and apply remaining filters.
>
> I will create a JIRA for this enhancement but let me know your thoughts...
>
> Aman
>
