drill-dev mailing list archives

From Aman Sinha <asi...@maprtech.com>
Subject Re: Directory and file based partition pruning
Date Thu, 10 Sep 2015 23:42:39 GMT
Agreed on the N-phased approach. I have filed a JIRA for the enhancement:
DRILL-3759.
Regarding the simplification of the expression tree logic: did you mean the
logic in FindPartitionConditions or the Interpreter?
Perhaps you can add comments in the JIRA with some explanation. I am in
favor of simplification where possible.

On Wed, Sep 9, 2015 at 10:39 PM, Jacques Nadeau <jacques@dremio.com> wrote:

> Makes sense.
>
> Is there a way we can do this with lazy materializations rather than writing
> complex expression tree logic? I hate having all this custom expression
> tree manipulation logic.
>
> Also, it seems like this should be N-phased rather than two-phase, where N
> is the number of directories below the base path.
>
> Thoughts?
> On Sep 9, 2015 10:54 AM, "Aman Sinha" <amansinha@apache.org> wrote:
>
> > Currently, partition pruning gets all file names in the table and applies
> > the pruning. Suppose the files are spread out over several directories and
> > there is a filter on dirN, this is not efficient - both in terms of
> > elapsed time and memory usage. This has been seen in a few use cases
> > recently.
> >
> > We should ideally perform the pruning in 2 steps: first get the top-level
> > directory names only and apply the directory filter, then get the filenames
> > within that directory and apply remaining filters.
> >
> > I will create a JIRA for this enhancement but let me know your thoughts...
> >
> > Aman
> >
>
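
For readers of the archive, the phased pruning idea discussed in this thread can be sketched roughly as below. This is only an illustration under stated assumptions, not Drill's implementation: the file system is a toy in-memory dict, and the names phased_prune, dir_filters and the sample tree are hypothetical. The point it shows is that a directory rejected at level k is never listed, so the cost depends on the surviving directories rather than on every file in the table.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Toy stand-in for a file system (an assumption for illustration):
# a name maps to a dict for a sub-directory, or to None for a plain file.
Tree = Dict[str, Optional[dict]]
DirFilter = Optional[Callable[[str], bool]]


def phased_prune(tree: Tree, dir_filters: List[DirFilter]) -> Tuple[List[str], int]:
    """Walk the tree one directory level at a time.

    dir_filters[k] is applied to the directory names found at depth k
    (dir0, dir1, ...); None means "no filter at this level".
    Returns the surviving file paths and the number of directories listed.
    """
    files: List[str] = []
    listings = 0
    frontier = [("", tree, 0)]  # (path, directory node, depth)
    while frontier:
        path, node, depth = frontier.pop()
        listings += 1  # one listing per directory actually visited
        keep = dir_filters[depth] if depth < len(dir_filters) else None
        for name, child in node.items():
            full = f"{path}/{name}" if path else name
            if child is None:
                files.append(full)  # plain file: candidate for remaining filters
            elif keep is None or keep(name):
                frontier.append((full, child, depth + 1))
            # otherwise the sub-directory is pruned and never listed
    return sorted(files), listings


if __name__ == "__main__":
    sample = {
        "2014": {"Q1": {"a.parquet": None}, "Q2": {"b.parquet": None}},
        "2015": {"Q1": {"c.parquet": None}, "Q2": {"d.parquet": None}},
    }
    # Filter on dir0 = '2015' and dir1 = 'Q1'.
    print(phased_prune(sample, [lambda d: d == "2015", lambda d: d == "Q1"]))
    # No directory filters: every directory is listed.
    print(phased_prune(sample, []))
```

With both filters in place the sketch lists three directories (the base path, 2015, and 2015/Q1) instead of all seven, which is the saving the two-step and N-phased proposals are after.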
