drill-dev mailing list archives

From Mehant Baid <baid.meh...@gmail.com>
Subject Re: Moving directory based pruning to fire earlier
Date Mon, 23 Nov 2015 21:57:33 GMT
Currently all rules based on Calcite logical rels and Drill logical rels 
are put together and are fired together. As part of DRILL-3996, Jinfeng 
will break it down into different phases. I should be able to take 
advantage of this and move the directory based partition pruning to fire 
based on Calcite rels.


On 11/23/15 10:58 AM, Hanifi GUNES wrote:
> The general idea of multi-phase pruning makes sense to me. I am wondering,
> though, are we referring to introducing a new planning phase before the
> logical or separating out the logic so as to make directory pruning kick
> off ahead of column partitioning?
>
> 2015-11-23 10:33 GMT-08:00 Mehant Baid <baid.mehant@gmail.com>:
>> As part of DRILL-3996 <https://issues.apache.org/jira/browse/DRILL-3996>
>> Jinfeng mentioned that he plans to move the directory based pruning rule
>> earlier than column based pruning. I want to expand on that a little,
>> provide the motivation and gather thoughts/feedback.
>>
>> Currently both the directory based pruning and the column based pruning are
>> fired in the same planning phase and are based on Drill logical rels. This
>> is not optimal in the case where data is organized in such a way that both
>> directory based pruning and column based pruning can be applied (when the
>> data is organized with a nested directory structure plus the individual
>> files contain partition columns). As part of creating the Drill logical
>> scan we read the footers of all the files involved. If the directory based
>> pruning rule is fired earlier (that is, fired based on Calcite logical rels)
>> then we will be able to prune out unnecessary directories and save the work
>> of reading the footers of these files.
>> Thanks
>> Mehant
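
The saving described in the quoted message can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Drill's planner code: `DataFile`, `FooterReader`, `plan_single_phase`, and `plan_two_phase` are hypothetical names made up for the illustration, and the "footer read" is simulated by a counter. It compares reading every footer before any pruning happens with pruning by directory first and reading footers only for the files that survive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for one data file under a partition directory."""
    directory: str        # directory name, e.g. "2015"
    name: str
    partition_value: str  # value that is only known after reading the footer


class FooterReader:
    """Counts simulated footer reads; stands in for the expensive metadata read."""

    def __init__(self):
        self.reads = 0

    def read(self, data_file):
        self.reads += 1
        return data_file.partition_value


def plan_single_phase(files, wanted_dir, wanted_value):
    # Assumption: footers for all files are read when the scan is created,
    # and both kinds of pruning run only afterwards.
    reader = FooterReader()
    footers = {f: reader.read(f) for f in files}
    kept = [f for f in files
            if f.directory == wanted_dir and footers[f] == wanted_value]
    return kept, reader.reads


def plan_two_phase(files, wanted_dir, wanted_value):
    reader = FooterReader()
    # Phase 1: directory pruning needs only the path, so no footer is read.
    survivors = [f for f in files if f.directory == wanted_dir]
    # Phase 2: footers are read only for files in the surviving directories.
    footers = {f: reader.read(f) for f in survivors}
    kept = [f for f in survivors if footers[f] == wanted_value]
    return kept, reader.reads


# Three directories with two files each (made-up layout).
files = [DataFile(str(year), f"part-{i}.parquet", region)
         for year in (2013, 2014, 2015)
         for i, region in enumerate(("east", "west"))]

single_kept, single_reads = plan_single_phase(files, "2015", "west")
two_kept, two_reads = plan_two_phase(files, "2015", "west")

print(f"single phase: {single_reads} footer reads, kept {len(single_kept)} file(s)")
print(f"two phase:    {two_reads} footer reads, kept {len(two_kept)} file(s)")
```

Both orderings keep the same file; only the number of simulated footer reads differs, which is the work the earlier directory pruning is meant to avoid.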
