drill-dev mailing list archives

From Jacques Nadeau <jacq...@dremio.com>
Subject Re: Directory and file based partition pruning
Date Fri, 11 Sep 2015 13:44:37 GMT
I'm guessing that the issue is metadata reading in that case. We've seen
the problem before when reading hundreds of thousands of files, since Parquet
metadata is fairly large.

Given the multiple firings, do we know the time for a single completion? It
seems strange that the partition operation in interpreted mode would take
very long, even with 100,000 files. If it does, I'm wondering if anyone has
looked at a profiler to see where the time is spent.
On Sep 10, 2015 8:31 PM, "Jinfeng Ni" <jinfengni99@gmail.com> wrote:

> I got the impression of Java heap memory because one customer
> complained about running out of heap memory when
> pruning a large number of files. Is it possible that the rule
> puts the value vector in direct memory, but also uses object references
> whose count is proportional to the # of files? That might explain why they
> run out of heap memory.
>
>
>
> On Thu, Sep 10, 2015 at 6:25 PM, Aman Sinha <asinha@maprtech.com> wrote:
> > Yes, it is a good point about multiple invocations of the PruneScan rule.
> > The other point about using Java heap is not correct.  The rule does
> > off-heap allocation using memory buffer from QueryContext and in the
> > finally block releases the memory.
> >
> > Aman
> >
> > On Thu, Sep 10, 2015 at 6:18 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:
> >
> >> I opened DRILL-3765 for the multiple rule execution issue:
> >>
> >> https://issues.apache.org/jira/browse/DRILL-3765
> >>
> >>
> >> On Thu, Sep 10, 2015 at 5:34 PM, Jinfeng Ni <jinfengni99@gmail.com> wrote:
> >> > Seems to me one important reason we hit out of heap memory for the
> >> > partition prune rule is that the rule itself is invoked multiple times,
> >> > even when the filter has been pushed into the scan in the first call.
> >> >
> >> > I tried a simple unit test,
> >> > TestPartitionFilter:testPartitionFilter1_Parquet_from_CTAS(). Here is
> >> > the number of times each partition rule fired in the Calcite trace:
> >> >
> >> >  #_rule_fire,  rule name
> >> >
> >> >  4 [PruneScanRule:Filter_On_Project_Parquet]
> >> >  4 [PruneScanRule:Filter_On_Project]
> >> >
> >> >  2 [PruneScanRule:Filter_On_Scan_Parquet]
> >> >  2 [PruneScanRule:Filter_On_Scan]
> >> >
> >> > Setting a breakpoint in PruneScanRule where it calls the interpreter to
> >> > evaluate the expression, I could see that the code stops 6 times at that
> >> > point, meaning that Drill has to build the vector containing the
> >> > filenames at least 6 times. That would cause a lot of heap memory
> >> > consumption if GC does not kick in to release the memory used in the
> >> > prior rule's execution.
> >> >
> >> > I think making partition pruning multi-phase will help reduce
> >> > memory consumption. But for now, it seems important to avoid the
> >> > repeated and unnecessary rule execution.
> >> >
> >> >
> >> >
> >> >
> >> >
> >> > On Thu, Sep 10, 2015 at 4:42 PM, Aman Sinha <asinha@maprtech.com> wrote:
> >> >>
> >> >> Agree on the N-phased approach. I have filed a JIRA for the
> >> >> enhancement: DRILL-3759.
> >> >> Regarding the simplification of the expression tree logic: did you mean
> >> >> the logic in FindPartitionConditions or the Interpreter?
> >> >> Perhaps you can add comments in the JIRA with some explanation. I am in
> >> >> favor of simplification where possible.
> >> >>
> >> >> On Wed, Sep 9, 2015 at 10:39 PM, Jacques Nadeau <jacques@dremio.com>
> >> >> wrote:
> >> >>
> >> >> > Makes sense.
> >> >> >
> >> >> > Is there a way we can do this with lazy materialization rather than
> >> >> > writing complex expression tree logic? I hate having all this custom
> >> >> > expression tree manipulation logic.
> >> >> >
> >> >> > Also, it seems like this should be N-phased rather than two-phased,
> >> >> > where N is the number of directories below the base path.
> >> >> >
> >> >> > Thoughts?
> >> >> > On Sep 9, 2015 10:54 AM, "Aman Sinha" <amansinha@apache.org> wrote:
> >> >> >
> >> >> > > Currently, partition pruning gets all file names in the table and
> >> >> > > applies the pruning.  Suppose the files are spread out over several
> >> >> > > directories and there is a filter on dirN, this is not efficient -
> >> >> > > both in terms of elapsed time and memory usage.  This has been seen
> >> >> > > in a few use cases recently.
> >> >> > >
> >> >> > > We should ideally perform the pruning in 2 steps:  first get the
> >> >> > > top-level directory names only and apply the directory filter, then
> >> >> > > get the filenames within that directory and apply remaining filters.
> >> >> > >
> >> >> > > I will create a JIRA for this enhancement but let me know your
> >> >> > > thoughts...
> >> >> > >
> >> >> > > Aman
> >> >> > >
> >> >> >
> >> >
> >> >
> >>
>
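
For readers following the thread, the phased pruning idea discussed above
(filter directory names level by level, and only list files under
directories that survive) might be sketched roughly as below. This is an
illustration only, not Drill's PruneScanRule: the class PhasedPruningSketch,
the in-memory Dir tree, and the per-level predicate list are all
hypothetical stand-ins, and a real implementation would list directories
from the file system and evaluate the filter through the interpreter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class PhasedPruningSketch {

    /** Hypothetical in-memory stand-in for a directory: child directories plus file names. */
    static final class Dir {
        final Map<String, Dir> subdirs = new LinkedHashMap<>();
        final List<String> files = new ArrayList<>();

        /** Returns the named child directory, creating it if needed. */
        Dir dir(String name) {
            return subdirs.computeIfAbsent(name, k -> new Dir());
        }
    }

    /**
     * Prunes one directory level at a time. levelFilters.get(i) is tested
     * against directory names at depth i (dir0, dir1, ...); levels beyond
     * the end of the list are left unfiltered.
     */
    static List<String> prune(Dir root, List<Predicate<String>> levelFilters) {
        List<String> result = new ArrayList<>();
        collect(root, "", 0, levelFilters, result);
        return result;
    }

    private static void collect(Dir node, String path, int depth,
                                List<Predicate<String>> levelFilters,
                                List<String> out) {
        // Files are only enumerated for directories that survived pruning.
        for (String f : node.files) {
            out.add(path + f);
        }
        Predicate<String> filter =
            depth < levelFilters.size() ? levelFilters.get(depth) : null;
        for (Map.Entry<String, Dir> e : node.subdirs.entrySet()) {
            // A rejected directory is never descended into, so its files
            // are never materialized.
            if (filter == null || filter.test(e.getKey())) {
                collect(e.getValue(), path + e.getKey() + "/", depth + 1,
                        levelFilters, out);
            }
        }
    }

    public static void main(String[] args) {
        Dir root = new Dir();
        root.dir("2014").dir("Q1").files.add("a.parquet");
        root.dir("2015").dir("Q1").files.add("b.parquet");
        root.dir("2015").dir("Q2").files.add("c.parquet");

        List<Predicate<String>> filters = new ArrayList<>();
        filters.add("2015"::equals); // filter on dir0
        filters.add("Q2"::equals);   // filter on dir1
        System.out.println(prune(root, filters)); // prints [2015/Q2/c.parquet]
    }
}
```

The point of the sketch is that the work is proportional to the directories
and files that survive each level, rather than to every file in the table.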
