drill-issues mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-4857) When no partition pruning occurs with metadata caching there's a performance regression
Date Tue, 23 Aug 2016 16:10:21 GMT

    [ https://issues.apache.org/jira/browse/DRILL-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15433110#comment-15433110 ]

ASF GitHub Bot commented on DRILL-4857:
---------------------------------------

GitHub user amansinha100 opened a pull request:

    https://github.com/apache/drill/pull/575

    DRILL-4857: Maintain pruning status and populate ParquetGroupScan's e…

    …ntries field with only the selection root if no partition pruning was done.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/amansinha100/incubator-drill DRILL-4857-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/575.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #575
    
----
commit 67c8f48a49a880740bc8d147e90818bb20da6455
Author: Aman Sinha <asinha@maprtech.com>
Date:   2016-08-23T02:03:13Z

    DRILL-4857: Maintain pruning status and populate ParquetGroupScan's entries field with
    only the selection root if no partition pruning was done.

----


> When no partition pruning occurs with metadata caching there's a performance regression
> ---------------------------------------------------------------------------------------
>
>                 Key: DRILL-4857
>                 URL: https://issues.apache.org/jira/browse/DRILL-4857
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata, Query Planning & Optimization
>    Affects Versions: 1.7.0
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>             Fix For: 1.8.0
>
>
> After DRILL-4530, we see the (expected) performance improvements in planning time with the
> metadata cache in cases where partition pruning was applied. However, in cases where it
> was not applied and for a sufficiently large number of files (tested with up to 400K files),
> there is a performance regression. Part of this was addressed by DRILL-4846. This JIRA is
> to track some remaining fixes to address the regression.
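
The idea in the commit message (keep a pruning status, and put only the selection root into the scan's entries when nothing was pruned) can be illustrated with a small sketch. This is not Drill's Java code: the names `Selection`, `prune`, and `scan_entries` are hypothetical, and treating "pruned" as "the filter removed at least one file" is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Selection:
    """Hypothetical stand-in for a file selection; not a Drill class."""
    root: str            # directory the query was issued against
    files: List[str]     # expanded file list (can be very large)
    pruned: bool = False  # pruning status carried along with the selection


def prune(selection: Selection, keep: Callable[[str], bool]) -> Selection:
    """Apply a partition filter and record whether it removed anything."""
    kept = [f for f in selection.files if keep(f)]
    # Assumption: pruning "occurred" only if the file list actually shrank.
    return Selection(selection.root, kept, pruned=len(kept) < len(selection.files))


def scan_entries(selection: Selection) -> List[str]:
    """Entries handed to the scan: the surviving files, or just the root."""
    if selection.pruned:
        return list(selection.files)
    # No pruning: one entry (the root) instead of one entry per file.
    return [selection.root]
```

With an unpruned selection over hundreds of thousands of files, `scan_entries` returns a single-element list, so later planning steps do not have to carry or copy the full file list.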



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
