impala-dev mailing list archives

From "Thomas Tauber-Marshall (Code Review)" <>
Subject [Impala-CR](cdh5-trunk) IMPALA-2805: Order filters based on selectivity and cost
Date Tue, 22 Mar 2016 00:18:41 GMT
Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-2805: Order filters based on selectivity and cost

Patch Set 1:

For now, we're looking at the performance testing from two angles:

- Updates to the ordering of filters in the existing PlannerTests.

For many of these test file updates, you can inspect them manually and see that the ordering
change makes sense (e.g., putting integer comparisons before string comparisons).

For others, the reason for the reordering is less obvious, usually because predicates that
look like they should come first are missing selectivity estimates and so end up at the
end. I generated a diff of all of the changes to the .test files, annotated with cost
and selectivity values, to make this easier to see:

- Manually run tests.

A realistic query:
         l_comment like '%a%b%s%' and l_orderkey = 19;
which took ~2.06s on cdh5-trunk and ~1.65s with this change.

And a contrived query that shows the effect well:
    repeat(string_col, 1000) like repeat(string_col, 1000) and id = -1;
which took ~3.14s on cdh5-trunk and ~0.22s with these changes.
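
To illustrate why evaluating a cheap, selective predicate before an expensive one helps, here
is a minimal Python sketch. It is not the code from this change: it uses the textbook rank
heuristic (cost divided by the fraction of rows rejected) as a stand-in, and the names
Predicate, order_predicates and expected_cost, along with all cost and selectivity numbers,
are hypothetical. The only behavior taken from the description above is that predicates
without a selectivity estimate sort to the end.

```python
from typing import List, NamedTuple, Optional, Tuple


class Predicate(NamedTuple):
    sql: str
    cost: float                    # assumed per-row evaluation cost (made-up units)
    selectivity: Optional[float]   # fraction of rows that pass; None if no estimate


def order_predicates(preds: List[Predicate]) -> List[Predicate]:
    """Sort by rank = cost / (1 - selectivity); unknown selectivity goes last."""
    def rank(p: Predicate) -> Tuple[int, float]:
        if p.selectivity is None:
            return (1, 0.0)                # no estimate: after all ranked predicates
        rejected = 1.0 - p.selectivity
        if rejected <= 0.0:
            return (0, float("inf"))       # filters nothing: last among ranked ones
        return (0, p.cost / rejected)
    return sorted(preds, key=rank)         # sorted() is stable, so ties keep input order


def expected_cost(preds: List[Predicate]) -> float:
    """Expected per-row cost when predicates are evaluated in order with short-circuiting."""
    total, surviving = 0.0, 1.0
    for p in preds:
        total += surviving * p.cost
        # Treat an unknown selectivity as "passes everything" for this estimate.
        surviving *= p.selectivity if p.selectivity is not None else 1.0
    return total


# Made-up numbers loosely modeled on the realistic query above.
like_pred = Predicate("l_comment like '%a%b%s%'", cost=10.0, selectivity=0.1)
eq_pred = Predicate("l_orderkey = 19", cost=1.0, selectivity=0.001)
unknown_pred = Predicate("some_udf(l_comment)", cost=0.5, selectivity=None)

original = [unknown_pred, like_pred, eq_pred]
reordered = order_predicates(original)
for p in reordered:
    print(p.sql)
print(expected_cost(original), expected_cost(reordered))
```

Under these assumed numbers the integer equality is evaluated first, so the expensive LIKE
only runs on the few rows that survive it.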


Gerrit-MessageType: comment
Gerrit-Change-Id: I02279a26fbc6308ac5eb819d78345fc010469034
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Thomas Tauber-Marshall <>
Gerrit-Reviewer: Thomas Tauber-Marshall <>
Gerrit-HasComments: No
