hive-issues mailing list archives

From "Hive QA (JIRA)" <>
Subject [jira] [Commented] (HIVE-16757) Use memoization in HiveRelMdRowCount.getRowCount
Date Thu, 25 May 2017 14:30:04 GMT


Hive QA commented on HIVE-16757:

Here are the results of testing the latest attachment:

{color:red}ERROR:{color} -1 due to no test(s) being added or modified.

{color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10763 tests executed
*Failed tests:*
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_limit] (batchId=161)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[cbo_rp_limit] (batchId=154)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[explainuser_1] (batchId=151)
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[subquery_scalar] (batchId=152)
org.apache.hadoop.hive.cli.TestSparkCliDriver.testCliDriver[cbo_limit] (batchId=138)

Test results:
Console output:
Test logs:

Executing org.apache.hive.ptest.execution.TestCheckPhase
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 5 tests failed

This message is automatically generated.

ATTACHMENT ID: 12869865 - PreCommit-HIVE-Build

> Use memoization in HiveRelMdRowCount.getRowCount
> ------------------------------------------------
>                 Key: HIVE-16757
>                 URL:
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Planning
>            Reporter: Remus Rusanu
>            Assignee: Remus Rusanu
>         Attachments: HIVE-16757.01.patch
> On complex queries, HiveRelMdRowCount.getRowCount can be called many times. Since it
does not memoize its result and the call is recursive, this results in an explosion of calls.
For example, for a query with 49 joins, during join ordering (LoptOptimizeJoinRule) HiveRelMdRowCount.getRowCount
is called 6442 times as a top-level call, but the recursion expands this to 501729 calls. Memoizing
the result would stop the recursion early. In my testing this reduced the join reordering
time for said query from 11s to <1s.
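
To illustrate the effect described above, here is a minimal, self-contained sketch. It is not the code from HIVE-16757.01.patch and does not use Hive or Calcite classes: `RowCountMemoDemo`, `Node`, `naiveRowCount`, `memoRowCount` and `leftDeepJoin` are hypothetical names, and the cost formula is made up. It assumes a row-count estimator that asks for each join input's row count more than once, which is enough to make an uncached recursion grow exponentially on a left-deep join chain.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demo class; none of these names exist in Hive or Calcite.
final class RowCountMemoDemo {

    /** Hypothetical plan node: a leaf scan (no inputs) or a binary join. */
    static final class Node {
        final Node left;
        final Node right;
        final double baseRows;

        Node(double baseRows) { this.left = null; this.right = null; this.baseRows = baseRows; }
        Node(Node left, Node right) { this.left = left; this.right = right; this.baseRows = 0.0; }

        boolean isLeaf() { return left == null; }
    }

    /** Number of times either estimator was entered on this instance. */
    long calls = 0;

    /** Cache keyed by node identity, so each node is computed at most once. */
    private final Map<Node, Double> cache = new IdentityHashMap<>();

    /** Uncached recursion: every join asks for each input's row count twice. */
    double naiveRowCount(Node n) {
        calls++;
        if (n.isLeaf()) {
            return n.baseRows;
        }
        double product = naiveRowCount(n.left) * naiveRowCount(n.right);
        // Made-up formula (assumption): divide by the larger input.
        return product / Math.max(naiveRowCount(n.left), naiveRowCount(n.right));
    }

    /** Same formula, but a cache hit returns before any recursion happens. */
    double memoRowCount(Node n) {
        calls++;
        Double cached = cache.get(n);
        if (cached != null) {
            return cached;
        }
        double result;
        if (n.isLeaf()) {
            result = n.baseRows;
        } else {
            double product = memoRowCount(n.left) * memoRowCount(n.right);
            result = product / Math.max(memoRowCount(n.left), memoRowCount(n.right));
        }
        cache.put(n, result);
        return result;
    }

    /** Builds a left-deep chain with the given number of joins over 100-row scans. */
    static Node leftDeepJoin(int joins) {
        Node plan = new Node(100.0);
        for (int i = 0; i < joins; i++) {
            plan = new Node(plan, new Node(100.0));
        }
        return plan;
    }

    public static void main(String[] args) {
        Node plan = leftDeepJoin(10);
        RowCountMemoDemo naive = new RowCountMemoDemo();
        RowCountMemoDemo memo = new RowCountMemoDemo();
        naive.naiveRowCount(plan);
        memo.memoRowCount(plan);
        System.out.println("naive calls:    " + naive.calls);
        System.out.println("memoized calls: " + memo.calls);
    }
}
```

In this toy model the uncached call count roughly doubles with every additional join, while the memoized count grows linearly with the number of plan nodes. A real implementation would also need to decide when cached values become stale, for example when the plan is rewritten; the sketch ignores that.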

This message was sent by Atlassian JIRA
