hive-dev mailing list archives

From "Ning Zhang (JIRA)" <>
Subject [jira] Commented: (HIVE-924) Extract LogicalPlan and PhysicalPlan classes from SemanticAnalysis class
Date Wed, 02 Dec 2009 08:06:20 GMT


Ning Zhang commented on HIVE-924:

I like the idea of refactoring LogicalPlan and PhysicalPlan in general, but since the patch is
too large, can you split it into several smaller JIRAs? E.g., one JIRA for PhysicalPlan and the
PhysicalPlan generator, one JIRA for LogicalPlan and its generator, and one for the remaining
misc changes. That is more manageable to review and has less chance of conflicting with other

Some detailed comments:

1) The transform() function now returns void instead of ParseContext/LogicalPlan. This may
be OK in the current implementation, but in general, if we want to extend the rule-based rewrite
system to a cost-based one, we probably need transform() to return a different LogicalPlan (the
rewritten one) and keep the input plan unchanged. Then we can cost both plans. The optimize()
function is similar: in a cost-based rewrite framework it should return the best plan for a
given input plan.

2) I noticed that you removed some code and left a comment like "possible bug". Can you come
up with some unit tests for the cases where the old code causes bugs? I've seen some diffs in
the unit test results under this patch, so I'm wondering whether they are caused by these
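
To illustrate point 1) above, here is a minimal sketch in Java of a transform() that returns a
rewritten plan and leaves its input unchanged, with an optimize() that costs the candidates
and returns the cheapest. Everything in it is hypothetical and for illustration only: the
LogicalPlan stand-in (a plain operator list), the Transform interface, the one-unit-per-operator
cost model, and the MERGE_ADJACENT_DUPLICATES rule are assumptions, not the classes in the
HIVE-924 patch or in Hive itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch; none of these types are Hive's actual classes. */
public class CostBasedRewriteSketch {

    /** Immutable stand-in for a logical plan: just an ordered operator list. */
    static final class LogicalPlan {
        final List<String> operators;

        LogicalPlan(List<String> operators) {
            // Defensive copy so later changes to the caller's list cannot alter the plan.
            this.operators = Collections.unmodifiableList(new ArrayList<>(operators));
        }
    }

    /** A rewrite rule returns a new plan and leaves its input untouched. */
    interface Transform {
        LogicalPlan transform(LogicalPlan input);
    }

    /** Toy cost model (an assumption): one unit per operator. */
    static double cost(LogicalPlan plan) {
        return plan.operators.size();
    }

    /** Example rule (made up): collapse adjacent duplicate operators into one. */
    static final Transform MERGE_ADJACENT_DUPLICATES = input -> {
        List<String> out = new ArrayList<>();
        for (String op : input.operators) {
            if (out.isEmpty() || !out.get(out.size() - 1).equals(op)) {
                out.add(op);
            }
        }
        return new LogicalPlan(out);
    };

    /** Returns the cheapest plan among the input and each rule's rewrite of it. */
    static LogicalPlan optimize(LogicalPlan input, List<Transform> rules) {
        LogicalPlan best = input;
        for (Transform rule : rules) {
            LogicalPlan candidate = rule.transform(input);
            if (cost(candidate) < cost(best)) {
                best = candidate;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        LogicalPlan input =
                new LogicalPlan(Arrays.asList("TableScan", "Filter", "Filter", "Select"));
        LogicalPlan best = optimize(input, Arrays.asList(MERGE_ADJACENT_DUPLICATES));
        System.out.println("input: " + input.operators + " cost=" + cost(input));
        System.out.println("best:  " + best.operators + " cost=" + cost(best));
    }
}
```

Because transform() never mutates its argument, the original and rewritten plans can both be
costed and compared, which a void transform() that rewrites in place would not allow.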

> Extract LogicalPlan and PhysicalPlan classes from SemanticAnalysis class
> ------------------------------------------------------------------------
>                 Key: HIVE-924
>                 URL:
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Carl Steinbach
>         Attachments: HIVE-924.patch
> Currently the SemanticAnalyzer class handles semantic analysis, as well as logical plan
> generation and physical plan generation. I think it would be beneficial to extract distinct
> LogicalPlan and PhysicalPlan classes from the SemanticAnalyzer, and have the query processing
> phase be coordinated by a QueryCompiler class that would be responsible for triggering the
> parsing, semantic analysis, logical plan generation, optimization, and physical plan generation
> phases. This proposed reorganization of components would help to isolate the state of each
> phase, and would also bring the source into closer alignment with the description of the query
> compiler in the Hive design document on the wiki.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
