hadoop-hive-dev mailing list archives

From "Ashish Thusoo (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HIVE-186) Refactor code to use a single graph, nodeprocessor, dispatcher and rule abstraction
Date Thu, 18 Dec 2008 02:36:46 GMT

     [ https://issues.apache.org/jira/browse/HIVE-186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashish Thusoo updated HIVE-186:
-------------------------------

    Attachment: patch-186.txt

This patch cleans up and refactors all of the graph-walking and rules framework.
The unified framework is in the package

org.apache.hadoop.hive.ql.lib

Node is the interface that a graph's nodes must implement in order to use the graph walkers
and rule dispatchers available within this framework. There are currently two implementations
of this interface:

1. ASTNode - in ql.parse, a wrapper around the CommonTree classes of the ANTLR runtime.
2. Operator - in ql.exec, which implements the operator tree nodes.
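
To illustrate why a shared Node interface is useful, here is a minimal sketch. The
method names (getChildren, getName) and the SimpleNode and NodeSketch classes are
assumptions made for illustration; they are not copied from patch-186.txt. The point
is only that a traversal written against Node works for any tree that implements it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the Node interface; method names are assumptions.
interface Node {
    List<Node> getChildren();
    String getName();
}

// Toy node that plays the role an ASTNode or Operator would play.
class SimpleNode implements Node {
    private final String name;
    private final List<Node> children = new ArrayList<>();

    SimpleNode(String name, Node... kids) {
        this.name = name;
        for (Node k : kids) {
            children.add(k);
        }
    }

    @Override
    public List<Node> getChildren() { return children; }

    @Override
    public String getName() { return name; }
}

class NodeSketch {
    // Counts the nodes of a tree using nothing but the Node interface.
    static int countNodes(Node root) {
        int n = 1;
        for (Node child : root.getChildren()) {
            n += countNodes(child);
        }
        return n;
    }

    public static void main(String[] args) {
        Node tree = new SimpleNode("SEL", new SimpleNode("FIL", new SimpleNode("TS")));
        System.out.println(countNodes(tree)); // prints 3
    }
}
```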

I have also removed the DefaultDispatcher implementation of the Dispatcher. This functionality
can be equivalently expressed using DefaultRuleDispatcher. Accordingly, I have cleaned up
the GenMR* processors and the ColumnPruner to reflect these changes. ColumnPruner is also
split into ColumnPrunerProcFactory, which creates the processors for the various rules needed
therein, and ColumnPrunerProcCtx, an implementation of NodeProcessorCtx that carries the
context information between rules.

I have gotten rid of all the classes related to the ASTs (ASTEvent, ASTDispatcher, ASTProcessor,
ASTEventProcessor, etc.).

Nodes are processed by implementations of NodeProcessor. I have removed the
reflection-based invocation that we were doing in the earlier DefaultDispatcher and DefaultRuleDispatcher.
Now only a single process function is called, and the user has to implement a different processor
for each rule (see ColumnPrunerProcFactory).
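
The single-process-method design can be sketched as follows. This is a simplified
illustration, not the code in the patch: the process signature, the VisitLog context,
and SimpleRuleDispatcher are hypothetical, and a "rule" here is just an exact node
name, which is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

interface Node { String getName(); }

// Marker for context shared between processors (assumed shape).
interface NodeProcessorCtx {}

// One process method per processor; no reflection is needed to pick an overload.
interface NodeProcessor {
    void process(Node nd, NodeProcessorCtx ctx);
}

// Hypothetical context that records which processor handled which node.
class VisitLog implements NodeProcessorCtx {
    final List<String> entries = new ArrayList<>();
}

// Hypothetical dispatcher: picks the processor registered for the node's
// name, falling back to a default processor when no rule matches.
class SimpleRuleDispatcher {
    private final Map<String, NodeProcessor> rules = new LinkedHashMap<>();
    private final NodeProcessor defaultProc;
    private final NodeProcessorCtx ctx;

    SimpleRuleDispatcher(NodeProcessor defaultProc, NodeProcessorCtx ctx) {
        this.defaultProc = defaultProc;
        this.ctx = ctx;
    }

    void addRule(String nodeName, NodeProcessor proc) {
        rules.put(nodeName, proc);
    }

    void dispatch(Node nd) {
        NodeProcessor proc = rules.getOrDefault(nd.getName(), defaultProc);
        proc.process(nd, ctx);
    }
}

class DispatchSketch {
    // Dispatches each named node and returns the log of processor calls.
    static List<String> run(String... nodeNames) {
        VisitLog log = new VisitLog();
        SimpleRuleDispatcher d = new SimpleRuleDispatcher(
            (nd, ctx) -> ((VisitLog) ctx).entries.add("default:" + nd.getName()), log);
        // One processor per rule, in the spirit of a processor factory.
        d.addRule("SEL", (nd, ctx) -> ((VisitLog) ctx).entries.add("select:" + nd.getName()));
        d.addRule("FIL", (nd, ctx) -> ((VisitLog) ctx).entries.add("filter:" + nd.getName()));
        for (String name : nodeNames) {
            d.dispatch(() -> name);
        }
        return log.entries;
    }

    public static void main(String[] args) {
        System.out.println(run("SEL", "FIL", "TS"));
    }
}
```

Each rule maps to its own small processor object, so adding a rule means adding a
processor rather than adding another overloaded method to be found by reflection.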

The walker interface has been renamed to GraphWalker and the default implementation is now
called DefaultGraphWalker. I have also eliminated the TopoWalker. DefaultGraphWalker is
no longer an abstract class, so clients can use it right out of the box. The ColumnPrunerWalker
and the GenMapRedWalker are still subclasses of the DefaultGraphWalker.
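
A concrete default walker that clients can instantiate directly might look roughly
like the sketch below. The startWalking and walk method names, the Dispatcher
signature, and the children-before-parent visiting order are all assumptions for
illustration and may differ from the patch. A specialized walker would subclass this
class and override walk.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

interface Node {
    List<Node> getChildren();
    String getName();
}

// Assumed dispatcher shape: receives the node and the current path to it.
interface Dispatcher {
    void dispatch(Node nd, Deque<Node> stack);
}

interface GraphWalker {
    void startWalking(List<Node> startNodes);
}

// Concrete (non-abstract) walker, usable as-is or as a base class.
class DefaultGraphWalker implements GraphWalker {
    protected final Dispatcher dispatcher;
    protected final Deque<Node> stack = new ArrayDeque<>();
    // Identity-based set so that a node shared by two parents is visited once.
    private final Set<Node> visited =
        Collections.newSetFromMap(new IdentityHashMap<>());

    DefaultGraphWalker(Dispatcher dispatcher) {
        this.dispatcher = dispatcher;
    }

    @Override
    public void startWalking(List<Node> startNodes) {
        for (Node nd : startNodes) {
            walk(nd);
        }
    }

    // Visits children first, then dispatches the node itself (assumed order).
    protected void walk(Node nd) {
        if (!visited.add(nd)) {
            return;
        }
        stack.push(nd);
        for (Node child : nd.getChildren()) {
            walk(child);
        }
        dispatcher.dispatch(nd, stack);
        stack.pop();
    }
}

class TreeNode implements Node {
    private final String name;
    private final List<Node> children;

    TreeNode(String name, Node... kids) {
        this.name = name;
        this.children = new ArrayList<>(Arrays.asList(kids));
    }

    @Override
    public List<Node> getChildren() { return children; }

    @Override
    public String getName() { return name; }
}

class WalkSketch {
    // Walks a small DAG in which B and C share the child D.
    static List<String> order() {
        Node d = new TreeNode("D");
        Node root = new TreeNode("A", new TreeNode("B", d), new TreeNode("C", d));
        List<String> seen = new ArrayList<>();
        GraphWalker walker = new DefaultGraphWalker((nd, stack) -> seen.add(nd.getName()));
        walker.startWalking(Collections.singletonList(root));
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(order()); // prints [D, B, C, A]
    }
}
```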


> Refactor code to use a single graph, nodeprocessor, dispatcher and rule abstraction
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-186
>                 URL: https://issues.apache.org/jira/browse/HIVE-186
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Ashish Thusoo
>            Assignee: Ashish Thusoo
>         Attachments: patch-186.txt
>
>
> Currently, the query processor has two different tree and rule abstractions - one for
> ASTs and one for Operator Graphs. We should clean this up so that we have a single abstraction
> that can be reused at different stages in the query compiler.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

