hadoop-hive-dev mailing list archives

From "John Sichi (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-1496) enhance CREATE INDEX to support immediate index build
Date Wed, 22 Sep 2010 19:40:34 GMT

    [ https://issues.apache.org/jira/browse/HIVE-1496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12913737#action_12913737 ]

John Sichi commented on HIVE-1496:
----------------------------------

The implementation for this will need to chain a task that does the actual index building
with a task that does the metastore update.  It should be similar to CREATE TABLE AS
SELECT (which both creates the table definition in the metastore and does the equivalent of
an INSERT to populate it with the SELECT results).

Use "EXPLAIN CREATE TABLE p AS SELECT * FROM pokes;" to see the combined plan.  And see the
end of SemanticAnalyzer.genMapRedTasks for where it chains the tasks together.

{noformat}
    if (qb.isCTAS()) {
      // generate a DDL task and make it a dependent task of the leaf
      ...
{noformat}
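
As a rough illustration of the chaining pattern (not the actual Hive code), the sketch
below makes a metastore-update task a dependent of the data-loading leaf, so it only runs
after the leaf.  The Task class here is a hypothetical stand-in written for this example,
and the task names are made up.

```java
import java.util.ArrayList;
import java.util.List;

class ChainSketch {
    /** Hypothetical stand-in for a plan task; not Hive's own Task class. */
    static class Task {
        final String name;
        final List<Task> dependents = new ArrayList<>();

        Task(String name) {
            this.name = name;
        }

        /** Registers a task that must run only after this one has finished. */
        void addDependentTask(Task t) {
            dependents.add(t);
        }
    }

    /** Runs a task, then each of its dependents, recording the execution order. */
    static void run(Task t, List<String> log) {
        log.add(t.name);
        for (Task d : t.dependents) {
            run(d, log);
        }
    }

    /** CTAS-style ordering: populate the data first, then update the metastore. */
    static List<String> demo() {
        Task leaf = new Task("populate data");
        Task ddl = new Task("update metastore");
        leaf.addDependentTask(ddl);

        List<String> log = new ArrayList<>();
        run(leaf, log);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [populate data, update metastore]
    }
}
```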

For immediate index build, we want to combine the existing CREATE INDEX with ALTER INDEX REBUILD.
One hiccup may be that the rebuild already wants the index to be defined in the metastore,
whereas for CREATE TABLE AS SELECT we do it in the opposite order (only populating the metastore
after the data is successfully loaded).  It may be acceptable to just make the CREATE INDEX
non-atomic (i.e. populate the metastore first, and if the rebuild fails, we leave the index
empty; the user can retry with ALTER INDEX REBUILD, same as if it had been deferred in the
first place).
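
The non-atomic ordering suggested above could look roughly like the following sketch.
Everything in it is an assumption made for illustration: the in-memory map standing in
for the metastore, the method names, and the failRebuild flag used to simulate a failed
job.  It only shows that a failed rebuild leaves the index defined but unbuilt, and that a
later rebuild can fill it in.

```java
import java.util.HashMap;
import java.util.Map;

class NonAtomicIndexSketch {
    /** Hypothetical "metastore": index name -> whether the index has been built. */
    static final Map<String, Boolean> metastore = new HashMap<>();

    /** Hypothetical rebuild step; throws when asked to, to simulate a job failure. */
    static void rebuild(String index, boolean fail) {
        if (fail) {
            throw new RuntimeException("rebuild failed");
        }
        metastore.put(index, true);
    }

    /**
     * Defines the index first, then attempts the build. On failure the
     * definition is kept and the index is simply left empty.
     */
    static boolean createIndexImmediate(String index, boolean failRebuild) {
        metastore.put(index, false); // metastore is populated before the build
        try {
            rebuild(index, failRebuild);
            return true;
        } catch (RuntimeException e) {
            return false; // no rollback: the user can retry the rebuild later
        }
    }

    public static void main(String[] args) {
        boolean built = createIndexImmediate("idx", true);
        System.out.println(built + " " + metastore.get("idx"));
        rebuild("idx", false); // the retry, analogous to ALTER INDEX REBUILD
        System.out.println(metastore.get("idx"));
    }
}
```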

Ning Zhang (nzhang at facebook dot com) did the CREATE TABLE AS SELECT implementation, so
he may be able to provide help if you run into trouble with this one.


> enhance CREATE INDEX to support immediate index build
> -----------------------------------------------------
>
>                 Key: HIVE-1496
>                 URL: https://issues.apache.org/jira/browse/HIVE-1496
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Indexing
>    Affects Versions: 0.7.0
>            Reporter: John Sichi
>            Assignee: Russell Melick
>             Fix For: 0.7.0
>
>
> Currently we only support WITH DEFERRED REBUILD.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

