hadoop-hive-dev mailing list archives

From "He Yongqiang (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-417) Implement Indexing in Hive
Date Fri, 16 Jul 2010 22:45:54 GMT

    [ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12889376#action_12889376 ]

He Yongqiang commented on HIVE-417:
-----------------------------------

Thanks for the detailed comments.

>>We should support a property on the index which controls the name of the index table,
and only generate an index table name automatically in the case where the user doesn't supply
the property. 
Will add this in the next patch.

>>For this, we'll need to add property key/values to the grammar (IDXPROPERTIES like
TBLPROPERTIES and SERDEPROPERTIES?).
Let's do it in a follow-up JIRA.

>>The grammar supports control over the tableFileFormat for the index table; what about
other attributes such as row format, location, and TBLPROPERTIES? Some of these may be dictated
by the index implementation, but it may be useful to override in some cases (same as tableFileFormat).
We can add this when we see the requirement. For now we can leave this out.

>>I think we should track the status of the index (when was the last time it was rebuilt,
if ever) so that we know whether it is fresh with respect to the base table data. How should
we model this in such a way that it takes per-partition indexing into account?
I think this can be handled the same way as the key/value properties above, no?
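
As a rough illustration of the key/value idea (a sketch only, not part of any patch): the last rebuild time could be stored as one property per partition and compared against the base partition's last modification time. The key prefix and both helper functions below are hypothetical names, not Hive APIs.

```python
from typing import Dict

# Hypothetical property-key prefix; this is not a name used by Hive.
REBUILD_KEY_PREFIX = "index.last_rebuild_time."


def record_rebuild(props: Dict[str, str], partition: str, rebuilt_at: int) -> None:
    """Store the rebuild time (epoch seconds) for one partition as a string property."""
    props[REBUILD_KEY_PREFIX + partition] = str(rebuilt_at)


def is_fresh(props: Dict[str, str], partition: str, base_modified_at: int) -> bool:
    """Return True if the partition's index was rebuilt at or after the base data last changed."""
    raw = props.get(REBUILD_KEY_PREFIX + partition)
    if raw is None:
        return False  # the index was never built for this partition
    return int(raw) >= base_modified_at
```

With this layout, a partition that has never been indexed simply has no entry, so it is reported as not fresh without needing a separate status flag.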

>>Test queries: remember to add ORDER BY for determinism. 
Will add this in the next patch.

>>Also, I'm not sure whether it is safe to use /tmp in the local file system (it may
not exist, e.g. on Windows). I used it in hbase_bulk.m, but that uses a mini HDFS cluster
(not the local file system).
I think it should be OK, because it's not the local /tmp; it's /tmp on the mini HDFS cluster.

>>Dropping a table with an index on it currently gives the exception below (in Derby;
I didn't test MySQL yet). Same for attempting to drop an index table directly (instead of
dropping the index). The second case should either fail with a meaningful exception, or implicitly
drop the index definition as a trigger from dropping the table.
Actually, this was already reported by Prafulla offline. I will fix it in the next patch. For the
second case, I am planning to report an error.

> Implement Indexing in Hive
> --------------------------
>
>                 Key: HIVE-417
>                 URL: https://issues.apache.org/jira/browse/HIVE-417
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>    Affects Versions: 0.3.0, 0.3.1, 0.4.0, 0.6.0
>            Reporter: Prasad Chakka
>            Assignee: He Yongqiang
>         Attachments: hive-417.proto.patch, hive-417-2009-07-18.patch, hive-indexing-8-thrift-metastore-remodel.patch,
hive-indexing.3.patch, hive-indexing.5.thrift.patch, idx2.png, indexing_with_ql_rewrites_trunk_953221.patch
>
>
> Implement indexing on Hive so that lookup and range queries are efficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

