hadoop-hive-dev mailing list archives

From "Joydeep Sen Sarma (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HIVE-417) Implement Indexing in Hive
Date Mon, 01 Jun 2009 17:17:07 GMT

    [ https://issues.apache.org/jira/browse/HIVE-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12715153#action_12715153 ]

Joydeep Sen Sarma commented on HIVE-417:

- Are we going to have one index file per HDFS file, or one per partition?

A related question is how this will interact with sampling. (I think the sampling
predicate is currently optimized out for bucketed tables, although I am not entirely sure.)

I would love to see the API for invoking the index.
- Ideally we would be able to plug in different indexing schemes, including for map-side
joins. The hashmap storing the smaller table can be seen as an index on that table. It
would seem that one should be able to replace a map-side join based on tables loaded into
jdbm with one based on tables carrying the indices proposed here (and thereby do joins
based on indices almost trivially).
- We should enable people to plug in their own indices (since it is quite likely
that over time there will be multiple indexing efforts on Hadoop files).
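
To make the plug-in idea concrete, here is a rough sketch of what such an API might look like. It is not taken from the attached patch: TableIndex, HashTableIndex and join are hypothetical names, and the in-memory hash map simply stands in for whatever index (jdbm-backed or otherwise) gets plugged in. The point is only that the join loop depends on the lookup interface, not on how the index is stored.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class IndexJoinSketch {

    /** Hypothetical plug-in point: returns the indexed rows matching a key. */
    interface TableIndex<K, R> {
        List<R> lookup(K key);
    }

    /** In-memory hash index, i.e. what a map-side join builds over the smaller table. */
    static final class HashTableIndex<K, R> implements TableIndex<K, R> {
        private final Map<K, List<R>> rowsByKey = new HashMap<>();

        void put(K key, R row) {
            rowsByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(row);
        }

        @Override
        public List<R> lookup(K key) {
            return rowsByKey.getOrDefault(key, Collections.emptyList());
        }
    }

    /**
     * Joins big-table rows, given as {key, value} pairs, against any index on
     * the smaller table. Rows without a match are dropped (inner join).
     */
    static List<String> join(List<String[]> bigRows, TableIndex<String, String> smallIndex) {
        List<String> out = new ArrayList<>();
        for (String[] row : bigRows) {
            for (String match : smallIndex.lookup(row[0])) {
                out.add(row[0] + "\t" + row[1] + "\t" + match);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data, purely for illustration.
        HashTableIndex<String, String> users = new HashTableIndex<>();
        users.put("u1", "alice");
        users.put("u2", "bob");

        List<String[]> clicks = new ArrayList<>();
        clicks.add(new String[] {"u1", "/home"});
        clicks.add(new String[] {"u3", "/cart"});
        clicks.add(new String[] {"u2", "/search"});

        for (String line : join(clicks, users)) {
            System.out.println(line);
        }
    }
}
```

Swapping HashTableIndex for an implementation that reads a persisted index would leave join untouched, which is roughly the property asked for above.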

> Implement Indexing in Hive
> --------------------------
>                 Key: HIVE-417
>                 URL: https://issues.apache.org/jira/browse/HIVE-417
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>    Affects Versions: 0.2.0, 0.3.0, 0.3.1, 0.4.0
>            Reporter: Prasad Chakka
>            Assignee: He Yongqiang
>         Attachments: hive-417.proto.patch
> Implement indexing on Hive so that lookup and range queries are efficient.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.