drill-issues mailing list archives

From "Aman Sinha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DRILL-3929) Support the ability to query database tables using external indices
Date Thu, 15 Oct 2015 16:48:06 GMT

    [ https://issues.apache.org/jira/browse/DRILL-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959212#comment-14959212 ]

Aman Sinha commented on DRILL-3929:

 > As a side point on this, I also think we need to fix the HBase pushdown so it behaves more like the JDBC plugin
Yes, avoiding the checks to determine the multimode patterns is not ideal.
[~jnadeau] do you want to create a JIRA for it?

Regarding the Phoenix approach, there are a few considerations: 
(1) Is Phoenix registering an alternative physical plan or alternative SQL? I think it is
the latter (SQL).  There are pros and cons:
 (a) Covering index (all columns are available in the index): The SQL
 approach could work since the original query 'SELECT * FROM T
 WHERE index_col < 10' can be rewritten to use the index only.
 (b) The general case of a non-covering index: for such cases, we may
 only be retrieving the rowid/rowkey from the index, so we have to join back
 to the original table to retrieve the rest of the columns. This should ideally be
 done at the physical planning level rather than trying to express such
 semantics in SQL.
 (c) Doing something at the SQL or even at the logical planning level 
   means that the search space will increase due to treating a materialized 
   view/index as a separate table, putting it in the same equivalence class
   as the original table. 
(2) Costing and statistics: 
   (a) Index lookups have a random I/O pattern compared to table scans,
    so they must be costed differently.  I am not sure how to even model
    the cost of external secondary indexes such as Elastic or Lucene. 
    Phoenix secondary indexes for HBase are more 'native' so they could 
    have a decent cost model.   
   (b) In order to generate an index scan plan, I would think Phoenix might
    rely on filter selectivity estimates. However, this statistic is not always
    available and is non-trivial to compute for complex predicates.
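
To make the costing concern in (2) a bit more concrete, here is a rough Python sketch of how a planner might compare a full table scan against an index scan, with and without the join back to the base table from (1)(b). This is not Drill or Phoenix code: the names `TableStats`, `full_scan_cost`, `index_scan_cost`, and `choose_plan`, as well as the I/O weights, are all hypothetical and chosen only for illustration. It also assumes a selectivity estimate is available, which, as noted above, is often not the case.

```python
from dataclasses import dataclass
from math import ceil

# Hypothetical relative I/O weights for illustration only; these are NOT
# constants taken from Drill, Phoenix, or any real cost model.
SEQ_IO_COST = 1.0      # cost of reading one block sequentially
RANDOM_IO_COST = 4.0   # cost of one random row lookup by rowkey


@dataclass
class TableStats:
    row_count: int        # total rows in the base table
    rows_per_block: int   # rows returned per sequential block read


def full_scan_cost(t: TableStats) -> float:
    """Sequentially read every block of the base table."""
    return ceil(t.row_count / t.rows_per_block) * SEQ_IO_COST


def index_scan_cost(t: TableStats, selectivity: float, covering: bool) -> float:
    """Read the matching index entries; if the index is non-covering,
    add one random lookup per rowkey to join back to the base table."""
    matching_rows = round(t.row_count * selectivity)
    cost = ceil(matching_rows / t.rows_per_block) * SEQ_IO_COST
    if not covering:
        cost += matching_rows * RANDOM_IO_COST
    return cost


def choose_plan(t: TableStats, selectivity: float, covering: bool) -> str:
    """Pick the cheaper plan under this toy model."""
    if index_scan_cost(t, selectivity, covering) < full_scan_cost(t):
        return "covering_index_scan" if covering else "index_scan_with_join_back"
    return "full_table_scan"
```

Under these made-up weights, for a table of 1,000,000 rows at 100 rows per block, the non-covering index plan is cheaper at a selectivity of 0.001 but loses to the full scan at 0.1, while a covering index still wins at 0.1. The crossover point depends entirely on the selectivity estimate and the random I/O weight, which is exactly what is hard to obtain for external indexes.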

If you have thoughts about these, let me know.  I would like to
understand the Phoenix approach a little better; perhaps a Google
Hangout would help.

> Support the ability to query database tables using external indices           
> ------------------------------------------------------------------------------
>                 Key: DRILL-3929
>                 URL: https://issues.apache.org/jira/browse/DRILL-3929
>             Project: Apache Drill
>          Issue Type: New Feature
>          Components: Execution - Relational Operators, Query Planning & Optimization
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
> This is a placeholder for adding support in Drill to query database tables using external
indices.  I will add more details about the use case and a preliminary design proposal.  

This message was sent by Atlassian JIRA
