accumulo-dev mailing list archives

From "Fagan, Michael" <Michael_Fa...@comcast.com>
Subject Re: Implementing Index Table for Accumulo Hive Queries
Date Mon, 09 Jan 2017 19:59:35 GMT
Josh,

Thanks. It looks like if I can override getRanges() from the AccumuloPredicateHandler,
I might be able to build the correct ranges based on matching index rows.
Does this sound feasible?
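
As a rough sketch of that idea only: the plain Java below uses an in-memory map as a stand-in for the index table and a small hypothetical RowRange class as a stand-in for an Accumulo Range. None of the names here (IndexRangeSketch, RowRange, index, rangesFor) come from Hive or Accumulo, and whether getRanges() can actually be overridden this way is not confirmed anywhere in this thread. It only shows the lookup step: find the row ids that the index lists for a value, then emit one single-row range per match so the main table is not scanned in full.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class IndexRangeSketch {

    /** Closed row range [start, end]; hypothetical stand-in for an Accumulo Range. */
    public static final class RowRange {
        public final String start;
        public final String end;

        public RowRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + "]";
        }
    }

    /**
     * Hypothetical in-memory index: indexed value -> sorted row ids of the
     * main table that hold that value. A real index would be its own table.
     */
    public static final TreeMap<String, SortedSet<String>> INDEX = new TreeMap<>();

    /** Record that the main-table row rowId contains the given value. */
    public static void index(String value, String rowId) {
        INDEX.computeIfAbsent(value, k -> new TreeSet<>()).add(rowId);
    }

    /** Look up a value in the index and emit one single-row range per matching row id. */
    public static List<RowRange> rangesFor(String value) {
        SortedSet<String> rowIds =
                INDEX.getOrDefault(value, Collections.<String>emptySortedSet());
        List<RowRange> ranges = new ArrayList<>();
        for (String rowId : rowIds) {
            ranges.add(new RowRange(rowId, rowId));
        }
        return ranges;
    }

    public static void main(String[] args) {
        index("denver", "row003");
        index("denver", "row001");
        index("boston", "row002");
        System.out.println(rangesFor("denver"));
        System.out.println(rangesFor("nowhere"));
    }
}
```

A value with no index entry yields an empty list here; a real handler would have to decide whether that means "scan nothing" or "fall back to a full scan".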

Regards,
Mike Fagan

On 1/9/17, 12:38 PM, "Josh Elser" <josh.elser@gmail.com> wrote:

    Hi Mike,
    
    As far as I understand it, the Hive storage handler APIs (which is how 
    the Accumulo integration is implemented) don't expose any ability to 
    use index tables to answer a query.
    
    This means that the only thing you can do to make queries faster would 
    be to create a number of tables, pivoted on the columns you care about, 
    putting the important columns in the rowId. Then, you would have to know 
    which table to use at the application layer.
    
    Admittedly, this is pretty lacking. I'd have to go look at the Hive 
    community to see if this is something that's been built there.
    
    - Josh
    
    Fagan, Michael wrote:
    > Hi,
    >
    > I am looking to utilize an index table to avoid full table scans and speed up Hive
    > queries against an external Accumulo table.
    >
    > Has anyone done this yet? Can someone point me in the right direction?
    >
    > Regards,
    > Mike Fagan
    >
    
    
