hbase-issues mailing list archives

From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (HBASE-3342) Server-side Row-level Inverted Index Join via Coprocessors
Date Mon, 29 Dec 2014 19:45:14 GMT

     [ https://issues.apache.org/jira/browse/HBASE-3342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-3342.
    Resolution: Later

Nice idea but no movement in years. Resolving as later.

> Server-side Row-level Inverted Index Join via Coprocessors
> ----------------------------------------------------------
>                 Key: HBASE-3342
>                 URL: https://issues.apache.org/jira/browse/HBASE-3342
>             Project: HBase
>          Issue Type: New Feature
>          Components: Coprocessors
>            Reporter: Jonathan Gray
> A common schema in HBase is to create an inverted index per row (a la inbox search),
where a row is a user/entity, each column is a word, and versions are instances of that word
in documents (values can be empty or can contain additional scoring info such as position or
count information).
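As a rough illustration of the schema described above, here is a minimal in-memory sketch in Python. It is not the HBase client API: the `IndexRow` class and its `put`, `get`, and `get_prefix` methods are hypothetical names invented for this example, and it assumes that the version timestamp of a cell doubles as the document identifier.

```python
from collections import defaultdict


class IndexRow:
    """Hypothetical stand-in for one HBase row holding an inverted index.

    Each column is a word; each version of that column is one occurrence of
    the word in a document. Assumption: the version timestamp identifies the
    document, and the value is optional scoring info.
    """

    def __init__(self):
        self.columns = defaultdict(list)  # word -> [(timestamp, value), ...]

    def put(self, word, timestamp, value=b""):
        versions = self.columns[word]
        versions.append((timestamp, value))
        # Keep versions newest first, as a versioned read would return them.
        versions.sort(key=lambda cell: cell[0], reverse=True)

    def get(self, word, max_versions):
        """Return up to max_versions newest versions of an exact word."""
        return self.columns.get(word, [])[:max_versions]

    def get_prefix(self, prefix, max_versions):
        """Return up to max_versions newest versions of every word with the prefix."""
        return {
            word: versions[:max_versions]
            for word, versions in sorted(self.columns.items())
            if word.startswith(prefix)
        }


# One row per user: the row below models a single user's inbox index.
inbox = IndexRow()
inbox.put("foo", 100)
inbox.put("foo", 300)
inbox.put("bar", 300)
inbox.put("barn", 200, b"pos=4")
newest_foo = inbox.get("foo", 1)
bar_words = inbox.get_prefix("bar", 5)
```

In this sketch an exact-word lookup reads one column, while a prefix lookup reads every column whose name starts with the prefix.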
> When querying indexes like this, we may want to ask something like: give me the N most
recent documents that contain the word "foo" (exact word matching) and also contain a word
that starts with "bar" (prefix matching).
> Currently this join has to be done on the client side, so we may have to read far more
than N documents for each word to get N documents that match both words.
This gets worse as the number of words increases.
> We could implement this join on the server-side in a coprocessor.
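The difference between the two approaches can be sketched in plain Python. This is only an illustration under stated assumptions, not coprocessor code: the index is modeled as a dict from word to a newest-first list of document timestamps, timestamps are assumed unique within a column, N is assumed to be at least 1, and the function names `client_side_join` and `server_side_join` are hypothetical.

```python
import heapq


def client_side_join(index, word, prefix, n):
    """Simulate the client-side join: fetch growing batches until n matches are certain.

    Returns (matches, cells_transferred). Assumes n >= 1.
    """
    columns = [word] + [w for w in index if w.startswith(prefix) and w != word]
    k, transferred = n, 0
    while True:
        # One simulated read: the k newest versions of every column involved.
        cells = {w: index.get(w, [])[:k] for w in columns}
        transferred += sum(len(v) for v in cells.values())
        # A column that returned exactly k versions may have more we did not see.
        truncated = [v for v in cells.values() if len(v) == k]
        horizon = max((v[-1] for v in truncated), default=None)
        exact = set(cells[word])
        pref = set().union(*(cells[w] for w in columns[1:]))
        matches = sorted(exact & pref, reverse=True)
        if horizon is not None:
            # Only matches at or above the horizon are known to be complete.
            matches = [t for t in matches if t >= horizon]
        if len(matches) >= n or not truncated:
            return matches[:n], transferred
        k *= 2


def server_side_join(index, word, prefix, n):
    """Simulate a server-side join: merge the sorted columns and return only n matches."""
    prefix_lists = [v for w, v in index.items() if w.startswith(prefix) and w != word]
    exact = iter(index.get(word, []))
    pref = heapq.merge(*prefix_lists, reverse=True)  # union of prefix columns, newest first
    matches = []
    a, b = next(exact, None), next(pref, None)
    while a is not None and b is not None and len(matches) < n:
        if a == b:
            matches.append(a)
            a, b = next(exact, None), next(pref, None)
        elif a > b:
            a = next(exact, None)
        else:
            b = next(pref, None)
    return matches


index = {
    "foo": [10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
    "bar": [20, 19, 18, 17, 3],
    "barn": [16, 15, 14, 1],
}
client_matches, cells_transferred = client_side_join(index, "foo", "bar", 2)
server_matches = server_side_join(index, "foo", "bar", 2)
```

Both functions find the same documents, but the client-side version has to pull many more cells than N across the wire before it can be sure of its answer, while the server-side version would return only the N matching entries.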

This message was sent by Atlassian JIRA
