phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Resolved] (PHOENIX-1247) Join using sorted data
Date Fri, 05 Dec 2014 04:34:15 GMT

     [ https://issues.apache.org/jira/browse/PHOENIX-1247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Taylor resolved PHOENIX-1247.
-----------------------------------
    Resolution: Duplicate

Closing as a duplicate of PHOENIX-1179, but please open a more specific JIRA if any
part of this issue is not covered by that one.

> Join using sorted data
> ----------------------
>
>                 Key: PHOENIX-1247
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1247
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Brian Johnson
>
> Similar to Pig's merge join, Phoenix should have a join that takes advantage of the sorted
> nature of HBase row keys. If two tables each have a column that is sorted in the same order as
> the row key, they can be joined efficiently without keeping either table in RAM. This also depends
> on using a split policy that ensures the keys will be in the same region, such as DelimitedKeyPrefixRegionSplitPolicy.
>
> As an example, we keep user data in HBase where the first part of the key is the user
> ID and the second part makes it unique for each event. We then have a column that is just
> the user ID, which will always be sorted because of the row key.
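
The join described above can be illustrated with a minimal sort-merge join sketch in Python. This is not Phoenix or HBase code: the function name merge_join, the sample rows, and the use of plain in-memory lists in place of table scans are all assumptions made for illustration. It only shows why two inputs already sorted on the join key can be joined in a single pass, buffering at most the rows for one key from one side.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key=itemgetter(0)):
    """Inner-join two iterables that are ALREADY sorted by `key`.

    Hypothetical helper for illustration only. Each input is read once,
    front to back; only the right-hand rows for the current key are buffered.
    """
    end = (None, None)  # sentinel returned when a side is exhausted
    left_groups = groupby(left, key)
    right_groups = groupby(right, key)
    lk, lrows = next(left_groups, end)
    rk, rrows = next(right_groups, end)

    while lrows is not None and rrows is not None:
        if lk < rk:
            # Left key has no match yet; advance the left side.
            lk, lrows = next(left_groups, end)
        elif lk > rk:
            # Right key has no match; advance the right side.
            rk, rrows = next(right_groups, end)
        else:
            # Keys match: emit every pairing for this key, then advance both.
            right_buf = list(rrows)
            for lrow in lrows:
                for rrow in right_buf:
                    yield lrow, rrow
            lk, lrows = next(left_groups, end)
            rk, rrows = next(right_groups, end)


# Made-up sample data: both lists are sorted by user ID (first field).
users = [("u1", "alice"), ("u2", "bob"), ("u4", "dave")]
events = [("u1", "login"), ("u1", "click"), ("u3", "login"), ("u4", "logout")]

for user_row, event_row in merge_join(users, events):
    print(user_row, event_row)
```

In a distributed setting the same single-pass merge would only work per region if matching keys from both tables are co-located, which is the role the issue assigns to the split policy.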



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
