phoenix-dev mailing list archives

From "James Taylor (JIRA)" <>
Subject [jira] [Resolved] (PHOENIX-1247) Join using sorted data
Date Fri, 05 Dec 2014 04:34:15 GMT


James Taylor resolved PHOENIX-1247.
    Resolution: Duplicate

Closing as duplicate of PHOENIX-1179, but please reopen a more specific JIRA if there's any
part of this issue not covered by that one.

> Join using sorted data
> ----------------------
>                 Key: PHOENIX-1247
>                 URL:
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: Brian Johnson
> Similar to Pig's merge join, Phoenix should have a join that takes advantage of the sorted
> nature of HBase keys. If you have two tables that each have a column sorted in the same order
> as the rowkey, you can join them efficiently without keeping either table in RAM. This also
> depends on using a split policy that ensures the keys will be in the same region, such as
> DelimitedKeyPrefixRegionSplitPolicy.
> As an example, we keep user data in HBase where the first part of the key is the user
> ID and the second part makes it unique for each event. We then have a column which is just
> the user ID, which will always be sorted because of the rowkey.
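
For readers unfamiliar with the technique the reporter describes, here is a minimal sketch of a
generic sort-merge join over two inputs that are already sorted on the join key. It is only an
illustration of the general idea, not Phoenix's or Pig's implementation; the function name
merge_join and the sample "events" and "profiles" rows are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key=itemgetter(0)):
    """Inner-join two iterables that are already sorted ascending by `key`.

    Only the right-hand rows for the current key are buffered, so neither
    input is ever held in memory in full.
    """
    left_groups = groupby(left, key)
    right_groups = groupby(right, key)
    lk, lrows = next(left_groups, (None, None))
    rk, rrows = next(right_groups, (None, None))
    while lrows is not None and rrows is not None:
        if lk < rk:
            # Left key has no match yet: advance the left side.
            lk, lrows = next(left_groups, (None, None))
        elif lk > rk:
            # Right key has no match yet: advance the right side.
            rk, rrows = next(right_groups, (None, None))
        else:
            # Keys match: pair every left row with every right row for this key.
            right_buf = list(rrows)
            for lrow in lrows:
                for rrow in right_buf:
                    yield lrow, rrow
            lk, lrows = next(left_groups, (None, None))
            rk, rrows = next(right_groups, (None, None))


# Hypothetical data: rows sorted by user id, as a rowkey prefix would guarantee.
events = [("u1", "e1", "click"), ("u1", "e2", "view"), ("u3", "e1", "click")]
profiles = [("u1", "Ann"), ("u2", "Bob"), ("u3", "Cy")]
joined = list(merge_join(events, profiles))
```

Each input is read once, front to back, which is why the approach depends on both sides being
sorted on the join column.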

This message was sent by Atlassian JIRA
