phoenix-dev mailing list archives

From "Tulasi P (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-1737) Provide APIs for creating Phoenix encoded rowkeys
Date Mon, 16 Mar 2015 22:03:38 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14364057#comment-14364057 ]

Tulasi P commented on PHOENIX-1737:
-----------------------------------

configureIncrementalLoad(...) configures the map-reduce job and sets up the reducer for sorting
(KeyValueSortReducer). The code snippet I provided in the description runs inside the map(...)
method, and the output value of the mapper is a KeyValue object. I'll send out the complete
example shortly. I have gz compression and fast_diff encoding on the HBase table, which should
take care of deduping. Do let me know if I'm missing something.
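
For readers following the thread, the sort step mentioned above can be sketched in plain Java. This is a minimal illustration, not HBase code: SortReducerSketch and Cell are hypothetical names, and ordering cells by qualifier alone is a simplifying assumption (as far as I know the real KeyValue comparator also considers row, family, timestamp and type). It shows the general idea of collecting the cells for one row into a sorted set before they are written out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for the sort done on the reduce side; not HBase code.
final class SortReducerSketch {

    // Simplified cell: a real KeyValue also carries row, family, timestamp and type.
    static final class Cell {
        final String qualifier;
        final String value;

        Cell(String qualifier, String value) {
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    // Sorts the cells emitted for one row key. Cells that compare equal
    // (here: same qualifier, an assumption of this sketch) collapse to the
    // first one added, because a TreeSet keeps one element per comparator key.
    static List<Cell> sortCellsForRow(List<Cell> cells) {
        TreeSet<Cell> sorted = new TreeSet<>(Comparator.comparing((Cell c) -> c.qualifier));
        sorted.addAll(cells);
        return new ArrayList<>(sorted);
    }
}
```

Because the set keeps a single element per comparator key, exact duplicates emitted by the mapper for the same row can collapse at this stage. Whether that matches KeyValueSortReducer in every detail is an assumption of this sketch.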

> Provide APIs for creating Phoenix encoded rowkeys
> -------------------------------------------------
>
>                 Key: PHOENIX-1737
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1737
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Tulasi P
>
> Here is the code I used for direct Phoenix encoding of the composite rowkey. Bulk-loading
> data with direct encoding can give up to 4x better performance compared to the JDBC path
> used in the default CSV bulk-loader.
> Providing APIs for performing Phoenix encoding will be useful in such scenarios.
> {code}
> // rowkey is a 3 column (unsigned & fixed-size) composite key
> // 3 column qualifiers - q1, q2, q3
> ImmutableBytesWritable outputKey = new ImmutableBytesWritable();
> byte[] key1 = new byte[1];
> byte[] key2 = new byte[4];
> byte[] key3 = new byte[4];
> // byte 0 is reserved for the salt byte; the key columns start at offset 1
> byte[] outKeyByteArr = new byte[1 + key1.length + key2.length + key3.length];
> byte[] saltedKeyByteArr = new byte[outKeyByteArr.length];
> System.arraycopy(key1, 0, outKeyByteArr, 1, key1.length);
> System.arraycopy(key2, 0, outKeyByteArr, 1+key1.length, key2.length);
> System.arraycopy(key3, 0, outKeyByteArr, 1+key1.length+key2.length, key3.length);
> saltedKeyByteArr = SaltingUtil.getSaltedKey(new ImmutableBytesWritable(outKeyByteArr), NUM_BUCKETS);
> outputKey.set(saltedKeyByteArr);
> kv = new KeyValue(outputKey.get(),"0".getBytes(), "q1".getBytes(), v1.getBytes());
> context.write(outputKey, kv);
> kv = new KeyValue(outputKey.get(),"0".getBytes(), "q2".getBytes(), v2.getBytes());
> context.write(outputKey, kv);
> kv = new KeyValue(outputKey.get(),"0".getBytes(), "q3".getBytes(), v3.getBytes());
> context.write(outputKey, kv);
> kv = new KeyValue(outputKey.get(),"0".getBytes(), "_0".getBytes());
> context.write(outputKey, kv);
> {code}
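
The byte layout built in the snippet above can be sketched without any HBase or Phoenix dependency. This is a rough illustration under stated assumptions: CompositeRowKeySketch, encode and saltByte are hypothetical names, the key columns are assumed to be one unsigned 1-byte value followed by two non-negative 4-byte big-endian integers, and the hash used for the salt byte is a stand-in that may not match what Phoenix computes. A real job should keep calling SaltingUtil.getSaltedKey, as the description does.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Phoenix: lays out a salted, fixed-width composite key.
final class CompositeRowKeySketch {

    // Stand-in hash over the key bytes that follow the salt slot. This is an
    // assumption for illustration only; use SaltingUtil.getSaltedKey in a real job.
    static byte saltByte(byte[] key, int offset, int length, int numBuckets) {
        int hash = 1;
        for (int i = offset; i < offset + length; i++) {
            hash = 31 * hash + key[i];
        }
        return (byte) Math.abs(hash % numBuckets);
    }

    // Layout: [salt:1][k1:1][k2:4][k3:4], mirroring the 1 + 1 + 4 + 4 sizes above.
    static byte[] encode(byte k1, int k2, int k3, int numBuckets) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 1 + Integer.BYTES + Integer.BYTES);
        buf.put((byte) 0);  // placeholder for the salt byte
        buf.put(k1);
        buf.putInt(k2);     // ByteBuffer writes big-endian by default
        buf.putInt(k3);
        byte[] key = buf.array();
        // Fill the salt slot from the remaining key bytes.
        key[0] = saltByte(key, 1, key.length - 1, numBuckets);
        return key;
    }
}
```

For example, CompositeRowKeySketch.encode((byte) 7, 1, 258, 16) yields a 10-byte key whose first byte is a bucket number in the range 0 to 15, followed by the three columns at fixed offsets.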



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
