hive-dev mailing list archives

From "Swarnim Kulkarni (JIRA)" <>
Subject [jira] [Commented] (HIVE-7048) CompositeKeyHBaseFactory should not use FamilyFilter
Date Mon, 19 May 2014 18:42:39 GMT


Swarnim Kulkarni commented on HIVE-7048:

My earlier explanation was a little unclear. :) What I meant was specific
to PrefixFilter. Suppose we have a row key "A-B-C" that someone has custom serialized, and
this key is represented in Hive as struct<a:string,b:string,c:string>. With a query such as
key.a="A", the value "A" gets pushed down. Now, to use the PrefixFilter, if we convert it to bytes

scanRange.setFilter(new PrefixFilter(getBytes(value)))

my concern was that we might not match anything because of the difference in serialization.
Instead, I felt that since the consumer originally serializes the key, it's probably best
to push this down to the consumer as well: let the consumer serialize the pushed-down value,
convert it to a filter, and apply it to the scan. By default we just keep it as a no-op.
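
To make the concern concrete, here is a minimal, self-contained sketch with no HBase dependency. It assumes a hypothetical custom key layout in which each key part is written as a one-byte length followed by its UTF-8 bytes; this layout, the class name, and the helpers serializeParts and startsWith are all illustrative and not part of Hive or HBase. The startsWith helper only mimics the leading-bytes comparison that a prefix filter performs on the row key.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Illustrative only: shows why the pushed-down value should be serialized
// the same way the consumer serialized the stored row key.
final class PrefixPushdownSketch {

    // Hypothetical custom layout (an assumption, not Hive's format):
    // each part is one length byte followed by the part's UTF-8 bytes.
    static byte[] serializeParts(String... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String part : parts) {
            byte[] bytes = part.getBytes(StandardCharsets.UTF_8);
            out.write(bytes.length);           // assumes each part is under 256 bytes
            out.write(bytes, 0, bytes.length);
        }
        return out.toByteArray();
    }

    // Stand-in for a prefix comparison on the leading bytes of a row key.
    static boolean startsWith(byte[] rowKey, byte[] prefix) {
        if (prefix.length > rowKey.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (rowKey[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] storedKey = serializeParts("A", "B", "C");

        // Naive conversion of the pushed-down value, as in getBytes(value).
        byte[] naivePrefix = "A".getBytes(StandardCharsets.UTF_8);

        // Prefix produced by the same serializer that wrote the key.
        byte[] consumerPrefix = serializeParts("A");

        System.out.println("naive prefix matches:    " + startsWith(storedKey, naivePrefix));
        System.out.println("consumer prefix matches: " + startsWith(storedKey, consumerPrefix));
    }
}
```

Under this assumed layout the naive prefix fails to match because the stored key begins with a length byte rather than with "A", while the prefix built by the key's own serializer matches. With a plain delimited key such as "A-B-C" the naive bytes would happen to match, which is why the behavior depends on the consumer's serialization.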

> CompositeKeyHBaseFactory should not use FamilyFilter
> ----------------------------------------------------
>                 Key: HIVE-7048
>                 URL:
>             Project: Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>            Reporter: Swarnim Kulkarni
>            Assignee: Swarnim Kulkarni
>            Priority: Blocker
>         Attachments: HIVE-7048.1.patch.txt
> HIVE-6411 introduced a more generic way to provide composite key implementations via
> custom factory implementations. However it seems like the CompositeHBaseKeyFactory implementation
> uses a FamilyFilter for row key scans which doesn't seem appropriate. This should be investigated
> further and if possible replaced with a RowRangeScanFilter.

This message was sent by Atlassian JIRA
