hive-dev mailing list archives

From "Swarnim Kulkarni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HIVE-7048) CompositeKeyHBaseFactory should not use FamilyFilter
Date Tue, 20 May 2014 22:59:39 GMT

    [ https://issues.apache.org/jira/browse/HIVE-7048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14004098#comment-14004098 ]

Swarnim Kulkarni commented on HIVE-7048:
----------------------------------------

>  I don't understand why CompositeHBaseKeyFactory is pushing down predicates at all.

Anything important that needs to be done on the key, such as setting a filter or a scan range,
needs access to the predicates from the query, so I think a pushdown here is necessary.
Also, as we discussed before, to consume the pushed-down values correctly it is
very important to keep the pushdown order the same as the order of the fields in the struct.

> why it provides a validator that asserts that only the first field can be pushed down.

Agreed that this is not required. My only reason for it was to add preliminary support
for pushdown of structs and later extend it to full-fledged support for pushing down multiple
fields. Once that support is in place, this validator can be removed. Until then,
the validator makes sure that wrong predicates do not get pushed down.
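
To make the interim restriction concrete, here is a minimal sketch of a first-field-only check. It is not the code from the patch: the class name FirstFieldOnlyValidator and its plain-string interface are hypothetical, and the real validator works on Hive expression nodes rather than field names.

```java
import java.util.List;

// Hypothetical stand-in for the interim validator: a predicate is accepted
// for pushdown only if it targets the first field of the composite key struct.
final class FirstFieldOnlyValidator {
    // Struct field names, assumed to be listed in row-key order.
    private final List<String> structFields;

    FirstFieldOnlyValidator(List<String> structFields) {
        if (structFields.isEmpty()) {
            throw new IllegalArgumentException("struct must have at least one field");
        }
        this.structFields = structFields;
    }

    // Returns true only for the leading field; every other field is rejected,
    // so its predicate is not pushed down.
    boolean validate(String fieldName) {
        return structFields.get(0).equals(fieldName);
    }
}
```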

> If it's doing this while not setting up corresponding filter, then the result will not be correct.

Good point. Since the results won't be correct without proper handling of these predicates, and we do
not know how to handle them well enough on our end to provide a good default, I propose that
we make CompositeKeyHBaseFactory abstract rather than a concrete class and add an abstract
method, something like

{code}
public abstract HBaseScanRange setupFilter(IndexSearchCondition condition) throws Exception;
{code}

Consumers can then use the pushed-down IndexSearchCondition to set a scan range and/or
a filter on the scan.
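
As a rough illustration of what a consumer of the proposed abstract method might do, the sketch below turns an equality predicate on the leading key field into a prefix scan. Everything in it is an assumption made for the example: SearchCondition and ScanRange are simplified stand-ins for IndexSearchCondition and HBaseScanRange (the real classes differ), PrefixKeyFactory is a hypothetical subclass, and the leading field is assumed to be stored as UTF-8 bytes at the start of the row key.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Simplified, hypothetical stand-in for IndexSearchCondition.
final class SearchCondition {
    final String column;    // struct field the predicate refers to
    final String operator;  // "=" or ">=" in this sketch
    final String constant;  // literal the field is compared against

    SearchCondition(String column, String operator, String constant) {
        this.column = column;
        this.operator = operator;
        this.constant = constant;
    }
}

// Simplified, hypothetical stand-in for HBaseScanRange.
final class ScanRange {
    byte[] startRow; // inclusive; null means unbounded
    byte[] stopRow;  // exclusive; null means unbounded
}

// Mirrors the proposal: the base factory is abstract and leaves the
// translation from predicate to scan range to the implementer.
abstract class CompositeKeyFactorySketch {
    public abstract ScanRange setupFilter(SearchCondition condition) throws Exception;
}

// Hypothetical consumer, assuming the leading field is a UTF-8 key prefix.
final class PrefixKeyFactory extends CompositeKeyFactorySketch {
    @Override
    public ScanRange setupFilter(SearchCondition condition) {
        ScanRange range = new ScanRange();
        byte[] prefix = condition.constant.getBytes(StandardCharsets.UTF_8);
        if ("=".equals(condition.operator)) {
            range.startRow = prefix;             // first key with this prefix
            range.stopRow = nextPrefix(prefix);  // first key past the prefix
        } else if (">=".equals(condition.operator)) {
            range.startRow = prefix;             // open-ended upper bound
        }
        // Any other operator leaves the range unbounded (a full scan).
        return range;
    }

    // Smallest byte string greater than every key starting with prefix:
    // drop trailing 0xFF bytes, then increment the last remaining byte.
    static byte[] nextPrefix(byte[] prefix) {
        int i = prefix.length - 1;
        while (i >= 0 && prefix[i] == (byte) 0xFF) {
            i--;
        }
        if (i < 0) {
            return null; // prefix is empty or all 0xFF: no upper bound exists
        }
        byte[] next = Arrays.copyOf(prefix, i + 1);
        next[i]++;
        return next;
    }
}
```

With a design like this, an implementation that cannot translate a predicate simply returns an unbounded range, and the predicate then has to be evaluated after the scan for the results to stay correct.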

Thoughts?

> CompositeKeyHBaseFactory should not use FamilyFilter
> ----------------------------------------------------
>
>                 Key: HIVE-7048
>                 URL: https://issues.apache.org/jira/browse/HIVE-7048
>             Project: Hive
>          Issue Type: Improvement
>          Components: HBase Handler
>            Reporter: Swarnim Kulkarni
>            Assignee: Swarnim Kulkarni
>            Priority: Blocker
>         Attachments: HIVE-7048.1.patch.txt
>
>
> HIVE-6411 introduced a more generic way to provide composite key implementations via
> custom factory implementations. However it seems like the CompositeHBaseKeyFactory implementation
> uses a FamilyFilter for row key scans which doesn't seem appropriate. This should be investigated
> further and if possible replaced with a RowRangeScanFilter.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
