phoenix-dev mailing list archives

From "James Taylor (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (PHOENIX-3583) Prepare IndexMaintainer on server itself
Date Thu, 19 Jan 2017 23:14:26 GMT

    [ https://issues.apache.org/jira/browse/PHOENIX-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15830800#comment-15830800 ]

James Taylor commented on PHOENIX-3583:
---------------------------------------

Thanks for the explanation, [~elserj]. Now I understand what you're getting at. There's a
small bit of code that decides whether to tack on the IndexMaintainer to the mutations themselves
(as an attribute) or make a separate, single RPC per region server to cache them for usage
when the mutations are processed:
{code}
    public static boolean useIndexMetadataCache(PhoenixConnection connection,
            List<? extends Mutation> mutations, int indexMetaDataByteLength) {
        ReadOnlyProps props = connection.getQueryServices().getProps();
        int threshold = props.getInt(INDEX_MUTATE_BATCH_SIZE_THRESHOLD_ATTRIB,
                QueryServicesOptions.DEFAULT_INDEX_MUTATE_BATCH_SIZE_THRESHOLD);
        return (indexMetaDataByteLength > ServerCacheClient.UUID_LENGTH
                && mutations.size() > threshold);
    }
{code}
So the value of INDEX_MUTATE_BATCH_SIZE_THRESHOLD_ATTRIB determines the number of rows above
which a separate RPC is made. The default is only 3 rows. Perhaps we should bump that up substantially
if the RPCs are becoming a bottleneck? It would have the effect of making the payload larger
(by numRowsInBatchToRS * sizeofIndexMaintainer), since below the threshold the IndexMaintainer
is attached to every mutation. Unfortunately, there's no mechanism in HBase to add an attribute
only to the RPC to the RS as opposed to having to repeat it on every mutation (HBASE-9291).
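
To make the trade-off concrete, here is a rough, self-contained sketch of the same decision with the Phoenix types replaced by plain ints. The class name, `ASSUMED_UUID_LENGTH`, the batch size, and the 2,000-byte maintainer size are illustrative assumptions, not values taken from Phoenix or measured anywhere; only the shape of the check comes from the snippet above.

```java
public class IndexMetadataShippingSketch {
    // Hypothetical stand-in for ServerCacheClient.UUID_LENGTH (assumed value).
    static final int ASSUMED_UUID_LENGTH = 8;

    // Same shape as the quoted check: use the separate caching RPC only when the
    // metadata is larger than the UUID and the batch exceeds the threshold.
    static boolean useIndexMetadataCache(int batchSize, int threshold,
                                         int indexMetaDataByteLength) {
        return indexMetaDataByteLength > ASSUMED_UUID_LENGTH && batchSize > threshold;
    }

    // Extra bytes sent to one region server when the IndexMaintainer is attached
    // to every mutation: numRowsInBatchToRS * sizeofIndexMaintainer.
    static long attributePayloadBytes(int batchSize, int indexMetaDataByteLength) {
        return (long) batchSize * indexMetaDataByteLength;
    }

    public static void main(String[] args) {
        int maintainerBytes = 2_000; // illustrative size, not measured
        int batchSize = 50;          // illustrative rows per region server
        int[] thresholds = {3, 100}; // the default mentioned above vs. a bumped-up value

        for (int threshold : thresholds) {
            if (useIndexMetadataCache(batchSize, threshold, maintainerBytes)) {
                System.out.println("threshold=" + threshold
                        + ": separate caching RPC per region server");
            } else {
                System.out.println("threshold=" + threshold
                        + ": attached to mutations, about "
                        + attributePayloadBytes(batchSize, maintainerBytes)
                        + " extra bytes");
            }
        }
    }
}
```

With these made-up numbers, raising the threshold from 3 to 100 trades one extra RPC per region server for roughly batchSize * maintainerBytes of additional payload, which is the cost described above.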

> Prepare IndexMaintainer on server itself
> ----------------------------------------
>
>                 Key: PHOENIX-3583
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3583
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Ankit Singhal
>            Assignee: Ankit Singhal
>         Attachments: PHOENIX-3583.patch
>
>
> -- reuse the cache of PTable and its lifecycle.
> -- With the new implementation, we will be doing an RPC to the meta table per mini batch, which
> could be an overhead, but the same configuration "updateCacheFrequency" can be used to control
> the frequency of touching the SYSTEM.CATALOG endpoint for an updated PTable or index maintainers.
> -- It is expected that 99% of the time the table is old and the RPC will return
> an empty result (so it may be less costly), as opposed to the current implementation where
> we have to send the index maintainer payload to each region server per upsert batch.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
