hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-12117) Constructors that use Configuration may be harmful
Date Tue, 30 Sep 2014 15:58:34 GMT

    [ https://issues.apache.org/jira/browse/HBASE-12117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14153312#comment-14153312 ]

Andrew Purtell commented on HBASE-12117:

Yeah, I don't think HTable is as lightweight as we want. If an app manages its own HConnection
and creates an HTable for each interaction, as we recommend, it can pay an unexpected
cost of as much as ~20% of CPU time. This is one example where using Configuration to set up
an object is expensive. I found this while looking at something else.

Yes, I think we could cache configuration in Connection. We are using it like a factory for
HTable. Object factories would be one way to address this (anti?)pattern wherever it's costly.
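
A minimal sketch of the caching idea, not the actual HBase code: the connection resolves
the needed values once into an immutable settings object, and every table it hands out
shares that object. SlowConfig is a hypothetical stand-in for Hadoop's Configuration, and
TableSettings, CachingConnection, LightTable and the "example.*" keys are illustrative
names that do not exist in HBase.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for Hadoop's Configuration; it counts lookups so the cost is visible.
final class SlowConfig {
    private final Map<String, String> props = new HashMap<>();
    int lookups = 0;

    void set(String key, String value) {
        props.put(key, value);
    }

    int getInt(String key, int defaultValue) {
        lookups++;
        String v = props.get(key);
        return v == null ? defaultValue : Integer.parseInt(v.trim());
    }
}

// Immutable snapshot of the values a table needs, resolved exactly once.
final class TableSettings {
    final int operationTimeoutMs;
    final int scannerCaching;
    final int maxRetries;

    TableSettings(SlowConfig conf) {
        // Illustrative keys and defaults, not real HBase property names.
        this.operationTimeoutMs = conf.getInt("example.client.operation.timeout", 60000);
        this.scannerCaching = conf.getInt("example.client.scanner.caching", 100);
        this.maxRetries = conf.getInt("example.client.retries", 3);
    }
}

// The connection acts as the factory: it pays the lookup cost once, at construction.
final class CachingConnection {
    private final TableSettings settings;

    CachingConnection(SlowConfig conf) {
        this.settings = new TableSettings(conf);
    }

    LightTable getTable(String name) {
        // No configuration lookups here; the table only stores two references.
        return new LightTable(name, settings);
    }
}

// A table that is cheap to create because it never touches the configuration itself.
final class LightTable {
    final String name;
    final TableSettings settings;

    LightTable(String name, TableSettings settings) {
        this.name = name;
        this.settings = settings;
    }
}
```

With this shape, creating a table per interaction costs one small allocation regardless of
how expensive the underlying configuration lookups are. The trade-off is that later changes
to the configuration are not seen by an existing connection.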

Related: we should also create the desired RpcController object by reflection once, cache it,
and clone it for new HTables for the Connection.
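
A rough sketch of the reflect-once-then-clone idea, under the assumption that the
controller type can be copied cheaply. SimpleController and ControllerCache are
hypothetical names used only for illustration; the real RpcController may not be
Cloneable as-is.

```java
import java.lang.reflect.Constructor;

// Hypothetical controller type standing in for the RPC controller discussed above.
class SimpleController implements Cloneable {
    int priority = 0;

    public SimpleController() {
    }

    @Override
    public SimpleController clone() {
        try {
            return (SimpleController) super.clone();
        } catch (CloneNotSupportedException e) {
            // Cannot happen: this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}

// Pays the reflection cost once, then hands out copies of a cached prototype.
final class ControllerCache {
    private final SimpleController prototype;

    ControllerCache(String className) throws ReflectiveOperationException {
        // Class lookup and reflective instantiation happen only here.
        Class<? extends SimpleController> cls =
                Class.forName(className).asSubclass(SimpleController.class);
        Constructor<? extends SimpleController> ctor = cls.getDeclaredConstructor();
        this.prototype = ctor.newInstance();
    }

    SimpleController newController() {
        // The prototype is never handed out, so callers cannot mutate it.
        return prototype.clone();
    }
}
```

Each new table would call newController() instead of going through reflection, so the
class lookup and constructor resolution happen once per connection.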

> Constructors that use Configuration may be harmful
> --------------------------------------------------
>                 Key: HBASE-12117
>                 URL: https://issues.apache.org/jira/browse/HBASE-12117
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Andrew Purtell
>         Attachments: traces.client.c.svg, traces.client.getHTable.svg
>
> There's a common pattern in HBase code where in the constructor, or in an initialization
> method also called once per instantiation, or both, we look up values from Hadoop Configuration
> and store them into fields. This can be expensive if the object is frequently created. Configuration
> is a heavyweight registry that does a lot of string operations and regex matching. See attached
> example. Method calls into Configuration account for 48.25% of CPU time when creating the
> HTable object in 0.98. (The remainder is spent instantiating the RPC controller via reflection,
> a separate issue that merits followup elsewhere.) Creation of HTable instances is expected
> to be a lightweight operation if a client is using unmanaged HConnections; however creating
> HTable instances takes up about 18% of the client's total on-CPU time. This is just one example
> where constructors that use Configuration may be harmful.
