hbase-issues mailing list archives

From "Gary Helmling (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-16570) Compute region locality in parallel at startup
Date Thu, 27 Oct 2016 20:39:58 GMT

    [ https://issues.apache.org/jira/browse/HBASE-16570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613146#comment-15613146 ]

Gary Helmling commented on HBASE-16570:

If I'm reading this correctly, this change totally circumvents the block location cache that
was added in HBASE-14473, and calls FileSystem.getFileBlockLocations() for every store file
every time the balancer runs.

In the LoadBalancer.balanceCluster() implementations (in StochasticLoadBalancer, SimpleLoadBalancer),
we create a new Cluster instance.

In Cluster.<init>, we call registerRegion() on every HRegionInfo.

In registerRegion(), we do the following:

Then, back in Cluster.<init> we do a get() on each ListenableFuture in a loop.

So while we are doing the calls to get block locations in parallel with 5 threads, it looks
like we're recomputing them every time balanceCluster() is called and not taking advantage
of the cache at all.  Am I misreading something here?  This seems to be a major performance
regression for clusters with large numbers of regions/store files.
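
To make the concern concrete, here is a minimal standalone sketch of the two patterns, not the actual HBase code. Everything in it is hypothetical: computeLocality() stands in for the expensive FileSystem.getFileBlockLocations() call, the ConcurrentHashMap stands in for the block location cache from HBASE-14473, and plain java.util.concurrent Futures are used in place of Guava's ListenableFuture.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class LocalityCacheSketch {
    // Counts simulated filesystem lookups (hypothetical stand-in for
    // FileSystem.getFileBlockLocations()).
    static final AtomicInteger lookups = new AtomicInteger();

    // Hypothetical cache that outlives a single balancer run.
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical expensive lookup; a real one would hit the NameNode.
    static String computeLocality(String region) {
        lookups.incrementAndGet();
        return "host-for-" + region;
    }

    // Runs one lookup task per region on 5 threads and waits for all results.
    static List<String> runAll(List<String> regions, boolean useCache) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(5);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String region : regions) {
                Callable<String> task = useCache
                    ? () -> cache.computeIfAbsent(region, LocalityCacheSketch::computeLocality)
                    : () -> computeLocality(region);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until that region's lookup finishes
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    // Pattern described above: every run recomputes every region.
    static List<String> balanceWithoutCache(List<String> regions) throws Exception {
        return runAll(regions, false);
    }

    // Alternative: parallel lookups still happen, but only on cache misses.
    static List<String> balanceWithCache(List<String> regions) throws Exception {
        return runAll(regions, true);
    }

    public static void main(String[] args) throws Exception {
        List<String> regions = Arrays.asList("r1", "r2", "r3");

        balanceWithoutCache(regions);
        balanceWithoutCache(regions);
        System.out.println("lookups without cache: " + lookups.get());

        lookups.set(0);
        balanceWithCache(regions);
        balanceWithCache(regions);
        System.out.println("lookups with cache: " + lookups.get());
    }
}
```

In this sketch the uncached path pays one lookup per region on every run, while the cached path pays only on the first run. Parallelism shortens each run, but it does not reduce the number of lookups made across runs.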

> Compute region locality in parallel at startup
> ----------------------------------------------
>                 Key: HBASE-16570
>                 URL: https://issues.apache.org/jira/browse/HBASE-16570
>             Project: HBase
>          Issue Type: Sub-task
>            Reporter: binlijin
>            Assignee: binlijin
>             Fix For: 2.0.0, 1.3.0, 1.4.0
>         Attachments: HBASE-16570-master_V1.patch, HBASE-16570-master_V2.patch, HBASE-16570-master_V3.patch,

This message was sent by Atlassian JIRA