hbase-dev mailing list archives

From Nick Dimiduk <ndimi...@apache.org>
Subject [DISCUSS] Normalizer and pre-split tables
Date Fri, 26 Jun 2020 21:30:03 GMT
Heya,

I've seen a lot of use-cases where the normalizer would be a nice solution
for operators and application developers. I've been trying to beef it up a
bit to handle these cases. However, some of these considerations are at
odds, so I want to vet the ideas here.

The normalizer is a background chore in the HMaster that attempts to
converge region sizes within a table toward the average region size. It has
a pretty wide error bar, but that's the overall goal.
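
To make the "converge toward the average" idea concrete, here is a minimal sketch of one planning pass over a single table. It is an illustration only: the function name and the thresholds (split when a region is more than twice the average, merge two neighbors when their combined size is below the average) are assumptions chosen for the example, not a description of the HMaster's actual code.

```python
from statistics import mean


def plan_normalization(region_sizes_mb):
    """Plan split/merge actions for one table (hypothetical helper).

    region_sizes_mb: region sizes in MB, listed in rowkey order.
    Returns tuples: ("split", i) or ("merge", i, i + 1).
    """
    if not region_sizes_mb:
        return []

    avg = mean(region_sizes_mb)
    actions = []
    i = 0
    while i < len(region_sizes_mb):
        size = region_sizes_mb[i]
        has_next = i + 1 < len(region_sizes_mb)
        if size > 2 * avg:
            # Assumed rule: a region far above the average is a split candidate.
            actions.append(("split", i))
            i += 1
        elif has_next and size + region_sizes_mb[i + 1] < avg:
            # Assumed rule: two small neighbors are a merge candidate.
            actions.append(("merge", i, i + 1))
            i += 2  # both regions are consumed by the merge
        else:
            i += 1
    return actions


if __name__ == "__main__":
    print(plan_normalization([100, 5, 5, 10, 400]))
```

Because the decisions are made relative to the table-wide average, a few very large or very small regions shift the target, which is one way to read the "wide error bar" above.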

Early on, it was observed that an operator needs to pre-split a table, so
special considerations were included, by way of
`hbase.normalizer.min.region.count`,
`hbase.normalizer.merge.min_region_age.days`, and
`hbase.normalizer.merge.min_region_size.mb`. All these knobs are designed to
give an operator a means of controlling this behavior.
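
As a rough sketch of how these three settings could gate a merge, assuming each one acts as a simple lower bound: the `Region` type, the `merge_allowed` function, and the default values below are all hypothetical placeholders for illustration, not taken from HBase.

```python
from dataclasses import dataclass


@dataclass
class Region:
    size_mb: float
    age_days: float


def merge_allowed(region_count, a, b,
                  min_region_count=3,   # stands in for hbase.normalizer.min.region.count
                  min_age_days=3,       # stands in for hbase.normalizer.merge.min_region_age.days
                  min_size_mb=1):       # stands in for hbase.normalizer.merge.min_region_size.mb
    """Return True if regions a and b pass every assumed guard (placeholder defaults)."""
    if region_count < min_region_count:
        return False  # table is too small to be considered for merges
    for region in (a, b):
        if region.age_days < min_age_days:
            return False  # freshly created (e.g. pre-split) regions are left alone
        if region.size_mb < min_size_mb:
            return False  # regions below the size floor are left alone
    return True


if __name__ == "__main__":
    print(merge_allowed(10, Region(50, 10), Region(40, 10)))  # passes all guards
    print(merge_allowed(10, Region(0, 10), Region(0, 10)))    # empty regions rejected
```

Under these assumed semantics, a pair of old but empty regions never qualifies for a merge, because the size floor protects them in the same way it protects a freshly pre-split table.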

We have (what I see as) a competing objective: doing away with empty, or
nearly-empty regions. The use-case is pretty common when there's a TTL
applied to a table, especially if there's also a timestamp component in the
rowkey. In this case, we want the normalizer to "merge away" these empty
regions.

The trouble is we ship defaults for all of the `*min*` configs, and right
now there's no way to "unset" them and disable the functionality, which means
there still isn't a way to support the empty regions use-case without
awkward special-case checks. This is where I'm looking for suggestions from
the community. There's some discussion under way over on the PR for
HBASE-24583. Please take a look.

Thanks in advance,
Nick
