hbase-user mailing list archives

From Dejan Menges <dejan.men...@gmail.com>
Subject Automating major compactions
Date Wed, 08 Jul 2015 13:03:21 GMT

What's the best way to automate major compactions without enabling them
during an off-peak period?

What I have been testing is a simple script that runs on every node in the
cluster and checks whether a major compaction is already running on that
node. If not, it picks one region and runs a major compaction on it.
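
As a rough illustration of the per-node step described above, here is a
minimal sketch in Python. It is not the actual script: the three callables
(is_major_compaction_running, pick_region, request_major_compaction) are
hypothetical hooks that would have to be wired up to whatever the node
really uses to query and trigger compactions.

```python
from typing import Callable, Optional, Sequence


def compact_one_region(
    regions: Sequence[str],
    is_major_compaction_running: Callable[[], bool],
    pick_region: Callable[[Sequence[str]], Optional[str]],
    request_major_compaction: Callable[[str], None],
) -> Optional[str]:
    """Run at most one major compaction on this node.

    All three callables are hypothetical placeholders, not HBase APIs.
    Returns the name of the region that was submitted, or None if
    nothing was done.
    """
    # Skip this round if the node is already busy with a major compaction.
    if is_major_compaction_running():
        return None
    # Nothing to do when the node hosts no regions.
    if not regions:
        return None
    region = pick_region(regions)
    if region is None:
        return None
    request_major_compaction(region)
    return region
```

Run periodically on each node (for example from cron), this keeps at most
one script-triggered major compaction in flight per node.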

It has been running for some time and has helped us get our data into much
better shape, but now I'm no longer sure how to choose which region to
compact. So far, for each node I have been reading
rs-status#regionStoreStats, first choosing the region with the largest
number of storefiles, and then those with the largest storefile sizes.
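
One possible reading of that selection rule, sketched in Python: sort by
storefile count, descending, and use storefile size as the tie-breaker. The
RegionStats record and its field names are assumptions made for
illustration; they are not the format of the rs-status page, which would
need to be parsed separately.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class RegionStats:
    # Hypothetical record; field names are illustrative only.
    name: str
    storefiles: int
    storefile_size_mb: int


def rank_regions(stats: Iterable[RegionStats]) -> List[RegionStats]:
    """Order regions by storefile count, then storefile size, largest first."""
    return sorted(
        stats,
        key=lambda r: (r.storefiles, r.storefile_size_mb),
        reverse=True,
    )


def pick_region(stats: Iterable[RegionStats]) -> Optional[RegionStats]:
    """Return the top-ranked region, or None if there are no regions."""
    ranked = rank_regions(stats)
    return ranked[0] if ranked else None
```

With regions a (3 files, 100 MB), b (5 files, 10 MB) and c (5 files, 50 MB),
this ordering picks c first, then b, then a.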

Is there maybe something more intelligent I could/should do?

Thanks a lot!
