hbase-dev mailing list archives

From Dave Wang <...@cloudera.com>
Subject Stream-of-consciousness notes from HBase BoF presentations
Date Fri, 15 Jun 2012 19:07:47 GMT
Please fill out if I missed anything, thanks.

Schema thoughts - Ian
* Build typed schema into HBase somehow?
** Easier for app developers
* Idea of hashing schema into each row, perhaps have a system-level table
for schema descriptions

Performance tidbits - JD (slides at

Write path:
* Too much write contention (flattening out)
** increase memstore size
** rely only on memstore lower limit, but don't ever hit upper limit if
** HBASE-3484
* Memstore vs. HLog size - make sure hlogs are not forcing flushes
** ideal is hlog.blocksize * maxlogs == just a bit above memstore.lowerLimit
** New balancer in trunk weights read vs. write requests
* Too many regions/families - don't hit the global memstore size
** means global memstore size will already be reached before flush size, so
many small files being flushed (bad)
* write to many families with different data sizes
** flush size is based on region, not family
** HBASE-3149 about fixing this.
* HLog compression (4608)
** breaks replication (5778)
** benefits most when the key-to-value ratio is high (counters), since it
doesn't compress the values
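
The sizing rules in the write-path notes above can be checked with some quick
arithmetic. This is only a sketch in Python: the heap size, lower-limit
fraction, HLog block size, maxlogs and flush size used in the usage note are
illustrative assumptions, not values from the talk, and the function names are
made up for this example.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3


def hlog_capacity_bytes(hlog_blocksize, max_logs):
    """Bytes of HLog data a region server can accumulate before the
    number of logs, rather than memstore pressure, forces flushes."""
    return hlog_blocksize * max_logs


def memstore_lower_limit_bytes(heap_bytes, lower_limit_fraction):
    """Global memstore lower limit in bytes for a given heap."""
    return int(heap_bytes * lower_limit_fraction)


def hlogs_force_flushes(hlog_blocksize, max_logs, heap_bytes, lower_limit_fraction):
    """True when total HLog capacity is below the memstore lower limit,
    i.e. the logs fill up first and force flushes."""
    capacity = hlog_capacity_bytes(hlog_blocksize, max_logs)
    return capacity < memstore_lower_limit_bytes(heap_bytes, lower_limit_fraction)


def regions_before_global_limit(heap_bytes, lower_limit_fraction, flush_size_bytes):
    """How many regions can each fill a memstore to the flush size before
    the global lower limit is reached. With more actively written regions
    than this, flushes are triggered early and produce small files."""
    return memstore_lower_limit_bytes(heap_bytes, lower_limit_fraction) // flush_size_bytes
```

For example, with an assumed 8 GiB heap and a 0.35 lower limit,
hlogs_force_flushes(64 * MIB, 32, 8 * GIB, 0.35) is True (2 GiB of logs is
below the limit), while raising maxlogs to 46 puts the log capacity just above
the limit. Under the same assumptions and a 128 MiB flush size,
regions_before_global_limit(8 * GIB, 0.35, 128 * MIB) gives 22 regions.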

Read path:
* current LRU is sometimes better, sometimes worse.  Other algorithms may
beat this (e.g. 2Q) in some cases.
* make sure working data set fits in block cache of course
* evictions start happening at 85% (DEFAULT_ACCEPTABLE_FACTOR) and evict
down to 75% (DEFAULT_MIN_FACTOR).  Too aggressive.  Try 95% and 90%.
* disable blockcache if you have a highly random read pattern.  disable per
family, not throughout RS, since meta blocks (leaf, bloom) are really hot
and you want BC on for those.
* slabcache - double-caches everything, not flexible with regards to block
sizes, doesn't help that much with GC since by default BC is 25% of heap.
Most heap is used by memstores.
* GC issues tend to be caused by memstores and IPC queues (default 1000
entries, lots of data sitting in there not processed yet sometimes)
See Charlie Hunt's book re: GC walkthrough.  There's 20% or so of the heap
just for HBase internals (compactions, etc.) BTW.
** GC topic is tough for general prescription, since JVMs change, garbage
collectors change.
* Overall heap size: 20-32 GB.
* Turn on bloom filters (row) in 0.92+.  Should be on by default.
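
To see what the eviction thresholds in the read-path notes mean in bytes, here
is a small Python sketch. It assumes the block cache is a fixed fraction of the
heap (25% as noted above) and uses integer percentages to avoid rounding
surprises; the function names and the 8 GiB heap in the usage note are
assumptions made for this example.

```python
def block_cache_bytes(heap_bytes, cache_percent=25):
    """Block cache size for a heap, assuming a fixed percentage of heap."""
    return heap_bytes * cache_percent // 100


def eviction_window(cache_bytes, acceptable_percent, min_percent):
    """Return (start, target, freed) in bytes: eviction starts once the
    cache holds `start` bytes and runs until it is down to `target`,
    freeing `freed` bytes per eviction run."""
    start = cache_bytes * acceptable_percent // 100
    target = cache_bytes * min_percent // 100
    return start, target, start - target


def compare_windows(heap_bytes):
    """Eviction windows for the 85/75 thresholds and the suggested 95/90."""
    cache = block_cache_bytes(heap_bytes)
    return {
        "85/75": eviction_window(cache, 85, 75),
        "95/90": eviction_window(cache, 95, 90),
    }
```

With an assumed 8 GiB heap, compare_windows(8 * 1024 ** 3) shows the 95/90
setting letting the cache fill further before evicting and dropping roughly
half as many bytes per run as 85/75.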

Metrics and UI - Elliott - slides at
* per-region metrics in 0.94
** on by default but need to turn on NullContextWithUpdateThread
** There's a patch to tcollector for OpenTSDB
* JMX JSON: /jmx web page.  Info server needs to be running
* New UI for master in trunk.
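
As a rough illustration of consuming the /jmx page, the Python sketch below
filters beans out of a JSON document. It assumes the page returns an object
with a top-level "beans" array whose entries carry a "name" field; the sample
payload, the "example:..." bean name and its attribute are hypothetical and
only stand in for real output.

```python
import json

# Hypothetical sample of what a /jmx response might look like.
SAMPLE_JMX = """
{"beans": [
  {"name": "example:service=RegionServer,name=Stats", "exampleRequests": 42},
  {"name": "java.lang:type=Memory", "HeapMemoryUsage": {"used": 123, "max": 456}}
]}
"""


def find_beans(jmx_json_text, name_fragment):
    """Return every bean whose "name" contains name_fragment."""
    doc = json.loads(jmx_json_text)
    return [
        bean
        for bean in doc.get("beans", [])
        if name_fragment in bean.get("name", "")
    ]


def bean_attribute(jmx_json_text, name_fragment, attribute):
    """Return `attribute` from the first matching bean, or None if absent."""
    matches = find_beans(jmx_json_text, name_fragment)
    return matches[0].get(attribute) if matches else None
```

In practice the text would come from an HTTP GET against the info server
rather than from a string constant; bean_attribute(SAMPLE_JMX, "type=Memory",
"HeapMemoryUsage") shows the lookup against the sample.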

Region load - Eugene
* Uses d3 - Javascript library for making charts - used this to create bar
graphs for regions/RS load
* Streaming data possible - moving graphs with this

HBase integration testing - Andy
* Bigtop - Do for Hadoop what Debian did for Linux
* gives you build and packaging infrastructure with RPM and DEBs
* Deployment - puppet
* Integration test - iTest
* iTest has helper functions, shell-outs for test scripting
* HBASE-6201 - mvn verify -> iTest -> MiniCluster and/or actual cluster
* Automate the RC testing matrix into this
* Add long-running ingestion tests and chaos monkeys - LoadTestTool, YCSB,
GoraCI, Gremlins
* Add full system application test cases, Kerberos setup/usage helper
functions to iTest

Benchpress - Marshall
* Programmatic load ingestion tool
* posting JSON to HTTP
* unzip tarball, run shell script
* Open source soon with Apache license

Snapshots - Jesse
* HBASE-6255
* Discussion about wall-time based or timestamp-based: now will do
non-blocking timestamp-based snapshots.
* HDFS-side implementation of snapshots - do hardlinks for high-level stuff
that isn't just files
* Looking to get this into 0.96
* Read-only or read-write?
* Still defining admin interface, etc.

Releases - Jon
* Time-based vs. feature-based releases?  What do we stop the train for?
* What else in 0.96 that we would hold the train for?  ACLv2?  Metrics2?
* Should we have some sort of minimum test standard?  Open to talking about
what goes into this.
* Performance should be in the same range.  Need to hash out what this
should look like in terms of testing.
