hadoop-common-commits mailing list archives

From Apache Wiki <wikidi...@apache.org>
Subject [Hadoop Wiki] Trivial Update of "Hbase/FAQ" by stack
Date Mon, 28 Jan 2008 16:12:17 GMT
Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:

The comment on the change is:
mapfile math

  Currently Hbase is a file handle glutton.  Running an Hbase loaded w/ more than a few regions,
it's possible to blow past the common default limit of 1024 file handles for the user running the
process.  Running out of file handles is like an OOME: things start to fail in strange ways.
 To raise the user's file handle limit, edit '''/etc/security/limits.conf''' on all nodes and restart
your cluster.
+ The math runs roughly as follows: Per column family, there is at least one mapfile and possibly
up to 5 or 6 if a region is under load (let's say 3 per column family on average).  Multiply
by the number of column families and by the number of regions per region server.  So, for example,
say you have a schema of 3 column families per region and that you have 100 regions per
regionserver: the JVM will open 3 * 3 * 100 mapfiles -- 900 file descriptors, not counting open
jar files, conf files, etc. (Run 'lsof -p REGIONSERVER_PID' to see for sure).
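The arithmetic above can be sketched as a small Python helper. This is only a back-of-the-envelope estimate, not anything hbase ships: the function names are made up for illustration, the default of 3 mapfiles per column family is the rough average quoted above, and 1024 is the common default limit mentioned earlier.

```python
def estimate_mapfile_handles(column_families, regions_per_server, mapfiles_per_family=3):
    """Rough count of mapfile descriptors one regionserver JVM holds open.

    mapfiles_per_family defaults to the assumed average of 3; a region
    under load may have 5 or 6. Jar files, conf files, and sockets are
    not counted here.
    """
    return mapfiles_per_family * column_families * regions_per_server


def headroom(estimate, limit=1024):
    """Descriptors left under the per-user limit (negative means over)."""
    return limit - estimate


if __name__ == "__main__":
    est = estimate_mapfile_handles(column_families=3, regions_per_server=100)
    print("estimated mapfile descriptors:", est)
    print("headroom under a 1024 limit:", headroom(est))
```

With the example schema this gives 900 descriptors, leaving little room under a 1024 limit once everything else the JVM opens is counted.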
  '''6. [[Anchor(6)]] What can I do to improve hbase performance?'''
  A configuration that can help with random reads at some cost in memory is making the '''hbase.io.index.interval'''
smaller.  By default when hbase writes store files, it adds an entry to the mapfile index
on every 32nd addition (for hadoop, the default is every 128th addition).  Adding entries more
frequently -- every 16th or every 8th -- means less seeking around looking
for the wanted entry, but at the cost of hbase carrying a larger index (indices are read
into memory on mapfile open; by default there are one to five or so mapfiles per column family
per region loaded into a regionserver).
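To get a feel for the memory side of this tradeoff, here is a rough Python sketch that counts index entries for a given interval. It assumes a simplified model in which one index entry is written for every interval-th key added; the function name and the one-million-key figure are illustrative only, and per-entry memory cost depends on key size, which is not modeled.

```python
def index_entries(num_keys, interval):
    """Approximate number of index entries for a mapfile holding num_keys
    keys when one entry is indexed every `interval` additions."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    # Ceiling division without floating point.
    return -(-num_keys // interval)


if __name__ == "__main__":
    keys = 1_000_000  # hypothetical mapfile size
    for interval in (128, 32, 16, 8):
        print(interval, index_entries(keys, interval))
```

Under this model, halving the interval doubles the number of entries held in memory per open mapfile, so going from 32 to 8 carries roughly four times the index.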
