hbase-user mailing list archives

From: Stack <st...@duboce.net>
Subject: Re: More on Large scale Scans...
Date: Wed, 28 Jul 2010 18:45:48 GMT
On Wed, Jul 28, 2010 at 10:30 AM, Vidhyashankar Venkataraman
<vidhyash@yahoo-inc.com> wrote:
> I reran my Scans on a 180-node cluster (with 160 RS's/data nodes/task trackers) using
> TableInputFormat with block cache disabled, and I got the following results/observations:
>
>  1.  90% of the job completed in roughly 5 and a half hours, which comes to 29 MBps
>      per node (counting 160 RS's).. Is this reasonable? Can I get better results?
>


29 MB/s doesn't sound too bad.

How many disks per node?  What do you get if you measure disk
throughput directly at the system level?
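
(Quick arithmetic check, assuming the ~100TB table you mention below and
90% of it read: 0.9 x 100TB / (160 nodes x 5.5 hours x 3600 s/hour) works
out to roughly 28 MB/s per node, so the reported figure is at least
internally consistent.)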



>    2.  The last 10% of the job has still been running for more than 3 hours.. I am
> trying to find some common thread of reasoning for why this is happening.
>

So, what's happening?  Is this the reduce phase?  In the reduce, are you
loading hbase?  Is it the loading of hbase that is taking the time?  Are
tasks failing?  If so, which ones?  Can you figure out which part of the
table they are hitting when they fail?


>    3. I tried disabling the block cache in the MR job (I was using TableMapReduceUtil:
> see the code snippet below) and through the hbase shell, just to change the default options
> (disable the table and change the option using alter)..
>
>  conf.set(TableInputFormat.SCAN_CACHEBLOCKS, "false");
>    Job job = new Job(conf, "Scan" + "_" + tableName);
>    ..........
>    Scan scan = new Scan();
>    scan.addColumns("bigColumn");
>    scan.setCaching(10);
>    scan.setCacheBlocks(false);    // Setting it again.. I guess this will override the previous setting..
>    // Second argument is the table name.
>    TableMapReduceUtil.initTableMapperJob(tableName, scan,
>      MapperHB.class, Text.class, NullWritable.class, job);
>
> But I still get LruBlockcache stats like the following ones: Is this okay?
>
> 2010-07-27 14:13:40,852 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1304510, Hit=1914, Miss=1302596, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.1467217574827373%, Miss Ratio=99.8532772064209%, Evicted/Run=238.94393920898438
> 2010-07-27 14:14:40,860 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1306103, Hit=1914, Miss=1304189, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14654280385002494%, Miss Ratio=99.85345602035522%, Evicted/Run=238.94393920898438
> 2010-07-27 14:15:40,853 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1307783, Hit=1914, Miss=1305869, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14635456027463078%, Miss Ratio=99.8536467552185%, Evicted/Run=238.94393920898438
> ................
> .................


Block cache is global to the regionserver.  Is any other table being
accessed outside of this job (e.g. the .META. or -ROOT- tables)?

The above shows the LRU being accessed about 2k times over the course of
a minute, almost all misses.  If the above is from the host carrying
.META., that's probably OK.  If it's from some other host, and no other
table is being accessed, your efforts at disabling block caching do not
seem to be having an effect.
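
One quick sanity check (just a sketch built from the same classes as your
snippet, nothing beyond that): print back what actually ended up on the
Scan object and in the job configuration before you submit, e.g.

    // Sketch: confirm block caching is really off on both the Scan and
    // the job configuration before calling initTableMapperJob.
    Scan scan = new Scan();
    scan.addColumns("bigColumn");
    scan.setCaching(10);
    scan.setCacheBlocks(false);
    System.out.println("scan.getCacheBlocks() = " + scan.getCacheBlocks());
    System.out.println(TableInputFormat.SCAN_CACHEBLOCKS + " = "
        + conf.get(TableInputFormat.SCAN_CACHEBLOCKS));

If both come back false, then at least the client-side settings are what
you intend them to be.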


> 2010-07-27 22:45:40,853 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1348427, Hit=1914, Miss=1346513, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14194316463544965%, Miss Ratio=99.85805749893188%, Evicted/Run=238.94393920898438
> 2010-07-27 22:46:40,854 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1348427, Hit=1914, Miss=1346513, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14194316463544965%, Miss Ratio=99.85805749893188%, Evicted/Run=238.94393920898438
> 2010-07-27 22:47:40,853 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1348427, Hit=1914, Miss=1346513, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14194316463544965%, Miss Ratio=99.85805749893188%, Evicted/Run=238.94393920898438
> 2010-07-27 22:48:40,853 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1348427, Hit=1914, Miss=1346513, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14194316463544965%, Miss Ratio=99.85805749893188%, Evicted/Run=238.94393920898438
> 2010-07-27 22:49:40,876 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1348659, Hit=1914, Miss=1346745, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14191875234246254%, Miss Ratio=99.8580813407898%, Evicted/Run=238.94393920898438
> 2010-07-27 22:50:41,140 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1349089, Hit=1914, Miss=1347175, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.1418735133484006%, Miss Ratio=99.85812902450562%, Evicted/Run=238.94393920898438
>

Hmm... on the other hand, not much going on in the block cache -- not
enough to worry about... numbers are staying pretty constant over
time.


>
>    4.   A small number of region servers (8 out of 150) crashed (more of a graceful
> shutdown) while performing the scan, and I saw these messages just before the shutdown started:
>
> 2010-07-27 22:52:57,448 WARN org.apache.zookeeper.ClientCnxn: Exception closing session 0x2a0190c221002c to sun.nio.ch.SelectionKeyImpl@286e4365
> java.io.IOException: TIMED OUT
>        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:906)
> 2010-07-27 22:52:57,454 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1349486, Hit=1914, Miss=1347572, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14183177845552564%, Miss Ratio=99.85817074775696%, Evicted/Run=238.94393920898438
> 2010-07-27 22:52:57,454 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=1950.0773MB (2044804168), Free=449.7541MB (471601336), Max=2399.8313MB (2516405504), Counts: Blocks=1914, Access=1349486, Hit=1914, Miss=1347572, Evictions=3300, Evicted=788515, Ratios: Hit Ratio=0.14183177845552564%, Miss Ratio=99.85817074775696%, Evicted/Run=238.94393920898438
> 2010-07-27 22:52:57,548 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call next(-30527588457879943, 30) from 98.136.191.88:33914: output error
> 2010-07-27 22:52:57,548 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call next(-7358804533352185527, 30) from 98.136.191.88:33963: output error
> 2010-07-27 22:52:57,602 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call next(-3402818986992405350, 30) from 98.136.191.88:33927: output error
> 2010-07-27 22:52:57,603 WARN org.apache.hadoop.ipc.HBaseServer: IPC Server Responder, call next(-6635348671997204137, 30) from 98.136.191.88:33916: output error
> 2010-07-27 22:52:57,603 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: Got ZooKeeper event, state: Disconnected, type: None, path: null
> 2010-07-27 22:52:57,603 INFO org.apache.hadoop.ipc.HBaseServer: IPC Server handler 21 on 60020 caught: java.nio.channels.ClosedChannelException
> ....................
> .....................
> 2010-07-27 22:52:58,059 WARN org.apache.hadoop.hbase.regionserver.HRegionServer: Attempt=1
> org.apache.hadoop.hbase.Leases$LeaseStillHeldException
>


You should up your ZooKeeper session timeout given the type of job you
are running.  Set the session timeout in hbase-site.xml.  See:

  <property>
    <name>zookeeper.session.timeout</name>
    <value>60000</value>
    <description>ZooKeeper session timeout.
      HBase passes this to the zk quorum as suggested maximum time for a
      session.  See
http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
      "The client sends a requested timeout, the server responds with the
      timeout that it can give the client. "
      In milliseconds.
    </description>
  </property>

Read that URL.  It says "The current implementation requires that the
timeout be a minimum of 2 times the tickTime (as set in the server
configuration) and a maximum of 20 times the tickTime.", so you need
to up hbase.zookeeper.property.tickTime too, from its current
setting of '3'.  Set it to 6 or 9 even.
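
As a rough sketch of what that could look like in hbase-site.xml
(illustrative values only; note that a plain ZooKeeper tickTime is
expressed in milliseconds, so double-check the unit your hbase-default.xml
uses before copying this):

  <property>
    <name>zookeeper.session.timeout</name>
    <value>120000</value>
    <description>Illustrative: a 2 minute session timeout, in milliseconds.
      Must be no more than 20 x tickTime.</description>
  </property>
  <property>
    <name>hbase.zookeeper.property.tickTime</name>
    <value>6000</value>
    <description>Illustrative: 6 seconds, in milliseconds, so that
      20 x tickTime = 120000 ms covers the session timeout above.</description>
  </property>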

St.Ack


> Thanks in advance,
> Vidhya
>
> On 7/26/10 2:43 PM, "Vidhyashankar Venkataraman" <vidhyash@yahoo-inc.com> wrote:
>
> I am trying to assess the performance of Scans on a 100TB db on 180 nodes running HBase
> 0.20.5..
>
> I run a sharded scan (each Map task runs a scan on a specific range; speculative execution
> is turned off so that there is no duplication of tasks) on a fully compacted table...
>
> 1 MB block size, block cache enabled.. Max of 2 tasks per node..  Each row is 30 KB
> in size: 1 big column family with just one field..
> Region lease timeout is set to an hour.. And I don't get any socket timeout exceptions,
> so I have not reassigned the write socket timeout...
>
> I ran experiments on the following cases:
>
>  1.  The client level cache is set to 1 (default: got the number using getCaching):
>      The MR tasks take around 13 hours to finish on average, which gives around 13.17 MBps
>      per node. The worst case is 34 hours (to finish the entire job)...
>  2.  Client cache set to 20 rows: this is much worse than the previous case: we get
>      around a super low 1 MBps per node...
>
>         Question: Should I set it to a value such that the block size is a multiple
>         of the above said cache size? Or the cache size to a much lower value?
>
> I find that these numbers are much less than the ones I get when it's running with just
> a few nodes..
>
> Can you guys help me with this problem?
>
> Thank you
> Vidhya
>
>
