hbase-user mailing list archives

From "Jean-Daniel Cryans" <jdcry...@gmail.com>
Subject Re: HBase went down during insertion test
Date Mon, 18 Aug 2008 12:32:28 GMT
Jay,

Here are a few questions/tips.

What do you mean by "1 HBase" in your test environment? The main components
are the Master and the RegionServers.

You are using 0.2.0 and you checked that it worked with HQL? HQL was dropped
in that version; do you mean that you used the Ruby-based shell?

I see you checked your status with a lot of tools, but have you checked
Hadoop and HBase logs? Can we see them?

You don't have to sleep between inserts; HBase will sleep on its own inside
the client if it needs to.
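
For what it's worth, here is a minimal sketch of an unthrottled loader using
the same 0.2.0 client calls as your program (your "jstable" table and col0:
family reused; the class name and the 100,000-row count are just for
illustration):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.BatchUpdate;

public class NoSleepLoader {
    public static void main(String[] args) throws Exception {
        HBaseConfiguration config = new HBaseConfiguration();
        HTable table = new HTable(config, "jstable");
        for (int i = 0; i < 100000; i++) {
            BatchUpdate update = new BatchUpdate("tk" + i);
            update.put("col0:", "val0".getBytes());
            // commit() blocks in the client if the server needs time;
            // no Thread.sleep() in between.
            table.commit(update);
        }
    }
}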

I'm a little confused about your OS setup and your goal. You said you wanted
to test performance, but you are using VMs; isn't that contradictory? Also,
you're inserting keys sequentially; because row keys are kept sorted
throughout the cluster, that means your inserts will always hit a single
machine.
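
If you want to see the writes spread over more than one region server once
the table has split, one trick (a hypothetical variation, nothing HBase
requires) is to salt the key so consecutive values are no longer adjacent in
sort order, e.g. replacing the key construction in the sketch above:

// Salt the sequential key "tk<i>" so rows scatter across regions instead of
// always landing on the last region.
String salt = Integer.toHexString(("tk" + i).hashCode());
BatchUpdate update = new BatchUpdate(salt + "-tk" + i);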

Thank you for having a look at these.

J-D

On Mon, Aug 18, 2008 at 7:39 AM, 김재현 <everydayminder@gmail.com> wrote:

> Hello, everyone.
>
> I tried to insert 100,000,000 rows into an HBase table; however, HBase
> stops working after it has inserted about 270,000 rows. How can I insert
> 100,000,000 rows of data without errors? Is there an important
> configuration setting that I'm missing?
>
> (I set up HDFS with 1 master and 1 slave as Michael G. Noll posted, and my
> HBase is running on top of that HDFS.
> I've tested that it works with HQL and some simple programs, after checking
> their log files.)
>
> 1. Test environment
> * OS : Ubuntu 7.10
> * Virtual Machines (VMware)
> - 1 HBase
> - 1 master and 3 slaves for HDFS (at first, I started the test with 1
> master and 1 slave for HDFS)
> * Hadoop version : 0.17.1
> * HBase version : 0.2.0
> * Application Options
> - HEAP_SIZE = 2000M
> - JAVA_HEAP_MAX = 2000M
>
> 2. What I wanted to test
> * how long it takes when inserting 10,000 rows of data
> * how long it takes when inserting 100,000 rows of data
> * how long it takes when inserting 100,000,000 rows of data
>
> In the middle of the tests, after processing about 210,000~370,000 out of
> 100,000,000 rows, it froze.
> All HBase and Hadoop processes were still running and accepted TCP
> connections, but they no longer worked properly.
>
>
> 3. What I've checked
> * Out of memory error
> - No error occurred
> * Number of sockets
> - I checked the number of open sockets with the netstat command and it
> seemed normal. However, when it hung, the receive and send queue counts
> stopped changing.
> * TCP responses from the processes
> - I tried to connect to the HBase server with the telnet command. The HBase
> server accepted TCP connections even when it was frozen.
> * Any other error messages from the processes
> - Error messages such as timeouts and socket errors showed up some time
> later.
>
> 4. How I evaluated the performance
> - a single-threaded program based on the example introduced on the HBase
> wiki.
> - iterate the insert operation 100,000,000 times
>
> // HBase 0.2.0 client imports (adjust packages if your build differs);
> // Util.tlog() and Util.sleep() are my own helpers.
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.client.HTable;
> import org.apache.hadoop.hbase.io.BatchUpdate;
>
> HBaseConfiguration config = new HBaseConfiguration();
> HTable table = new HTable(config, "jstable");
> String header = "tk";
> String key = null;
> BatchUpdate batchUpdate = null;
>
> for (int i = 0; i < 100000000; i++) {
>     key = header + String.valueOf(i);   // row keys tk0, tk1, tk2, ...
>     batchUpdate = new BatchUpdate(key);
>     if (i % 10000 == 0) Util.tlog("$ trace : " + String.valueOf(i));
>     // nine cells, one per column family col0: .. col8:
>     batchUpdate.put("col0:", "val0".getBytes());
>     batchUpdate.put("col1:", "val1".getBytes());
>     batchUpdate.put("col2:", "val2".getBytes());
>     batchUpdate.put("col3:", "val3".getBytes());
>     batchUpdate.put("col4:", "val4".getBytes());
>     batchUpdate.put("col5:", "val5".getBytes());
>     batchUpdate.put("col6:", "val6".getBytes());
>     batchUpdate.put("col7:", "val7".getBytes());
>     batchUpdate.put("col8:", "val8".getBytes());
>     table.commit(batchUpdate);          // one RPC per row
>     Util.sleep(50);                     // Thread.sleep(50)
> }
>
> - put some delay after each insert
> The Thread.sleep() call doesn't make a difference anyway. At first, I didn't
> have the Thread.sleep() call there. (Should I use it when I insert lots of
> data in a row?)
>
> 5. Additional Test
> * After the failed insertion test of 100,000,000 rows of data, I repeated
> inserting 100,000 rows of data several times.
> And the HBase processes went frozen again after inserting about
> 200,000~250,000 rows of data.
>
> * My HDFS started with 1 master and 1 slave. I tried the same experiments
> with 1 master and 3 slaves, but the results were similar.
>
>
> Jay.
>