hbase-user mailing list archives

From "Jinsong Hu" <jinsong...@hotmail.com>
Subject Re: ideas to improve throughput of the hbase writing
Date Wed, 09 Jun 2010 20:34:16 GMT
I checked the log; there are lots of entries like:

2010-06-09 17:26:36,736 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262: memstore size 128.1m is >= than blocking 128.0m size

Then after that there are lots of:

2010-06-09 17:26:36,800 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://namenodes1.cloud.ppops.net:8020/hbase/Spam_MsgEventTable/376337880/message_compound_terms/7606939244559826252, entries=30869, sequenceid=8350447892, memsize=7.2m, filesize=3.4m to Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6


Then lots of:

2010-06-09 17:26:39,005 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262 'IPC Server handler 8 on 60020'



This cycle happens again and again in the log. What can I do in this case to
speed up writing? Right now the writing speed is really slow, close to 4
rows/second per regionserver.
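
If I read the defaults right, that 128.0m blocking threshold is
hbase.hregion.memstore.flush.size (64 MB by default) multiplied by
hbase.hregion.memstore.block.multiplier (default 2). A sketch of how raising
the multiplier would look in hbase-site.xml (the value 4 is only an example,
not a tested recommendation):

<property>
  <name>hbase.hregion.memstore.block.multiplier</name>
  <!-- example only: block updates at 4 x flush size (256 MB) instead of 2 x (128 MB) -->
  <value>4</value>
</property>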

I checked the code to find out why there are so many store files, and I
noticed that each second, when the regionserver reports to the master, it
triggers a memstore flush and writes a store file.

The parameter hbase.regionserver.msginterval has a default value of 1 second;
I am thinking of changing it to 10 seconds. Can that help? I am also thinking
of changing hbase.hstore.blockingStoreFiles to 1000. I noticed that there is a
parameter hbase.hstore.blockingWaitTime with a default value of 1.5 minutes;
once the 1.5 minutes is reached, the compaction is executed. I am fine with
running compaction every 1.5 minutes, but running compaction every second and
driving CPU consistently higher than 100% is not what I want.
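
For concreteness, the combination I am considering would look like this in
hbase-site.xml (the values are what I intend to try, assuming msginterval is
given in milliseconds; not a tested recommendation):

<property>
  <name>hbase.regionserver.msginterval</name>
  <!-- report to the master every 10 seconds instead of every second -->
  <value>10000</value>
</property>
<property>
  <name>hbase.hstore.blockingStoreFiles</name>
  <!-- let many store files accumulate before blocking memstore flushes -->
  <value>1000</value>
</property>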

Any suggestions on what parameters to change to improve my writing speed?

Jimmy




--------------------------------------------------
From: "Ryan Rawson" <ryanobjc@gmail.com>
Sent: Wednesday, June 09, 2010 1:01 PM
To: <user@hbase.apache.org>
Subject: Re: ideas to improve throughput of the hbase writing

> The log will say something like "blocking updates to..." when you hit
> a limit.  That log you indicate is just the regionserver attempting to
> compact a region, but shouldn't prevent updates.
>
> What else does your logfile say?  Search for the string (case
> insensitive) "blocking updates"...
>
> -ryan
>
> On Wed, Jun 9, 2010 at 11:52 AM, Jinsong Hu <jinsong_hu@hotmail.com> 
> wrote:
>>
>> I made this change
>> <property>
>>  <name>hbase.hstore.blockingStoreFiles</name>
>>  <value>15</value>
>> </property>
>>
>> the system is still slow.
>>
>> Here is the most recent value for the region :
>> stores=21, storefiles=186, storefileSizeMB=9681, memstoreSizeMB=128,
>> storefileIndexSizeMB=12
>>
>>
>> And the same log still happens:
>>
>> 2010-06-09 18:36:40,577 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region SOME_ABCEventTable,2010-06-09 09:56:56\x093dc01b4d2c4872963717d80d8b5c74b1,1276107447570 has too many store files, putting it back at the end of the flush queue.
>>
>> One idea that I have now is to further increase
>> hbase.hstore.blockingStoreFiles to a very high number, such as 1000.
>> What is the negative impact of this change?
>>
>>
>> Jimmy
>>
>>
>> --------------------------------------------------
>> From: "Ryan Rawson" <ryanobjc@gmail.com>
>> Sent: Monday, June 07, 2010 3:58 PM
>> To: <user@hbase.apache.org>
>> Subject: Re: ideas to improve throughput of the hbase writing
>>
>>> Try setting this config value:
>>>
>>> <property>
>>>  <name>hbase.hstore.blockingStoreFiles</name>
>>>  <value>15</value>
>>> </property>
>>>
>>> and see if that helps.
>>>
>>> The reasoning behind having only one compaction thread is that the
>>> scarce resource being preserved in this case is cluster IO.  People
>>> have had issues with compaction IO being too heavy.
>>>
>>> In your case, this setting can let the regionserver build up more
>>> store files without pausing your import.
>>>
>>> -ryan
>>>
>>> On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu <jinsong_hu@hotmail.com> 
>>> wrote:
>>>>
>>>> Hi, there:
>>>>  While saving lots of data to hbase, I noticed that the regionserver
>>>> CPU went to more than 100%. Examination shows that the hbase
>>>> CompactSplit thread is spending all its time compacting/splitting
>>>> hbase store files. The machine I have is an 8-core machine; because
>>>> there is only one compact/split thread in hbase, only one core is
>>>> fully used.
>>>>  I continue to submit map/reduce jobs to insert records into hbase.
>>>> Most of the time a job runs very fast, around 1-5 minutes, but
>>>> occasionally it can take 2 hours. That is very bad for me. I highly
>>>> suspect that the occasional slow insertion is related to the
>>>> insufficient speed of the compactsplit thread.
>>>>  I am thinking that I should parallelize the compactsplit thread: in
>>>> the code quoted at the end of this mail, the for loop "for (Store
>>>> store: stores.values())" could be parallelized via java 5's thread
>>>> pool, so that multiple cores are used instead of only one. I wonder
>>>> if this will help to increase the throughput.
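>>>>
>>>>  A rough sketch of the idea (hypothetical and untested; Store,
>>>> stores, majorCompaction, maxSize and splitRow are the names from the
>>>> method quoted at the end of this mail, the pool size of 4 is
>>>> arbitrary, and checked-exception handling for Future.get() is
>>>> omitted):
>>>>
>>>>  // needs java.util.ArrayList, java.util.List and java.util.concurrent.*
>>>>  ExecutorService pool = Executors.newFixedThreadPool(4);
>>>>  List<Future<Store.StoreSize>> results =
>>>>      new ArrayList<Future<Store.StoreSize>>();
>>>>  for (final Store store : stores.values()) {
>>>>    // each store compacts on its own worker thread
>>>>    results.add(pool.submit(new Callable<Store.StoreSize>() {
>>>>      public Store.StoreSize call() throws IOException {
>>>>        return store.compact(majorCompaction);
>>>>      }
>>>>    }));
>>>>  }
>>>>  // reduce the results the same way the serial loop does
>>>>  for (Future<Store.StoreSize> f : results) {
>>>>    Store.StoreSize ss = f.get(); // blocks until that compaction finishes
>>>>    if (ss != null && ss.getSize() > maxSize) {
>>>>      maxSize = ss.getSize();
>>>>      splitRow = ss.getSplitRow();
>>>>    }
>>>>  }
>>>>  pool.shutdown();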
>>>>
>>>>  Somebody mentioned that I can increase the region size so that I
>>>> don't do so many compactions under heavy writing. Does anybody have
>>>> experience showing that it helps?
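>>>>
>>>>  If it matters for suggestions: the region size knob I assume is
>>>> meant here is hbase.hregion.max.filesize (default 256 MB). A sketch
>>>> of the change, with 1 GB as an arbitrary example value:
>>>>
>>>> <property>
>>>>   <name>hbase.hregion.max.filesize</name>
>>>>   <!-- example only: split regions at 1 GB instead of the 256 MB default -->
>>>>   <value>1073741824</value>
>>>> </property>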
>>>>
>>>> Jimmy.
>>>>
>>>>
>>>>
>>>>  byte [] compactStores(final boolean majorCompaction)
>>>>  throws IOException {
>>>>    if (this.closing.get() || this.closed.get()) {
>>>>      LOG.debug("Skipping compaction on " + this + " because closing/closed");
>>>>      return null;
>>>>    }
>>>>    splitsAndClosesLock.readLock().lock();
>>>>    try {
>>>>      byte [] splitRow = null;
>>>>      if (this.closed.get()) {
>>>>        return splitRow;
>>>>      }
>>>>      try {
>>>>        synchronized (writestate) {
>>>>          if (!writestate.compacting && writestate.writesEnabled) {
>>>>            writestate.compacting = true;
>>>>          } else {
>>>>            LOG.info("NOT compacting region " + this +
>>>>                ": compacting=" + writestate.compacting +
>>>>                ", writesEnabled=" + writestate.writesEnabled);
>>>>            return splitRow;
>>>>          }
>>>>        }
>>>>        LOG.info("Starting" + (majorCompaction? " major " : " ") +
>>>>            "compaction on region " + this);
>>>>        long startTime = System.currentTimeMillis();
>>>>        doRegionCompactionPrep();
>>>>        long maxSize = -1;
>>>>        for (Store store: stores.values()) {
>>>>          final Store.StoreSize ss = store.compact(majorCompaction);
>>>>          if (ss != null && ss.getSize() > maxSize) {
>>>>            maxSize = ss.getSize();
>>>>            splitRow = ss.getSplitRow();
>>>>          }
>>>>        }
>>>>        doRegionCompactionCleanup();
>>>>        String timeTaken =
>>>>            StringUtils.formatTimeDiff(System.currentTimeMillis(), startTime);
>>>>        LOG.info("compaction completed on region " + this + " in " + timeTaken);
>>>>      } finally {
>>>>        synchronized (writestate) {
>>>>          writestate.compacting = false;
>>>>          writestate.notifyAll();
>>>>        }
>>>>      }
>>>>      return splitRow;
>>>>    } finally {
>>>>      splitsAndClosesLock.readLock().unlock();
>>>>    }
>>>>  }
>>>>
>>>
>>
> 
