hbase-user mailing list archives

From "Jinsong Hu" <jinsong...@hotmail.com>
Subject Re: ideas to improve throughput of the base writting
Date Wed, 09 Jun 2010 23:58:57 GMT
My hardware has 2 disks. I did a file copy on the machine and found that I
can get 300 MB/second.

At this time, I see my insertion rate is less than 1K rows/second, and my row
size is about 10K bytes, so in terms of disk writing, my record insertion rate
is far below the hardware limit.

If on your i7-based servers you are doing 200K rows/sec and each row is 200
bytes, then you are doing about 40 MB/sec.

In my case, if it behaves normally, I can get 100 rows/sec * 10K bytes = 1
MB/sec; that is far from the disk speed. Occasionally I see 1K rows/second,
which is more reasonable in my case, but I rarely get that performance.

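For reference, here is a minimal sketch of the kind of batched write loop I
mean (this assumes the 0.20-era HTable client API; the table and family names,
the row size, and the buffer size are only illustrative, not my exact code):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class WriteThroughputSketch {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(new HBaseConfiguration(), "Spam_MsgEventTable");
    table.setAutoFlush(false);                  // buffer puts on the client side
    table.setWriteBufferSize(2 * 1024 * 1024);  // send to the server every ~2 MB

    byte[] family = Bytes.toBytes("message_compound_terms"); // a family seen in the logs
    byte[] value = new byte[10 * 1024];                      // ~10K byte rows, as in my case
    int rows = 10000;
    long start = System.currentTimeMillis();
    for (int i = 0; i < rows; i++) {
      Put put = new Put(Bytes.toBytes("row-" + i));
      put.add(family, Bytes.toBytes("q"), value);
      table.put(put);
    }
    table.flushCommits();                       // push whatever is still buffered
    long ms = Math.max(1, System.currentTimeMillis() - start);
    // rows/sec * bytes/row = bytes/sec, the number that really matters for the disks
    System.out.println((rows * 1000L / ms) + " rows/sec, "
        + (rows * (long) value.length * 1000L / ms / (1024 * 1024)) + " MB/sec");
  }
}

With autoFlush left at its default of true, every put becomes its own RPC,
which can also cap rows/sec well below what the disks could take.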
Even worse, with the change made, I now see lots of compaction failures:

2010-06-09 23:40:55,117 ERROR org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction failed for region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527
java.lang.RuntimeException: java.io.IOException: Could not find target position 65588
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:61)
        at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:79)
        at org.apache.hadoop.hbase.regionserver.MinorCompactingStoreScanner.next(MinorCompactingStoreScanner.java:96)
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:920)
        at org.apache.hadoop.hbase.regionserver.Store.compact(Store.java:764)
        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:832)
        at org.apache.hadoop.hbase.regionserver.HRegion.compactStores(HRegion.java:785)
        at org.apache.hadoop.hbase.regionserver.CompactSplitThread.run(CompactSplitThread.java:93)
Caused by: java.io.IOException: Could not find target position 65588
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchBlockAt(DFSClient.java:1556)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1666)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1780)
        at java.io.DataInputStream.read(DataInputStream.java:132)
        at org.apache.hadoop.hbase.io.hfile.BoundedRangeFileInputStream.read(BoundedRangeFileInputStream.java:105)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:100)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.decompress(HFile.java:1018)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader.readBlock(HFile.java:966)
        at org.apache.hadoop.hbase.io.hfile.HFile$Reader$Scanner.next(HFile.java:1159)
        at org.apache.hadoop.hbase.regionserver.StoreFileScanner.next(StoreFileScanner.java:58)
        ... 7 more

I can't stop this unless I restart the regionserver. After the restart I
truncated the table, and when I list the tables in the shell again, the table
appears twice. Now I can't even disable the table and drop it.

I will restart the whole HBase cluster and report the progress.

Jimmy





--------------------------------------------------
From: "Ryan Rawson" <ryanobjc@gmail.com>
Sent: Wednesday, June 09, 2010 4:16 PM
To: <user@hbase.apache.org>
Subject: Re: ideas to improve throughput of the base writting

> Hey,
>
> Sounds like you are hitting limits of your hardware... I don't think
> you mentioned the hardware spec you are running in this thread...
>
> What you are seeing is essentially the limits of HDFS's ability to
> take writes.  The errors might be due to various HDFS setup problems
> (eg: xceiver count, file handle count, all outlined in various HBase
> "startup" docs)... But the overall performance might be limited by
> your hardware.
>
> For example, I use i7-based servers with 4 disks.  This gives a
> reasonable IO bandwidth, and can cope with high rates of inserts (up to
> 100-200k rows/sec, each row being ~100-300 bytes).  If you are running a
> 1 or 2 disk system it is possible you are hitting limits of what your
> hardware can do.
>
> Also note that the write-pipeline performance is ultimately defined in
> bytes/sec, not just 'rows/sec'... thus my rows were small, and if
> yours are big then you might be hitting a lower 'row/sec' limit even
> though the amount of bytes you are writing is higher than what I might
> have been doing.
>
>
>
> On Wed, Jun 9, 2010 at 3:59 PM, Jinsong Hu <jinsong_hu@hotmail.com> wrote:
>> I still get lots of repetition of
>>
>> 2010-06-09 22:54:38,428 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527 has too many store files, putting it back at the end of the flush queue.
>> 2010-06-09 22:54:38,428 DEBUG org.apache.hadoop.hbase.regionserver.CompactSplitThread: Compaction requested for region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527/1537478401 because: regionserver/10.110.8.88:60020.cacheFlusher
>>
>>
>> I also saw lots of
>>
>> 2010-06-09 22:50:12,527 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 1 on 60020' on region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527: memstore size 512.0m is >= than blocking 512.0m size
>> 2010-06-09 22:50:12,598 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 5 on 60020' on region Spam_MsgEventTable,2010-06-09 20:05:20\x0905860d4bf1cb268ef69391cf97de9f64,1276121160527: memstore size 512.0m is >= than blocking 512.0m size
>>
>> even with the changed config. The regionserver has 4 GB of RAM. What else
>> can be wrong?
>>
>> The insertion rate is still not good.
>>
>> Jimmy.
>>
>>
>> --------------------------------------------------
>> From: "Jinsong Hu" <jinsong_hu@hotmail.com>
>> Sent: Wednesday, June 09, 2010 1:59 PM
>> To: <user@hbase.apache.org>
>> Subject: Re: ideas to improve throughput of the base writting
>>
>>> Thanks. I will make this change:
>>>
>>> <property>
>>>  <name>hbase.hregion.memstore.block.multiplier</name>
>>>  <value>8</value>
>>> </property>
>>>
>>> <property>
>>>  <name>hbase.regionserver.msginterval</name>
>>>  <value>10000</value>
>>> </property>
>>>
>>>  <property>
>>>   <name>hbase.hstore.compactionThreshold</name>
>>>   <value>6</value>
>>> </property>
>>>
>>>
>>> <property>
>>>  <name>hbase.hstore.blockingStoreFiles</name>
>>>  <value>18</value>
>>> </property>
>>>
>>>
>>> and see how it goes.
>>>
>>>
>>> Jimmy.
>>>
>>> --------------------------------------------------
>>> From: "Ryan Rawson" <ryanobjc@gmail.com>
>>> Sent: Wednesday, June 09, 2010 1:49 PM
>>> To: <user@hbase.apache.org>
>>> Subject: Re: ideas to improve throughput of the base writting
>>>
>>>> More background here... you are running into a situation where the
>>>> regionserver cannot flush fast enough and the size of the region's
>>>> memstore has climbed too high and thus you get that error message.
>>>> HBase attempts to protect itself by holding up clients (thus causing
>>>> the low performance you see).  By expanding how big a memstore can get
>>>> during times of stress you can improve performance, at the cost of
>>>> memory usage. That is what that setting is about.
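>>>> (For concreteness: assuming the default hbase.hregion.memstore.flush.size
>>>> of 64 MB, the blocking point is roughly flush size times the multiplier.
>>>> The default multiplier of 2 blocks updates at about 128 MB, which matches
>>>> the "blocking 128.0m size" lines in your log, while a multiplier of 8
>>>> raises that to about 512 MB per region.)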
>>>>
>>>> As for the 1.5 minute setting, that is the maximal amount of time a
>>>> handler thread will block for.  You shouldn't need to tweak that
>>>> value, and reducing it could cause issues.
>>>>
>>>> Now, as for compacting, HBase will compact small files into larger
>>>> files, and on a massive upload you can expect to see this happen
>>>> constantly, thus tying up 1 cpu worth on your regionserver.  You could
>>>> potentially reduce that by increasing the value:
>>>>
>>>>  <property>
>>>>   <name>hbase.hstore.compactionThreshold</name>
>>>>   <value>3</value>
>>>>  </property>
>>>>
>>>> the value is interpreted as "if there are more than 3 files for a
>>>> region then run the compaction check".  By raising this limit you can
>>>> accumulate more files before compacting them, thus reducing the
>>>> frequency of compactions but also potentially reducing the
>>>> performance of reads (more files to read = more seeks = slower).  I'd
>>>> consider setting it to 5-7 or so in concert with setting
>>>> "hbase.hstore.blockingStoreFiles" to a value at least 2x that.
>>>>
>>>> All of these settings increase the amount of ram your regionserver may
>>>> need, so you will want to ensure you have at least 4000m of ram set in
>>>> hbase-env.sh.  This is why they are set so conservatively in the
>>>> default shipping config.
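>>>> (For example, something like "export HBASE_HEAPSIZE=4000" in hbase-env.sh;
>>>> the value is in megabytes.)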
>>>>
>>>> These are the 3 important settings that control how often compactions
>>>> occur and how RPC threads get blocked.  Try tweaking all of them and
>>>> let me know if you are doing better.
>>>>
>>>> -ryan
>>>>
>>>>
>>>> On Wed, Jun 9, 2010 at 1:37 PM, Ryan Rawson <ryanobjc@gmail.com> wrote:
>>>>>
>>>>> you also want this config:
>>>>>
>>>>> <property>
>>>>>  <name>hbase.hregion.memstore.block.multiplier</name>
>>>>>  <value>8</value>
>>>>> </property>
>>>>>
>>>>>
>>>>> that should hopefully clear things up.
>>>>>
>>>>> -ryan
>>>>>
>>>>> On Wed, Jun 9, 2010 at 1:34 PM, Jinsong Hu <jinsong_hu@hotmail.com>
>>>>> wrote:
>>>>>>
>>>>>> I checked the log, there are lots of
>>>>>>
>>>>>> e 128.1m is >= than blocking 128.0m size
>>>>>> 2010-06-09 17:26:36,736 INFO org.apache.hadoop.hbase.regionserver.HRegion: Blocking updates for 'IPC Server handler 8 on 60020' on region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262: memstore size 128.1m is >= than blocking 128.0m size
>>>>>>
>>>>>> then after that there are lots of
>>>>>>
>>>>>> 2010-06-09 17:26:36,800 DEBUG org.apache.hadoop.hbase.regionserver.Store: Added hdfs://namenodes1.cloud.ppops.net:8020/hbase/Spam_MsgEventTable/376337880/message_compound_terms/7606939244559826252, entries=30869, sequenceid=8350447892, memsize=7.2m, filesize=3.4m to Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6
>>>>>>
>>>>>>
>>>>>> then lots of
>>>>>>
>>>>>> 2010-06-09 17:26:39,005 INFO org.apache.hadoop.hbase.regionserver.HRegion: Unblocking updates for region Spam_MsgEventTable,2010-06-09 05:25:32\x09c873847edf6e5390477494956ec04729,1276104002262 'IPC Server handler 8 on 60020'
>>>>>>
>>>>>>
>>>>>>
>>>>>> This cycle happens again and again in the log. What can I do in this case
>>>>>> to speed up writing? Right now the writing speed is really slow, close to
>>>>>> 4 rows/second for a regionserver.
>>>>>>
>>>>>> I checked the code to try to find out why there are so many store files,
>>>>>> and I noticed that each second, when the regionserver reports to the
>>>>>> master, it calls the memstore flush and writes a store file.
>>>>>>
>>>>>> The parameter hbase.regionserver.msginterval has a default value of 1
>>>>>> second; I am thinking of changing it to 10 seconds. Can that help? I am
>>>>>> also thinking of changing hbase.hstore.blockingStoreFiles to 1000. I
>>>>>> noticed that there is a parameter hbase.hstore.blockingWaitTime with a
>>>>>> default value of 1.5 minutes; once the 1.5 minutes is reached, the
>>>>>> compaction is executed. I am fine with running a compaction every 1.5
>>>>>> minutes, but running a compaction every second and keeping the CPU
>>>>>> consistently above 100% is not wanted.
>>>>>>
>>>>>> Any suggestions on what parameters to change to improve my writing speed?
>>>>>>
>>>>>> Jimmy
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --------------------------------------------------
>>>>>> From: "Ryan Rawson" <ryanobjc@gmail.com>
>>>>>> Sent: Wednesday, June 09, 2010 1:01 PM
>>>>>> To: <user@hbase.apache.org>
>>>>>> Subject: Re: ideas to improve throughput of the base writting
>>>>>>
>>>>>>> The log will say something like "blocking updates to..." when you hit
>>>>>>> a limit.  That log you indicate is just the regionserver attempting to
>>>>>>> compact a region, but shouldn't prevent updates.
>>>>>>>
>>>>>>> what else does your logfile say?  Search for the string (case
>>>>>>> insensitive) "blocking updates"...
>>>>>>>
>>>>>>> -ryan
>>>>>>>
>>>>>>> On Wed, Jun 9, 2010 at 11:52 AM, Jinsong Hu <jinsong_hu@hotmail.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I made this change
>>>>>>>> <property>
>>>>>>>>  <name>hbase.hstore.blockingStoreFiles</name>
>>>>>>>>  <value>15</value>
>>>>>>>> </property>
>>>>>>>>
>>>>>>>> the system is still slow.
>>>>>>>>
>>>>>>>> Here is the most recent value for the region:
>>>>>>>> stores=21, storefiles=186, storefileSizeMB=9681, memstoreSizeMB=128, storefileIndexSizeMB=12
>>>>>>>>
>>>>>>>>
>>>>>>>> And the same log still happens:
>>>>>>>>
>>>>>>>> 2010-06-09 18:36:40,577 WARN org.apache.hadoop.hbase.regionserver.MemStoreFlusher: Region SOME_ABCEventTable,2010-06-09 09:56:56\x093dc01b4d2c4872963717d80d8b5c74b1,1276107447570 has too many store files, putting it back at the end of the flush queue.
>>>>>>>>
>>>>>>>> One idea that I have now is to further increase
>>>>>>>> hbase.hstore.blockingStoreFiles to a very high number, such as 1000.
>>>>>>>> What is the negative impact of this change?
>>>>>>>>
>>>>>>>>
>>>>>>>> Jimmy
>>>>>>>>
>>>>>>>>
>>>>>>>> --------------------------------------------------
>>>>>>>> From: "Ryan Rawson" <ryanobjc@gmail.com>
>>>>>>>> Sent: Monday, June 07, 2010 3:58 PM
>>>>>>>> To: <user@hbase.apache.org>
>>>>>>>> Subject: Re: ideas to improve throughput of the base writting
>>>>>>>>
>>>>>>>>> Try setting this config value:
>>>>>>>>>
>>>>>>>>> <property>
>>>>>>>>>  <name>hbase.hstore.blockingStoreFiles</name>
>>>>>>>>>  <value>15</value>
>>>>>>>>> </property>
>>>>>>>>>
>>>>>>>>> and see if that helps.
>>>>>>>>>
>>>>>>>>> The thing about the 1 compact thread is that the scarce resource being
>>>>>>>>> preserved in this case is cluster IO.  People have had issues with
>>>>>>>>> compaction IO being too heavy.
>>>>>>>>>
>>>>>>>>> In your case, this setting can let the regionserver build up more
>>>>>>>>> store files without pausing your import.
>>>>>>>>>
>>>>>>>>> -ryan
>>>>>>>>>
>>>>>>>>> On Mon, Jun 7, 2010 at 3:52 PM, Jinsong Hu 
>>>>>>>>> <jinsong_hu@hotmail.com>
>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi, there:
>>>>>>>>>>  While saving lots of data to HBase, I noticed that the regionserver CPU
>>>>>>>>>> went to more than 100%. Examination shows that the HBase CompactSplit
>>>>>>>>>> thread is spending all of its time compacting/splitting HBase store files.
>>>>>>>>>> The machine I have is an 8-core machine. Because there is only one
>>>>>>>>>> compact/split thread in HBase, only one core is fully used.
>>>>>>>>>>  I continually submit map/reduce jobs to insert records into HBase. Most
>>>>>>>>>> of the time the job runs very fast, around 1-5 minutes. But occasionally
>>>>>>>>>> it can take 2 hours. That is very bad for me. I highly suspect that the
>>>>>>>>>> occasional slow insertion is related to the insufficient speed of the
>>>>>>>>>> compactsplit thread.
>>>>>>>>>>  I am thinking that I should parallelize the compactsplit thread. The code
>>>>>>>>>> has this: the for loop "for (Store store: stores.values())" could be
>>>>>>>>>> parallelized via a Java 5 thread pool, so that multiple cores are used
>>>>>>>>>> instead of only one. I wonder if this would help increase the throughput
>>>>>>>>>> (see the sketch after the pasted code below).
>>>>>>>>>>
>>>>>>>>>>  Somebody mentioned that I can increase the region size so that I don't do
>>>>>>>>>> so many compactions under a heavy writing situation. Does anybody have
>>>>>>>>>> experience showing that it helps?
>>>>>>>>>>
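>>>>>>>>>>  For reference, my understanding is that the region split size is
>>>>>>>>>> controlled by hbase.hregion.max.filesize (default 256 MB in this version);
>>>>>>>>>> the following is only a hypothetical example, with the 1 GB value purely
>>>>>>>>>> illustrative:
>>>>>>>>>>
>>>>>>>>>> <property>
>>>>>>>>>>   <name>hbase.hregion.max.filesize</name>
>>>>>>>>>>   <value>1073741824</value>
>>>>>>>>>> </property>
>>>>>>>>>>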
>>>>>>>>>> Jimmy.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>  byte [] compactStores(final boolean majorCompaction)
>>>>>>>>>>  throws IOException {
>>>>>>>>>>    if (this.closing.get() || this.closed.get()) {
>>>>>>>>>>      LOG.debug("Skipping compaction on " + this + " because closing/closed");
>>>>>>>>>>      return null;
>>>>>>>>>>    }
>>>>>>>>>>    splitsAndClosesLock.readLock().lock();
>>>>>>>>>>    try {
>>>>>>>>>>      byte [] splitRow = null;
>>>>>>>>>>      if (this.closed.get()) {
>>>>>>>>>>        return splitRow;
>>>>>>>>>>      }
>>>>>>>>>>      try {
>>>>>>>>>>        synchronized (writestate) {
>>>>>>>>>>          if (!writestate.compacting && writestate.writesEnabled) {
>>>>>>>>>>            writestate.compacting = true;
>>>>>>>>>>          } else {
>>>>>>>>>>            LOG.info("NOT compacting region " + this +
>>>>>>>>>>                ": compacting=" + writestate.compacting + ", writesEnabled=" +
>>>>>>>>>>                writestate.writesEnabled);
>>>>>>>>>>            return splitRow;
>>>>>>>>>>          }
>>>>>>>>>>        }
>>>>>>>>>>        LOG.info("Starting" + (majorCompaction? " major " : " ") +
>>>>>>>>>>            "compaction on region " + this);
>>>>>>>>>>        long startTime = System.currentTimeMillis();
>>>>>>>>>>        doRegionCompactionPrep();
>>>>>>>>>>        long maxSize = -1;
>>>>>>>>>>        for (Store store: stores.values()) {
>>>>>>>>>>          final Store.StoreSize ss = store.compact(majorCompaction);
>>>>>>>>>>          if (ss != null && ss.getSize() > maxSize) {
>>>>>>>>>>            maxSize = ss.getSize();
>>>>>>>>>>            splitRow = ss.getSplitRow();
>>>>>>>>>>          }
>>>>>>>>>>        }
>>>>>>>>>>        doRegionCompactionCleanup();
>>>>>>>>>>        String timeTaken = StringUtils.formatTimeDiff(System.currentTimeMillis(),
>>>>>>>>>>            startTime);
>>>>>>>>>>        LOG.info("compaction completed on region " + this + " in " + timeTaken);
>>>>>>>>>>      } finally {
>>>>>>>>>>        synchronized (writestate) {
>>>>>>>>>>          writestate.compacting = false;
>>>>>>>>>>          writestate.notifyAll();
>>>>>>>>>>        }
>>>>>>>>>>      }
>>>>>>>>>>      return splitRow;
>>>>>>>>>>    } finally {
>>>>>>>>>>      splitsAndClosesLock.readLock().unlock();
>>>>>>>>>>    }
>>>>>>>>>>  }
>>>>>>>>>>
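>>>>>>>>>> A rough sketch of the parallelization idea mentioned above (purely
>>>>>>>>>> illustrative, not a patch; it assumes java.util.concurrent and reuses
>>>>>>>>>> the names from the pasted loop, and the pool size of 4 is just a guess):
>>>>>>>>>>
>>>>>>>>>>    // Requires java.util.List, java.util.ArrayList, and from
>>>>>>>>>>    // java.util.concurrent: ExecutorService, Executors, Future,
>>>>>>>>>>    // Callable, ExecutionException.
>>>>>>>>>>    ExecutorService pool = Executors.newFixedThreadPool(4);
>>>>>>>>>>    List<Future<Store.StoreSize>> results = new ArrayList<Future<Store.StoreSize>>();
>>>>>>>>>>    for (final Store store : stores.values()) {
>>>>>>>>>>      // Each store compacts on its own pool thread instead of serially.
>>>>>>>>>>      results.add(pool.submit(new Callable<Store.StoreSize>() {
>>>>>>>>>>        public Store.StoreSize call() throws IOException {
>>>>>>>>>>          return store.compact(majorCompaction);
>>>>>>>>>>        }
>>>>>>>>>>      }));
>>>>>>>>>>    }
>>>>>>>>>>    try {
>>>>>>>>>>      for (Future<Store.StoreSize> f : results) {
>>>>>>>>>>        Store.StoreSize ss = f.get();   // wait for each store's compaction
>>>>>>>>>>        if (ss != null && ss.getSize() > maxSize) {
>>>>>>>>>>          maxSize = ss.getSize();
>>>>>>>>>>          splitRow = ss.getSplitRow();
>>>>>>>>>>        }
>>>>>>>>>>      }
>>>>>>>>>>    } catch (InterruptedException e) {
>>>>>>>>>>      throw new IOException(e.toString());
>>>>>>>>>>    } catch (ExecutionException e) {
>>>>>>>>>>      throw new IOException(e.toString());
>>>>>>>>>>    } finally {
>>>>>>>>>>      pool.shutdown();
>>>>>>>>>>    }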
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
> 
