hbase-user mailing list archives

From ramkrishna vasudevan <ramkrishna.s.vasude...@gmail.com>
Subject Re: .META. region server DDOSed by too many clients
Date Thu, 06 Dec 2012 10:59:15 GMT
Actually, in our case we observed this when our block cache was OFF. If
possible, try applying the patch (from HBASE-5898) and see what happens.
If you have more memory, try increasing the ratio allocated to the block
cache.
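
For the ratio, something like this in hbase-site.xml on the region servers
(just a rough sketch assuming the 0.94 defaults and the LruBlockCache; pick a
value that fits your heap, and restart the region servers after changing it):

  <property>
    <name>hfile.block.cache.size</name>
    <!-- fraction of the region server heap given to the block cache;
         I think the 0.94 default is 0.25 -->
    <value>0.4</value>
  </property>

If I remember right, this fraction plus
hbase.regionserver.global.memstore.upperLimit needs to stay well below 0.8 of
the heap, so check that as well.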

Regards
Ram

On Thu, Dec 6, 2012 at 4:02 PM, Varun Sharma <varun@pinterest.com> wrote:

> Hi Ram,
>
> Yes, BlockCache is on, but there is another in-memory column family which
> might be evicting other data from the block cache. So we might be hitting
> more disk seeks. I see that you have seen this trace before on HBASE-5898 -
> did that issue resolve things for you?
>
> Thanks
> Varun
>
> On Wed, Dec 5, 2012 at 10:04 PM, ramkrishna vasudevan <ramkrishna.s.vasudevan@gmail.com> wrote:
>
> > Is the block cache ON?  Check out HBASE-5898.
> >
> > Regards
> > Ram
> >
> > On Thu, Dec 6, 2012 at 9:55 AM, Anoop Sam John <anoopsj@huawei.com> wrote:
> >
> > >
> > > >is the META table cached just like other tables
> > > Yes Varun I think so.
> > >
> > > -Anoop-
> > > ________________________________________
> > > From: Varun Sharma [varun@pinterest.com]
> > > Sent: Thursday, December 06, 2012 6:10 AM
> > > To: user@hbase.apache.org; lars hofhansl
> > > Subject: Re: .META. region server DDOSed by too many clients
> > >
> > > We only see this on the .META. region not otherwise...
> > >
> > > On Wed, Dec 5, 2012 at 4:37 PM, Varun Sharma <varun@pinterest.com> wrote:
> > >
> > > > I see, but is this pointing to the fact that we are heading to disk for
> > > > scanning META? If yes, that would be pretty bad, no? Currently I am
> > > > trying to see if the freeze coincides with the block cache being full (we
> > > > have an in-memory column family) - is the META table cached just like
> > > > other tables?
> > > >
> > > > Varun
> > > >
> > > >
> > > > On Wed, Dec 5, 2012 at 4:20 PM, lars hofhansl <lhofhansl@yahoo.com> wrote:
> > > >
> > > >> Looks like you're running into HBASE-5898.
> > > >>
> > > >>
> > > >>
> > > >> ----- Original Message -----
> > > >> From: Varun Sharma <varun@pinterest.com>
> > > >> To: user@hbase.apache.org
> > > >> Cc:
> > > >> Sent: Wednesday, December 5, 2012 3:51 PM
> > > >> Subject: .META. region server DDOSed by too many clients
> > > >>
> > > >> Hi,
> > > >>
> > > >> I am running HBase 0.94.0 and I have a significant write load being put on
> > > >> a table with 98 regions on a 15-node cluster. This write load also comes
> > > >> from a very large number of clients (~1000). I am running with 10 priority
> > > >> IPC handlers and 200 IPC handlers. It seems the region server holding
> > > >> .META. is DDOSed. All 200 handlers are busy serving the .META. region and
> > > >> they are all locked on one object. The jstack for that region server is here:
> > > >>
> > > >> "IPC Server handler 182 on 60020" daemon prio=10 tid=0x00007f329872c800 nid=0x4401 waiting on condition [0x00007f328807f000]
> > > >>    java.lang.Thread.State: WAITING (parking)
> > > >>         at sun.misc.Unsafe.park(Native Method)
> > > >>         - parking to wait for  <0x0000000542d72e30> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
> > > >>         at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> > > >>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
> > > >>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:871)
> > > >>         at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1201)
> > > >>         at java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> > > >>         at java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> > > >>         at java.util.concurrent.ConcurrentHashMap$Segment.put(ConcurrentHashMap.java:445)
> > > >>         at java.util.concurrent.ConcurrentHashMap.putIfAbsent(ConcurrentHashMap.java:925)
> > > >>         at org.apache.hadoop.hbase.util.IdLock.getLockEntry(IdLock.java:71)
> > > >>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:290)
> > > >>         at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(HFileBlockIndex.java:213)
> > > >>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:455)
> > > >>         at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
> > > >>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
> > > >>         at org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
> > > >>         at org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
> > > >>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.generalizedSeek(KeyValueHeap.java:299)
> > > >>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.reseek(KeyValueHeap.java:244)
> > > >>         at org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
> > > >>         - locked <0x000000063b4965d0> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> > > >>         at org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
> > > >>         - locked <0x000000063b4965d0> (a org.apache.hadoop.hbase.regionserver.StoreScanner)
> > > >>         at org.apache.hadoop.hbase.regionserver.KeyValueHeap.next(KeyValueHeap.java:127)
> > > >>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextInternal(HRegion.java:3354)
> > > >>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3310)
> > > >>         - locked <0x0000000523c211e0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
> > > >>         at org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.next(HRegion.java:3327)
> > > >>         - locked <0x0000000523c211e0> (a org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl)
> > > >>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4066)
> > > >>         at org.apache.hadoop.hbase.regionserver.HRegion.get(HRegion.java:4039)
> > > >>         at org.apache.hadoop.hbase.regionserver.HRegionServer.get(HRegionServer.java:1941)
> > > >>
> > > >> The client-side trace shows that we are looking up the .META. region:
> > > >>
> > > >> "thrift-worker-3499" daemon prio=10 tid=0x00007f789dd98800 nid=0xb52 waiting for monitor entry [0x00007f778672d000]
> > > >>    java.lang.Thread.State: BLOCKED (on object monitor)
> > > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:943)
> > > >>         - waiting to lock <0x0000000707978298> (a java.lang.Object)
> > > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:836)
> > > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1482)
> > > >>         at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1367)
> > > >>         at org.apache.hadoop.hbase.client.HTable.batch(HTable.java:729)
> > > >>         - locked <0x000000070821d5a0> (a org.apache.hadoop.hbase.client.HTable)
> > > >>         at org.apache.hadoop.hbase.client.HTable.get(HTable.java:698)
> > > >>         at org.apache.hadoop.hbase.client.HTablePool$PooledHTable.get(HTablePool.java:371)
> > > >>
> > > >> On the RS page, I see 68 million read requests for the META region, while
> > > >> for the other 98 regions we have done about 20 million write requests in
> > > >> total. Regions have not moved around at all and no crashes have happened.
> > > >> Why do we have such an incredible number of scans over META, and is there
> > > >> something I can do about this issue?
> > > >>
> > > >> Varun
> > > >>
> > > >>
> > > >
> > >
> >
>
