accumulo-user mailing list archives

From Ariel Valentin <ar...@arielvalentin.com>
Subject Re: Synchronized Access to ZooCache Causing Threads to Block
Date Wed, 12 Feb 2014 20:10:50 GMT
Josh,

The symptom is that we hit a point where a single server seems
"unresponsive" but we do not see anything unusual going on in that machine
and it seems idle. No heavy CPU, no I/O wait, low load average; however,
when we add additional instances of the JVM our capacity seems to increase
linearly.

Based on thread dumps and profiler stats it appears that under "heavy" load
most of our threads are blocked trying to access ZooCache.
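
For anyone wanting to check for the same symptom from inside the JVM, here
is a minimal sketch using only the standard java.lang.management API. The
class name BlockedThreadReport is hypothetical and not part of Accumulo; it
simply counts BLOCKED threads and groups them by the lock they are waiting
on, which is the same information a thread dump gives.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Accumulo.
public class BlockedThreadReport {

    /** Counts BLOCKED threads in this JVM, grouped by the lock they wait on. */
    public static Map<String, Integer> blockedByLock() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Map<String, Integer> counts = new TreeMap<>();
        // false, false: skip the (more expensive) owned-monitor details.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                String lock = info.getLockName();
                counts.merge(lock == null ? "unknown" : lock, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        blockedByLock().forEach((lock, n) ->
            System.out.println(n + " thread(s) blocked on " + lock));
    }
}
```

If most entries point at a single lock while CPU and I/O stay low, that
would be consistent with the contention described above.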


Ariel Valentin
e-mail: ariel@arielvalentin.com
website: http://blog.arielvalentin.com
skype: ariel.s.valentin
twitter: arielvalentin
linkedin: http://www.linkedin.com/profile/view?id=8996534
---------------------------------------
*simplicity *communication
*feedback *courage *respect


On Wed, Feb 12, 2014 at 1:41 PM, Josh Elser <josh.elser@gmail.com> wrote:

> Didn't mean to ask about the subject matter, but how you were using the
> API. Are you actually seeing contention on ZooCache?
>
>
> On 2/12/14, 1:19 PM, Ariel Valentin wrote:
>
>> Sorry but I am not at liberty to be specific about our business problem.
>>
>> Typical usage is multiple clients writing data to tables, which scan to
>> avoid duplicate entries.
>>
>> Ariel Valentin
>> e-mail: ariel@arielvalentin.com
>> website: http://blog.arielvalentin.com
>> skype: ariel.s.valentin
>> twitter: arielvalentin
>> linkedin: http://www.linkedin.com/profile/view?id=8996534
>> ---------------------------------------
>> *simplicity *communication
>> *feedback *courage *respect
>>
>>
>> On Wed, Feb 12, 2014 at 10:59 AM, Josh Elser <josh.elser@gmail.com> wrote:
>>
>>     Also, I forgot this part before:
>>
>>     The ZooCache instance that's used *typically* comes from the
>>     Instance object that your Connector was created from. In other
>>     words, if you create multiple Instances (ZooKeeperInstance,
>>     usually), you can get multiple ZooCaches which means that concurrent
>>     calls to methods off of those objects should not block one another
>>     (createScanner off of connector1 from instance1 should not block
>>     createScanner off of connector2 from instance2).
>>
>>     That should be something quick you can play with if you so desire.
>>
>>
>>     On 2/12/14, 9:57 AM, Josh Elser wrote:
>>
>>         Yep, you'll likely also block on BatchScanner, anything in
>>         TableOperations, and a host of other things.
>>
>>         For scanners, there's likely a standing recommendation to amortize
>>         the use of those objects (if you want to look up 5 ranges, don't
>>         make 5 scanners).
>>
>>         Creating a cache per member in the work would likely require some
>>         kind of Paxos implementation to provide consistency, which is
>>         highly undesirable.
>>
>>         One thing I'm curious about is the impact of removing ZooCache
>>         altogether from things like the client API and seeing what happens.
>>         I don't have a good way to measure that impact off the top of my
>>         head, though.
>>
>>         Anyways, is this causing you problems in your usage of the API?
>>         Could you elaborate a bit more on the specifics?
>>
>>         On Feb 12, 2014 4:48 AM, "Ariel Valentin" <ariel@arielvalentin.com> wrote:
>>
>>              I have run into a problem related to ACCUMULO-1833, which
>>              appears to have addressed the issue for MultiTableBatchWriter;
>>              however, I am seeing this issue on the scanner side also:
>>
>>              "http-/192.168.220.196:8080-35" daemon prio=10 tid=0x00007f3108038000 nid=0x538a waiting for monitor entry [0x00007f31287d1000]
>>                 java.lang.Thread.State: BLOCKED (on object monitor)
>>                   at org.apache.accumulo.fate.zookeeper.ZooCache.getInstance(ZooCache.java:301)
>>                   - waiting to lock <0x00000000fa64f5b8> (a java.lang.Class for org.apache.accumulo.fate.zookeeper.ZooCache)
>>                   at org.apache.accumulo.core.client.impl.Tables.getZooCache(Tables.java:40)
>>                   at org.apache.accumulo.core.client.impl.Tables.getMap(Tables.java:44)
>>                   at org.apache.accumulo.core.client.impl.Tables.getNameToIdMap(Tables.java:78)
>>                   at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:64)
>>                   at org.apache.accumulo.core.client.impl.ConnectorImpl.getTableId(ConnectorImpl.java:75)
>>                   at org.apache.accumulo.core.client.impl.ConnectorImpl.createScanner(ConnectorImpl.java:137)
>>
>>
>>
>>              I have not spent enough time reasoning about the code to
>>              understand all of the nuances, but I am interested in knowing
>>              if there are any mitigating strategies for dealing with this
>>              thread contention. For example, would creating a cache entry
>>              for each member of the ZooKeeper ensemble help relieve the
>>              strain? Would using multiple classloaders help? Or is my only
>>              option to spawn multiple JVMs?
>>
>>              Thanks,
>>
>>              Ariel Valentin
>>              e-mail: ariel@arielvalentin.com
>>              website: http://blog.arielvalentin.com
>>              skype: ariel.s.valentin
>>              twitter: arielvalentin
>>              linkedin: http://www.linkedin.com/profile/view?id=8996534
>>              ---------------------------------------
>>              *simplicity *communication
>>              *feedback *courage *respect
>>
>>
>>
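
The thread dump in the original message shows threads waiting on "a
java.lang.Class" monitor, which is what a static synchronized method locks
on. The sketch below illustrates that general pattern and one common
alternative; it is not Accumulo's code and not the fix applied in
ACCUMULO-1833. CacheRegistry, Cache, getLocked, and getConcurrent are all
hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration of class-monitor contention; not Accumulo code.
public class CacheRegistry {

    /** Stand-in for a cache object that is shared per key. */
    public static final class Cache {
        final String key;
        Cache(String key) { this.key = key; }
    }

    private static final Map<String, Cache> LOCKED = new HashMap<>();
    private static final Map<String, Cache> CONCURRENT = new ConcurrentHashMap<>();

    // Every caller, for every key, queues on the CacheRegistry.class monitor,
    // even when the entry already exists.
    public static synchronized Cache getLocked(String key) {
        Cache c = LOCKED.get(key);
        if (c == null) {
            c = new Cache(key);
            LOCKED.put(key, c);
        }
        return c;
    }

    // get() on a ConcurrentHashMap does not take a lock, so the common
    // "already cached" path never blocks; creation is atomic per key.
    public static Cache getConcurrent(String key) {
        Cache c = CONCURRENT.get(key);
        return (c != null) ? c : CONCURRENT.computeIfAbsent(key, Cache::new);
    }
}
```

Both methods return the same object for repeated lookups of one key; the
difference is only in whether readers have to wait for one another.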
