accumulo-user mailing list archives

From Josh Elser <josh.el...@gmail.com>
Subject Re: Synchronized Access to ZooCache Causing Threads to Block
Date Wed, 12 Feb 2014 20:13:57 GMT
Great, that helps. Thanks for the info, Ariel!

I think this might be an area we want to revisit in later versions of 
Accumulo to make the client API implementations a little more robust and 
supportive of concurrent usage.

On 2/12/14, 3:10 PM, Ariel Valentin wrote:
> Josh,
>
> The symptom is that we hit a point where a single server seems
> "unresponsive", but we do not see anything unusual going on on that
> machine and it seems idle. No heavy CPU, no I/O wait, low load average;
> however, when we add additional instances of the JVM our capacity seems
> to increase linearly.
>
> Based on thread dumps and profiler stats it appears that under "heavy"
> load most of our threads are blocked trying to access ZooCache.
>
>
> Ariel Valentin
> e-mail: ariel@arielvalentin.com
> website: http://blog.arielvalentin.com
> skype: ariel.s.valentin
> twitter: arielvalentin
> linkedin: http://www.linkedin.com/profile/view?id=8996534
> ---------------------------------------
> *simplicity *communication
> *feedback *courage *respect
>
>
> On Wed, Feb 12, 2014 at 1:41 PM, Josh Elser <josh.elser@gmail.com> wrote:
>
>     Didn't mean to ask about the subject matter, but how you were using
>     the API. Are you actually seeing contention on ZooCache?
>
>
>     On 2/12/14, 1:19 PM, Ariel Valentin wrote:
>
>         Sorry but I am not at liberty to be specific about our business
>         problem.
>
>         Typical usage is multiple clients writing data to tables, which
>         scan to avoid duplicate entries.
>
>         Ariel Valentin
>         e-mail: ariel@arielvalentin.com
>         website: http://blog.arielvalentin.com
>         skype: ariel.s.valentin
>         twitter: arielvalentin
>         linkedin: http://www.linkedin.com/profile/view?id=8996534
>         ---------------------------------------
>         *simplicity *communication
>         *feedback *courage *respect
>
>
>         On Wed, Feb 12, 2014 at 10:59 AM, Josh Elser <josh.elser@gmail.com> wrote:
>
>              Also, I forgot this part before:
>
>              The ZooCache instance that's used *typically* comes from the
>              Instance object that your Connector was created from. In other
>              words, if you create multiple Instances (ZooKeeperInstance,
>              usually), you can get multiple ZooCaches, which means that
>              concurrent calls to methods off of those objects should not
>              block one another (createScanner off of connector1 from
>              instance1 should not block createScanner off of connector2
>              from instance2).
>
>              That should be something quick you can play with if you so
>              desire.
>
>
>              On 2/12/14, 9:57 AM, Josh Elser wrote:
>
>                  Yep, you'll likely also block on BatchScanner, anything in
>                  TableOperations, and a host of other things.
>
>                  For scanners, there's likely a standing recommendation to
>                  amortize the use of those objects (if you want to look up
>                  5 ranges, don't make 5 scanners).
>
>                  Creating a cache per member in the work would likely
>                  require some kind of Paxos implementation to provide
>                  consistency, which is highly undesirable.
>
>                  One thing I'm curious about is the impact of removing
>                  ZooCache altogether from things like the client API to
>                  see what happens. I don't have a good way to measure that
>                  impact off the top of my head though.
>
>                  Anyways, is this causing you problems in your usage of
>                  the API? Could you elaborate a bit more on the specifics?
>
>                  On Feb 12, 2014 4:48 AM, "Ariel Valentin"
>                  <ariel@arielvalentin.com> wrote:
>
>                       I have run into a problem related to ACCUMULO-1833,
>                       which appears to have addressed the issue for
>                       MultiTableBatchWriter; however, I am seeing this
>                       issue on the scanner side also:
>
>                       "http-/192.168.220.196:8080-35" daemon prio=10 tid=0x00007f3108038000 nid=0x538a waiting for monitor entry [0x00007f31287d1000]
>                          java.lang.Thread.State: BLOCKED (on object monitor)
>                           at org.apache.accumulo.fate.zookeeper.ZooCache.getInstance(ZooCache.java:301)
>                           - waiting to lock <0x00000000fa64f5b8> (a java.lang.Class for org.apache.accumulo.fate.zookeeper.ZooCache)
>                           at org.apache.accumulo.core.client.impl.Tables.getZooCache(Tables.java:40)
>                           at org.apache.accumulo.core.client.impl.Tables.getMap(Tables.java:44)
>                           at org.apache.accumulo.core.client.impl.Tables.getNameToIdMap(Tables.java:78)
>                           at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:64)
>                           at org.apache.accumulo.core.client.impl.ConnectorImpl.getTableId(ConnectorImpl.java:75)
>                           at org.apache.accumulo.core.client.impl.ConnectorImpl.createScanner(ConnectorImpl.java:137)
>
>
>                       I have not spent enough time reasoning about the
>                       code to understand all of the nuances, but I am
>                       interested in knowing if there are any mitigating
>                       strategies for dealing with this thread contention,
>                       e.g. would creating a cache entry for each member of
>                       the ZooKeeper ensemble help relieve the strain? Use
>                       multiple classloaders? Or is my only option to spawn
>                       multiple JVMs?
>
>                       Thanks,
>
>                       Ariel Valentin
>                       e-mail: ariel@arielvalentin.com
>                       website: http://blog.arielvalentin.com
>                       skype: ariel.s.valentin
>                       twitter: arielvalentin
>                       linkedin: http://www.linkedin.com/profile/view?id=8996534
>                       ---------------------------------------
>                       *simplicity *communication
>                       *feedback *courage *respect
>
>
>
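
For readers following the thread: the "waiting to lock ... (a java.lang.Class
for ...ZooCache)" line in the dump above is what a thread reports when it is
blocked on a class-level monitor, which is consistent with a static
synchronized method or a synchronized (ZooCache.class) block. The sketch
below is only an illustration of that pattern and of one possible way to
avoid funnelling every lookup through a single monitor. It uses nothing but
the JDK; DemoCache, ClassLockedRegistry, ConcurrentRegistry and
distinctCaches are hypothetical names made up for this example and are not
Accumulo classes or Accumulo's actual implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class RegistryDemo {
    /** Hypothetical stand-in for a per-ZooKeeper cache; NOT Accumulo's ZooCache. */
    static final class DemoCache {
        final String zooKeepers;
        DemoCache(String zooKeepers) { this.zooKeepers = zooKeepers; }
    }

    /** Every call, including lookups of existing entries, takes the class monitor. */
    static final class ClassLockedRegistry {
        private static final Map<String, DemoCache> CACHES = new HashMap<>();

        static synchronized DemoCache getInstance(String zooKeepers) {
            DemoCache cache = CACHES.get(zooKeepers);
            if (cache == null) {
                cache = new DemoCache(zooKeepers);
                CACHES.put(zooKeepers, cache);
            }
            return cache;
        }
    }

    /** Lookups of existing entries use ConcurrentHashMap.get, which does not block. */
    static final class ConcurrentRegistry {
        private static final ConcurrentHashMap<String, DemoCache> CACHES =
                new ConcurrentHashMap<>();

        static DemoCache getInstance(String zooKeepers) {
            DemoCache cache = CACHES.get(zooKeepers);
            // computeIfAbsent creates the entry atomically, at most once per key.
            return cache != null ? cache : CACHES.computeIfAbsent(zooKeepers, DemoCache::new);
        }
    }

    /** Starts the given number of threads on one key and counts distinct caches seen. */
    static int distinctCaches(int threads, String key) {
        Set<DemoCache> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> seen.add(ConcurrentRegistry.getInstance(key)));
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct caches seen: " + distinctCaches(8, "zk1:2181"));
    }
}
```

Both registries hand back one shared cache per key; the difference is only
in how callers are serialized. Whether a change along these lines would be
appropriate for the client API is exactly the open question raised in the
reply at the top of this message.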
