accumulo-user mailing list archives

From Ariel Valentin <ar...@arielvalentin.com>
Subject Re: Synchronized Access to ZooCache Causing Threads to Block
Date Fri, 14 Feb 2014 02:53:01 GMT
It may be an issue with our table design. We have two tables; one of the
tables contains related entities that need to be purged before updating the
parent entity.

Ariel Valentin
e-mail: ariel@arielvalentin.com
website: http://blog.arielvalentin.com
skype: ariel.s.valentin
twitter: arielvalentin
linkedin: http://www.linkedin.com/profile/view?id=8996534
---------------------------------------
*simplicity *communication
*feedback *courage *respect


On Wed, Feb 12, 2014 at 1:42 PM, William Slacum <
wilhelm.von.cloud@accumulo.net> wrote:

> FWIW you can probably avoid the scan by making your insert idempotent
> aside from the timestamp and let versioning handle deduplication.
>
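To make the versioning suggestion above concrete, here is a minimal sketch of the idea. It is a toy in-memory model, not the Accumulo client API: the class `VersionedTableSketch`, its methods, and the sample row/column names are all hypothetical. It assumes the table keeps only the newest version of each cell, which is how Accumulo's versioning iterator behaves when limited to one version. If every writer produces the same row, column, and value for the same logical entity, a second write simply replaces the first, so no scan is needed beforehand.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of a versioned table; it is NOT the Accumulo client API.
final class VersionedTableSketch {
    // Logical cell (row, family, qualifier) -> versions ordered newest first.
    private final Map<String, TreeMap<Long, String>> cells = new TreeMap<>();
    private final int maxVersions;

    VersionedTableSketch(int maxVersions) {
        if (maxVersions < 1) {
            throw new IllegalArgumentException("maxVersions must be >= 1");
        }
        this.maxVersions = maxVersions;
    }

    private static String cellKey(String row, String family, String qualifier) {
        return row + '\u0000' + family + '\u0000' + qualifier;
    }

    // Writes without reading first; older versions beyond maxVersions are dropped.
    void put(String row, String family, String qualifier, long timestamp, String value) {
        TreeMap<Long, String> versions = cells.computeIfAbsent(
                cellKey(row, family, qualifier),
                k -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()));
        versions.put(timestamp, value);
        while (versions.size() > maxVersions) {
            versions.pollLastEntry(); // last entry is the oldest timestamp
        }
    }

    // Returns the newest value for the cell, or null if the cell was never written.
    String latest(String row, String family, String qualifier) {
        TreeMap<Long, String> versions = cells.get(cellKey(row, family, qualifier));
        return versions == null ? null : versions.firstEntry().getValue();
    }

    // Total number of stored versions across all cells.
    int entryCount() {
        int count = 0;
        for (TreeMap<Long, String> versions : cells.values()) {
            count += versions.size();
        }
        return count;
    }
}
```

In this model, two clients writing the same cell at different timestamps leave exactly one entry behind, which is the deduplication the scan was meant to provide.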
>
> On Wed, Feb 12, 2014 at 1:19 PM, Ariel Valentin <ariel@arielvalentin.com> wrote:
>
>> Sorry but I am not at liberty to be specific about our business problem.
>>
>> Typical usage is multiple clients writing data to tables; each client
>> scans first to avoid duplicate entries.
>>
>> Ariel Valentin
>>
>>
>> On Wed, Feb 12, 2014 at 10:59 AM, Josh Elser <josh.elser@gmail.com> wrote:
>>
>>> Also, I forgot this part before:
>>>
>>> The ZooCache instance that's used *typically* comes from the Instance
>>> object that your Connector was created from. In other words, if you create
>>> multiple Instances (ZooKeeperInstance, usually), you can get multiple
>>> ZooCaches which means that concurrent calls to methods off of those objects
>>> should not block one another (createScanner off of connector1 from
>>> instance1 should not block createScanner off of connector2 from instance2).
>>>
>>> That should be something quick you can play with if you so desire.
>>>
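The claim above, that separate cache objects should not block one another, can be checked with plain Java monitors. This is a sketch only: `MonitorCacheSketch` is a hypothetical stand-in guarded by its own object monitor, not ZooCache, and the path string is made up. Note that it models per-object locking only. A lock on the Class object, as in the quoted thread dump, is shared by every caller in the same classloader no matter how many objects exist.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in for a cache guarded by its own monitor; NOT ZooCache itself.
final class MonitorCacheSketch {
    private final Map<String, String> entries = new HashMap<>();

    synchronized void put(String path, String value) {
        entries.put(path, value);
    }

    synchronized String get(String path) {
        return entries.get(path);
    }

    // Holds the monitor of 'held' and reports whether another thread can
    // finish a get() on 'target' within waitMillis.
    static boolean completesWhileHolding(MonitorCacheSketch held,
                                         MonitorCacheSketch target,
                                         long waitMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            synchronized (held) {
                Callable<String> lookup = () -> target.get("/tables");
                Future<String> result = pool.submit(lookup);
                try {
                    result.get(waitMillis, TimeUnit.MILLISECONDS);
                    return true;
                } catch (TimeoutException blocked) {
                    return false; // the worker is still waiting for the monitor
                } catch (InterruptedException | ExecutionException e) {
                    throw new IllegalStateException(e);
                }
            }
        } finally {
            pool.shutdown(); // the worker finishes once the monitor is released
        }
    }
}
```

With two distinct objects the lookup completes while the first object's monitor is held; with the same object passed twice, the lookup stays blocked until the monitor is released.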
>>>
>>> On 2/12/14, 9:57 AM, Josh Elser wrote:
>>>
>>>> Yep, you'll likely also block on BatchScanner, anything in
>>>> TableOperations, and a host of other things.
>>>>
>>>> For scanners, there's likely a standing recommendation to amortize the
>>>> use of those objects (if you want to look up 5 ranges, don't make 5
>>>> scanners).
>>>>
>>>> Creating a cache per member in the work would likely require some kind
>>>> of paxos implementation to provide consistency which is highly
>>>> undesirable.
>>>>
>>>> One thing I'm curious about is the impact of removing ZooCache
>>>> altogether from things like the client api and see what happens. I don't
>>>> have a good way to measure that impact off the top of my head though.
>>>>
>>>> Anyways, is this causing you problems in your usage of the api? Could
>>>> you elaborate a bit more on the specifics?
>>>>
>>>> On Feb 12, 2014 4:48 AM, "Ariel Valentin" <ariel@arielvalentin.com> wrote:
>>>>
>>>>     I have run into a problem related to ACCUMULO-1833, which appears to
>>>>     have addressed the issue for MultiTableBatchWriter; however, I am
>>>>     seeing this issue on the scanner side as well:
>>>>
>>>>     "http-/192.168.220.196:8080-35" daemon prio=10 tid=0x00007f3108038000 nid=0x538a waiting for monitor entry [0x00007f31287d1000]
>>>>        java.lang.Thread.State: BLOCKED (on object monitor)
>>>>         at org.apache.accumulo.fate.zookeeper.ZooCache.getInstance(ZooCache.java:301)
>>>>         - waiting to lock <0x00000000fa64f5b8> (a java.lang.Class for org.apache.accumulo.fate.zookeeper.ZooCache)
>>>>         at org.apache.accumulo.core.client.impl.Tables.getZooCache(Tables.java:40)
>>>>         at org.apache.accumulo.core.client.impl.Tables.getMap(Tables.java:44)
>>>>         at org.apache.accumulo.core.client.impl.Tables.getNameToIdMap(Tables.java:78)
>>>>         at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:64)
>>>>         at org.apache.accumulo.core.client.impl.ConnectorImpl.getTableId(ConnectorImpl.java:75)
>>>>         at org.apache.accumulo.core.client.impl.ConnectorImpl.createScanner(ConnectorImpl.java:137)
>>>>
>>>>     I have not spent enough time reasoning about the code to understand
>>>>     all of the nuances, but I am interested in knowing whether there are
>>>>     any mitigating strategies for dealing with this thread contention.
>>>>     For example, would creating a cache entry for each member of the
>>>>     ZooKeeper ensemble help relieve the strain? Would using multiple
>>>>     classloaders help? Or is my only option to spawn multiple JVMs?
>>>>
>>>>     Thanks,
>>>>
>>>>     Ariel Valentin
>>>>
>>>>
>>
>
