accumulo-notifications mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1833) MultiTableBatchWriterImpl.getBatchWriter() is not performant for multiple threads
Date Fri, 01 Nov 2013 23:58:18 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13811789#comment-13811789
] 

Josh Elser commented on ACCUMULO-1833:
--------------------------------------

This has been a fun exercise for me. Short answer -- just cache the BatchWriters that MTBW
gives you.

Long answer: you can work around a bit of the ZooCache code, but it defeats the purpose of
the class a bit. For the sake of the argument, we can go down this path. The first thing you'll
run into with multiple threads using the same MultiTableBatchWriter is that you'll spend a
large amount of time (relative to the underlying ZooCache methods) contending over the synchronized
methods (get, notably) on ZooCache.

1. o.a.a.f.z.ZooCache#getInstance(String, int): With ThreadLocals, you can get multiple instances
of ZooCache to avoid the lock contention on the single instance of ZooCache by having an instance
of ZooCache per thread. This removes the thread contention on ZooCache.get in Tables.getMap(Instance,
boolean).

2. Next, you'll find that the MultiTableBatchWriterImpl keeps a handle to the original Instance
from the Connector the MTBWI was created from. ZooKeeperInstance has, you guessed it, an instance
of ZooCache. Inside of Tables.getMap(Instance, boolean), you'll see that there are also calls
to ZooUtil.getRoot(Instance). So, again, you have a single instance of ZooCache being shared
across multiple threads and you have the same synchronization penalty. Again, you can get
around this with a bit of funny-business in MTBWI by making a new Instance in a ThreadLocal
so each thread has its own Instance and thus ZooCache.

3. Next, you'll get down into ZooCache.retry(ZooRunnable)! This will trace down to actually
get the o.a.z.ZooKeeper object. This then calls a static synchronized method (noooooo) which
introduces the same thread contention. Changing o.a.a.f.z.ZooSession's 'sessions' member from
a HashMap to a ConcurrentHashMap and removing the synchronized from the getSessions method
finally gets us full thread utilization.

!https://issues.apache.org/jira/secure/attachment/12611720/ZooKeeperThreadUtilization.png!

I haven't fully wrapped my head around #3 to guess as to whether or not it's actually safe
to do what I did, but it at least does what I expected it to :). Point still stands, a lot
of this introduces a fair bit of unintuitive complexity that probably isn't worth it. For
your case, cache the BatchWriters in your application. For the general case, if you have multiple
threads in the same JVM accessing the same Accumulo instance, it is likely a good idea to
create multiple ZooKeeperInstances as all objects created from the Connector from that Instance
likely share the same ZooCache instance. I would guess that it wasn't initially intended to
have multiple threads accessing the same Instance in high rapidity.

> MultiTableBatchWriterImpl.getBatchWriter() is not performant for multiple threads
> ---------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-1833
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1833
>             Project: Accumulo
>          Issue Type: Improvement
>    Affects Versions: 1.5.0, 1.6.0
>            Reporter: Chris McCubbin
>         Attachments: ACCUMULO-1833-test.patch, ZooKeeperThreadUtilization.png
>
>
> This issue comes from profiling our application. We have a MultiTableBatchWriter created
by normal means. I am attempting to write to it with multiple threads by doing things like
the following:
> {code}
> batchWriter.getBatchWriter(table).addMutations(mutations);
> {code}
> In my test with 4 threads writing to one table, this call is quite inefficient and results
in a large performance degradation over a single BatchWriter.
> I believe the culprit is the fact that the call is synchronized. Also there is the possibility
that the zookeeper call to Tables.getTableState on every call is negatively affecting performance:
> {code}
>   @Override
>   public synchronized BatchWriter getBatchWriter(String tableName) throws AccumuloException,
AccumuloSecurityException, TableNotFoundException {
>     ArgumentChecker.notNull(tableName);
>     String tableId = Tables.getNameToIdMap(instance).get(tableName);
>     if (tableId == null)
>       throw new TableNotFoundException(tableId, tableName, null);
>     
>     if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
>       throw new TableOfflineException(instance, tableId);
>     
>     BatchWriter tbw = tableWriters.get(tableId);
>     if (tbw == null) {
>       tbw = new TableBatchWriter(tableId);
>       tableWriters.put(tableId, tbw);
>     }
>     return tbw;
>   }
> {code}
> I recommend moving the synchronized block to happen only if the batchwriter is not present,
and also only checking if the table is online at that time:
> {code}
>   @Override
>   public BatchWriter getBatchWriter(String tableName) throws AccumuloException, AccumuloSecurityException,
TableNotFoundException {
>     ArgumentChecker.notNull(tableName);
>     String tableId = Tables.getNameToIdMap(instance).get(tableName);
>     if (tableId == null)
>       throw new TableNotFoundException(tableId, tableName, null);
>     BatchWriter tbw = tableWriters.get(tableId);
>     if (tbw == null) {
>       if (Tables.getTableState(instance, tableId) == TableState.OFFLINE)
>           throw new TableOfflineException(instance, tableId);
>       tbw = new TableBatchWriter(tableId);
>       synchronized(tableWriters){
>           //only create a new table writer if we haven't been beaten to it.
>           if (tableWriters.get(tableId) == null)      
>               tableWriters.put(tableId, tbw);
>       }
>     }
>     return tbw;
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.1#6144)

Mime
View raw message