accumulo-user mailing list archives

From Jeff Schwartz <abigbears...@gmail.com>
Subject Re: Query Services Layer Question
Date Wed, 18 Jun 2014 14:09:01 GMT
Hi Guys,

I have updated my code to use Apache commons-pool2 for connection pooling.
In this implementation, each Connector has its own ZooKeeperInstance...

Now my code looks like this:

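public void readTable(...) {
    Connector connector = null;
    try {
        // Borrow a pooled Connector instead of creating one per request.
        connector = accumuloConnectionPool.getConnector();
        Scanner scanner = connector.createScanner(tableName, auths);
        try {
            scanner.setRange(range);
            for (Map.Entry<Key,Value> entry : scanner) {
                ...
            }
        } finally {
            scanner.close();
        }
    } finally {
        // Return the Connector to the pool even if the scan throws.
        if (connector != null) {
            accumuloConnectionPool.releaseConnector(connector);
        }
    }
}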
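
For readers who want to see the borrow/release pattern in isolation, here is a minimal sketch of a fixed-size pool built only on the JDK. It is not the commons-pool2 implementation described above and it does not touch the Accumulo API: `FixedPool`, `borrow`, `release`, and `idleCount` are hypothetical names, and the type parameter `T` merely stands in for a Connector.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/**
 * Hypothetical fixed-size object pool (illustration only).
 * T stands in for an expensive client object such as a Connector.
 */
final class FixedPool<T> {
    private final BlockingQueue<T> idle;

    /** Creates all pooled objects up front; size must be at least 1. */
    FixedPool(int size, Supplier<T> factory) {
        idle = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            idle.add(factory.get());
        }
    }

    /**
     * Waits up to timeoutMillis for an idle object.
     * Returns null on timeout or if the calling thread is interrupted.
     */
    T borrow(long timeoutMillis) {
        try {
            return idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return null;
        }
    }

    /** Returns a borrowed object; null is ignored so this is safe in finally. */
    void release(T obj) {
        if (obj != null) {
            idle.offer(obj);
        }
    }

    /** Number of objects currently available to borrow. */
    int idleCount() {
        return idle.size();
    }
}
```

A caller would borrow inside `try` and release inside `finally`, exactly as readTable does above. The pool size is the knob Josh describes below: something relative to the concurrent load on the service, tuned by benchmarking.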

I've built a plugin for using Accumulo from the Play Framework (
www.playframework.org).  If the above implementation looks good, I'll be
happy to publish a blog article about it.

Josh - It was nice meeting you at the Accumulo Summit last week.

Sincerely,
Jeff Schwartz


On Mon, May 19, 2014 at 10:45 PM, Josh Elser <josh.elser@gmail.com> wrote:

> Hi Jeff,
>
> Not a rookie question at all. This is an area in the API where we know we
> could make the lifecycle more obvious. We have a ticket somewhere for it.
>
> If you're using a single user/password to connect to Accumulo (not using
> special accounts per your QSL client), there's no reason you can't reuse
> Connectors. The number of Connectors you want to cache is likely relative
> to the concurrent user load of your service.
>
> The fun part here is that each Connector retains a reference to the
> Instance which it uses internally. There are synchronized calls inside each
> ZooKeeperInstance which may start to degrade when you get above maybe 50
> concurrent threads accessing it (ballpark guess).
>
> You also do not want to create a new ZooKeeperInstance for every request
> as you're doing now as I believe it will cause you some issues in Java heap
> due to some nitty-gritty ZooKeeper details (ask if you're actually curious).
>
> In summary, definitely cache ZooKeeperInstances, but use some number
> relative to the number of users. Connectors can be cached too, but they
> share Instances under the hood. Using an HTTP benchmarking tool like JMeter
> with various client pool sizes should help you balance out these numbers.
>
> Hope this helps.
>
> - Josh
>
>
> On 5/19/14, 10:29 PM, Jeff Schwartz wrote:
>
>> Rookie question...  I've built a Query Service Layer (QSL) according to
>> the documentation in the Accumulo v1.6.0 User Manual.  My question is:
>> how often should I be getting a ZooKeeperInstance and a Connector to
>> Accumulo?  For example, here's some pseudocode for a typical service in
>> my QSL.
>>
>> public void readTable(...) {
>>      Instance instance =
>>          new ZooKeeperInstance(accumuloInstanceName, zooServers);
>>      Connector connector =
>>          instance.getConnector(username, passwordToken);
>>      Scanner scanner = connector.createScanner(tableName, auths);
>>      scanner.setRange(range);
>>      for (Map.Entry<Key,Value> entry : scanner) {
>>        ...
>>      }
>>      scanner.close();
>> }
>>
>> If I run these lines of code for every call in my RESTful service, I
>> suspect that generates a lot of extra connections to both ZooKeeper and
>> Accumulo.  Additionally, I assume that will have a negative impact on
>> performance.  Should I cache any Connectors or ZooKeeperInstances?
>>
>> Any suggestions or best practices would be greatly appreciated.
>>
>> Thanks in advance.
>>
>> Sincerely,
>> Jeff Schwartz
>>
>
