accumulo-user mailing list archives

From Eric Newton <>
Subject Re: maximum number of connectors
Date Tue, 01 Dec 2015 13:49:44 GMT
To be clear, Connector is just a pairing of username/authentication with an
instance.  There are no connections or other resources involved.  Sure,
there's some memory needed to remember those bits of information, but it's
just a few bytes (ok, like 1K).

Create a batch scanner, though, and there are threads, cached connections to
the tablet servers, etc. Now you've started to use some precious resources.

So, if you have 3M users, expect to load-balance their requests, not
because of Connector objects, but because of the requests they will make.

The "superuser approach" is common. Well, a "queryuser" who has access to
read some set of tables, but is limited using the appropriate
authorizations for a specific real-life person. For example, you have a
"doctor" user, who can read patient data. However, Dr. Smith needs the
authorization "eric.newton" to read my information. When a doctor makes a
request, the query infrastructure looks up their authorizations and applies
them to their request. The doctor tables are available to any doctor, but
they can only read my data if the system adds the authorizations for my
data.  But this is just management of authorizations, and not client

This does decrease security a bit: if your application of authorizations
(eric.newton) to a real-life person (Dr. Smith) is incorrect, those bugs
might allow unauthorized access.  If this is a concern, you can partition
your data to put highly sensitive data into a table that requires a more
restrictive user.  For example, you might put credit card information into
a table available only to a user with a need for that information: most
requests will be done by a different user to different tables.
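
As a rough illustration of the "queryuser" idea above, here is a minimal sketch of the lookup a query layer might do before each request. Everything in it (the AuthorizationResolver class, its grant and resolve methods, the "dr.smith" name) is hypothetical and not part of the Accumulo API. It assumes the per-person authorizations live in an in-memory map, and it intersects them with the query user's own authorizations so a bad mapping can never hand out more than the query user holds.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical query-layer helper; not part of the Accumulo API.
class AuthorizationResolver {
    // Labels held by the shared "queryuser" account (the upper bound).
    private final Set<String> queryUserAuths;
    // Real-life person -> labels that person may use. This is the mapping
    // that has to be correct for the approach to be safe.
    private final Map<String, Set<String>> personAuths = new HashMap<>();

    AuthorizationResolver(Set<String> queryUserAuths) {
        this.queryUserAuths = new HashSet<>(queryUserAuths);
    }

    void grant(String person, String label) {
        personAuths.computeIfAbsent(person, p -> new HashSet<>()).add(label);
    }

    // Labels to attach to one request: the person's labels, restricted to
    // what the query user itself holds. Unknown people get an empty set.
    Set<String> resolve(String person) {
        Set<String> result =
            new HashSet<>(personAuths.getOrDefault(person, Collections.emptySet()));
        result.retainAll(queryUserAuths);
        return result;
    }

    public static void main(String[] args) {
        AuthorizationResolver resolver = new AuthorizationResolver(
            new HashSet<>(Arrays.asList("eric.newton", "jane.doe")));
        resolver.grant("dr.smith", "eric.newton");
        System.out.println(resolver.resolve("dr.smith")); // [eric.newton]
        System.out.println(resolver.resolve("dr.jones")); // []
    }
}
```

The resolved set would then be wrapped in an Authorizations object and passed to the scanner created from the single queryuser Connector, so no per-person Connector is needed.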

Scanners, and more specifically, the client-side iterators they create, are
the resource hogs.  Everything else is bookkeeping.
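
Since the scans are what cost something, one way to protect the tablet servers is to cap how many run at once in the query layer, no matter how many users exist. The sketch below is only an assumption about how that might look: ScanGate and its methods are hypothetical names, and the "scan" is any piece of work passed in as a Supplier rather than a real Accumulo scanner.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical gate in the query layer; not part of the Accumulo API.
class ScanGate {
    private final Semaphore permits;

    ScanGate(int maxConcurrentScans) {
        this.permits = new Semaphore(maxConcurrentScans);
    }

    // Waits for a free permit, runs the scan, and always returns the permit,
    // even if the scan throws.
    <T> T run(Supplier<T> scan) {
        permits.acquireUninterruptibly();
        try {
            return scan.get();
        } finally {
            permits.release();
        }
    }

    // Number of scans that could start right now without waiting.
    int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        ScanGate gate = new ScanGate(2);
        String rows = gate.run(() -> "rows for dr.smith");
        System.out.println(rows);             // rows for dr.smith
        System.out.println(gate.available()); // 2
    }
}
```

The limit is sized for the cluster, not for the user count, which is the point: 3M users share the same small number of permits.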


On Tue, Dec 1, 2015 at 4:11 AM, mohit.kaushik <> wrote:

> Josh,
> If resources are a concern, would it be better to use the superuser
> approach: a single user with all authorizations assigned, using the
> scanner to provide per-user authorizations? Does it decrease the security
> level? How do the custom authenticator and authorizers help in this case,
> and how can I implement them if needed?
> Thanks
> Mohit Kaushik
>
> On 11/30/2015 08:59 PM, Josh Elser wrote:
>> Connector is tied to a specific user, so you're tied to a user for a given
>> instance.
>>
>> I'm not aware of any testing in that direction (lots of active
>> connectors). Connectors aren't particularly heavy; you could keep some
>> cache of recently used instances and recreate them after they are evicted
>> from the cache due to inactivity.
>>
>> The only fundamental limitation on concurrent Connector instances that I
>> can think of is at the RPC level. Eventually, the RPCs that the Connector
>> is making to Accumulo servers correlate to server-side resources, which
>> are finite. If you have some reasonable hardware, I don't think this is a
>> real concern.
>>
>> Would be curious to hear back how this works.
>>
>> mohit.kaushik wrote:
>>> I am creating a connector per user, as every user has a different
>>> set of authorizations. I want to know: is there any limit on creating
>>> Accumulo connectors? What is the maximum number of connectors that
>>> Accumulo can handle? For example, if my application will have 3M users,
>>> is it correct to create 3M connections for them, or is there a way to
>>> share connections among users who have different authorizations?
>>> Thanks
>>> Mohit Kaushik
