cassandra-user mailing list archives

From "Coe, Robin" <>
Subject RE: Re: bandwidth limiting Cassandra's replication and access control
Date Thu, 12 Nov 2009 15:28:29 GMT
I'm not sure JAAS is the way to go when implementing a performant 
authentication/authorization service.  This is what threw me off in the first place.

JAAS is a framework that allows for sequential authentication using multiple login modules.
As each login module authenticates, it passes control to the next module in a chain.  You
don't need multiple login modules, but I point this out to show that JAAS is built for
functionality, not performance.

As each module authenticates, you create Principals of a custom type.  If you need a specific
Principal, you have to retrieve it with a cast.  There are other aspects that effectively
make the JAAS API less performant than a custom solution.
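
For reference, here is a minimal sketch of what the JAAS flow looks like from the caller's
side.  The "Cassandra" configuration entry name and the CassandraPrincipal type are made-up
names for illustration only, not an existing API:

    import java.security.Principal;
    import javax.security.auth.Subject;
    import javax.security.auth.callback.Callback;
    import javax.security.auth.callback.CallbackHandler;
    import javax.security.auth.callback.NameCallback;
    import javax.security.auth.callback.PasswordCallback;
    import javax.security.auth.callback.UnsupportedCallbackException;
    import javax.security.auth.login.LoginContext;
    import javax.security.auth.login.LoginException;

    public class JaasSketch {
        public static Subject authenticate(final String user, final char[] pass)
                throws LoginException {
            CallbackHandler handler = new CallbackHandler() {
                public void handle(Callback[] callbacks) throws UnsupportedCallbackException {
                    for (Callback cb : callbacks) {
                        if (cb instanceof NameCallback)          ((NameCallback) cb).setName(user);
                        else if (cb instanceof PasswordCallback) ((PasswordCallback) cb).setPassword(pass);
                        else throw new UnsupportedCallbackException(cb);
                    }
                }
            };

            // Runs every login module configured under the "Cassandra" entry, in order.
            LoginContext ctx = new LoginContext("Cassandra", handler);
            ctx.login();
            Subject subject = ctx.getSubject();

            // Pulling out a specific Principal type means iterating and casting.
            for (Principal p : subject.getPrincipals()) {
                if (p instanceof CassandraPrincipal) {            // hypothetical custom Principal
                    CassandraPrincipal cp = (CassandraPrincipal) p;
                    // ... use cp's attributes for authorization decisions
                }
            }
            return subject;
        }
    }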

On top of JAAS is the LDAP integration, probably done with JNDI.  Again, not the most performant option.
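
For context, this is roughly what a JNDI bind check looks like; the server URL and DN layout
below are placeholders.  The point is that every check costs a network round trip to the
directory server:

    import java.util.Hashtable;
    import javax.naming.Context;
    import javax.naming.NamingException;
    import javax.naming.directory.DirContext;
    import javax.naming.directory.InitialDirContext;

    public class LdapBindCheck {
        // Simple LDAP bind as the user; each call goes out to the directory.
        public static boolean bind(String user, String password) {
            Hashtable<String, String> env = new Hashtable<String, String>();
            env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
            env.put(Context.PROVIDER_URL, "ldap://ldap.example.com:389");      // example URL
            env.put(Context.SECURITY_AUTHENTICATION, "simple");
            env.put(Context.SECURITY_PRINCIPAL,
                    "uid=" + user + ",ou=people,dc=example,dc=com");           // example DN layout
            env.put(Context.SECURITY_CREDENTIALS, password);
            try {
                DirContext ctx = new InitialDirContext(env);
                ctx.close();
                return true;                 // bind succeeded
            } catch (NamingException e) {
                return false;                // bad credentials or directory unreachable
            }
        }
    }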

So my concern, and it has been since this discussion started, is that Cassandra should not
be performing this work.  I suggest that the Thrift connection be opened with credentials
passed in, which Cassandra authenticates against.  Even this overhead is not something I would
want to incur on every connection, so I would use a connection pool, with the connections
pre-authenticated to a single account that's appropriate for my application.
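
A rough sketch of that pooling idea follows.  The class and package names are the
Thrift-generated ones and may differ by version, and the login call is commented out because
that credential API is exactly what is being discussed:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.thrift.protocol.TBinaryProtocol;
    import org.apache.thrift.transport.TSocket;
    import org.apache.thrift.transport.TTransport;

    // Each connection is opened and authenticated once, to a single application
    // account, so nothing on the per-request path ever has to touch LDAP.
    public class PreAuthenticatedPool {
        private final BlockingQueue<Cassandra.Client> pool;

        public PreAuthenticatedPool(String host, int port, int size) throws Exception {
            pool = new ArrayBlockingQueue<Cassandra.Client>(size);
            for (int i = 0; i < size; i++) {
                TTransport transport = new TSocket(host, port);
                transport.open();
                Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
                // client.login(appCredentials);   // hypothetical: whatever credential call
                //                                 // the Thrift interface ends up exposing
                pool.put(client);
            }
        }

        public Cassandra.Client borrow() throws InterruptedException { return pool.take(); }
        public void restore(Cassandra.Client client) throws InterruptedException { pool.put(client); }
    }

On the request path the application just calls borrow() and restore(); authentication cost is
paid once at startup.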

Basically, the authentication on the Cassandra side needs to be lightning fast, and performing
LDAP lookups from Cassandra as each Thrift socket is opened will definitely impact performance.
I suggest that a decent model to follow is what relational systems do: they have their own accounts
that can be used to authenticate.  What we need is a datastore in Cassandra for secure password
storage, possibly with a one-way hash.  That way, we can send the password over an unencrypted
connection.
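
To make the stored-hash idea concrete, here is a sketch of salted one-way hashing with the
JDK's MessageDigest.  How the (salt, digest) pair is laid out in a Cassandra column family is
left open:

    import java.security.MessageDigest;
    import java.security.SecureRandom;

    public class PasswordHashes {
        // Cassandra would store only (salt, digest); the clear-text password is never kept.
        public static byte[] newSalt() {
            byte[] salt = new byte[16];
            new SecureRandom().nextBytes(salt);
            return salt;
        }

        public static byte[] digest(String password, byte[] salt) throws Exception {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            md.update(salt);                                   // salt defeats precomputed tables
            return md.digest(password.getBytes("UTF-8"));
        }

        // Verification: re-hash the attempt with the stored salt and compare digests.
        public static boolean matches(String attempt, byte[] salt, byte[] stored) throws Exception {
            return MessageDigest.isEqual(digest(attempt, salt), stored);
        }
    }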


-----Original Message-----
From: news [] On Behalf Of Ted Zlatanov
Sent: November 12, 2009 9:59 AM
Subject: Re: bandwidth limiting Cassandra's replication and access control

On Thu, 12 Nov 2009 12:40:05 +1100 Ian Holsman <> wrote: 

IH> most places I've seen don't use DB auth anywhere. there is a common
IH> login, stored in a property file, sometimes stored in an internally
IH> world-readable SVN repo.

In my current industry (financials) this is not acceptable.  It puts
money and jobs at risk to open access this way.

IH> they usually use network ACLs to restrict access to good hosts (jump
IH> hosts). network ACLs have been tested for decades and they work.
IH> implementing your own auth is just asking for problems. It's too hard
IH> to do properly, and will probably never work well with the enterprise's
IH> existing auth systems.

Layers of security are always a good idea (any firewall is just a part
of good security design, and by itself only increases complacency).  I
should mention I've been a sysadmin and network admin for many years
besides doing programming.

No one is suggesting to implement our own authentication.  We're going
to use existing mechanisms, namely what JAAS supports (LDAP, NIS,
etc.).  We're creating a specific authorization mechanism because it
makes sense for Cassandra, but we're again using JAAS to do that.
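
As a rough sketch (not a committed design), wiring JAAS to an existing directory can be a
single configuration entry using the JDK's stock LdapLoginModule; the URL and DN pattern here
are placeholders:

    import java.util.HashMap;
    import java.util.Map;
    import javax.security.auth.login.AppConfigurationEntry;
    import javax.security.auth.login.Configuration;

    public class LdapJaasConfig extends Configuration {
        public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
            Map<String, String> opts = new HashMap<String, String>();
            opts.put("userProvider", "ldap://ldap.example.com/ou=people,dc=example,dc=com");
            opts.put("authIdentity", "uid={USERNAME},ou=people,dc=example,dc=com");
            return new AppConfigurationEntry[] {
                new AppConfigurationEntry("com.sun.security.auth.module.LdapLoginModule",
                                          AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                                          opts)
            };
        }
        public void refresh() { }
    }

Installing it is one call: Configuration.setConfiguration(new LdapJaasConfig());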

IH> If you have sensitive data being stored, ENCRYPT it, or use a 1-way
IH> hash instead of storing it.  Ideally with a user-supplied key which
IH> is not stored anywhere on disk.

This is not feasible in many cases.  Encryption is slow and very hard to
implement properly.  One-way hashes lose the original content,
obviously.  User-supplied keys require interactivity at least at some
point, which is annoying and makes reliable operation harder to
achieve.  Fast access to the data is very important and my proposal
(initial login followed by an auth token passed around) is a decent
solution to these concerns.
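
As a rough illustration of the login-then-token idea, a minimal HMAC-signed token could look
something like the following; the token format and names are made up, not a spec:

    import javax.crypto.Mac;
    import javax.crypto.spec.SecretKeySpec;

    public class AuthTokens {
        // Token = "user|expiryMillis|hex(HMAC-SHA256(user|expiryMillis))".
        // Issued once after a successful login; later requests are verified with
        // a single HMAC computation instead of another directory lookup.
        public static String issue(String user, long expiresAt, byte[] serverKey) throws Exception {
            String payload = user + "|" + expiresAt;
            return payload + "|" + hex(hmac(payload, serverKey));
        }

        public static boolean verify(String token, byte[] serverKey) throws Exception {
            int cut = token.lastIndexOf('|');
            if (cut < 0) return false;
            String payload = token.substring(0, cut);
            long expiresAt = Long.parseLong(payload.substring(payload.indexOf('|') + 1));
            if (expiresAt < System.currentTimeMillis()) return false;      // expired
            return hex(hmac(payload, serverKey)).equals(token.substring(cut + 1));
        }

        private static byte[] hmac(String payload, byte[] key) throws Exception {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            return mac.doFinal(payload.getBytes("UTF-8"));
        }

        private static String hex(byte[] bytes) {
            StringBuilder sb = new StringBuilder();
            for (byte b : bytes) sb.append(String.format("%02x", b));
            return sb.toString();
        }
    }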

IH> sadly DBAs are people too, and it is pathetically easy for them to
IH> get all the data from a DB dump.

Securing backups is, fortunately, much easier to address on the server
side because it deals with static data.

