couchdb-dev mailing list archives

From Alexander Shorin <kxe...@gmail.com>
Subject [IDEA] Tracking CouchDB users sessions
Date Tue, 03 Mar 2015 14:07:55 GMT
Hi devs!

Yesterday on IRC the question came up again of how to find out which
user sessions are active on CouchDB. We all know that CouchDB knows
nothing about that, not even how many active cookies are out there. So
here is an idea for how to fix that.

Problem
=======

CouchDB has many ways to authenticate users: basic auth, cookies, proxy
auth, OAuth, Facebook (via plugin), etc. Only cookie auth can hold
some sort of state, but in any case we cannot rely on information
that comes from outside, only on what CouchDB itself can trust. So we
need to hold the session list internally.

Proposal
========

To track user sessions we need some sort of table with the fields
user_id and last_activity_time. However, a session must expire after
some time, so this table somehow has to be cleaned of records where
last_activity_time + timeout < now. Using ets tables for that task is
hard. However, we could use Erlang processes instead.

When an authenticated user sends a first request to CouchDB, a
supervisor checks whether the user_id exists in its private ets table.
If it does not, a new Erlang process is spawned and its PID is recorded
against the user_id in the ets table. This process (the session) holds
three values as state: userCtx, timestamp (last_activity_time) and
timeout.

When the authenticated user sends another request to CouchDB, the
supervisor checks whether the user_id exists in the table. If it does,
it gets the process PID and updates that process's timestamp.

Meanwhile the process is blocked in a receive ... after timeout loop.
If no update comes from the supervisor, it dies on timeout, which means
the session has expired. When the child dies, the supervisor removes
the related record from its ets table.

When the user explicitly logs out, the session process dies as well,
before it reaches the timeout.

CouchDB would provide an admin-only resource, something like
/_active_sessions, which collects userCtx and timestamp from the
session processes that are still alive.
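
A minimal sketch of the lifecycle described above, written in Python
rather than Erlang only to keep it short and runnable. Threads stand in
for Erlang processes, a dict for the ets table, and
queue.Queue.get(timeout=...) for receive ... after. All class and
method names here (Session, SessionSupervisor, request, logout,
active_sessions) are made up for illustration and are not CouchDB API.

```python
import queue
import threading
import time


class Session(threading.Thread):
    """Stand-in for the proposed per-user session process."""

    def __init__(self, user_ctx, timeout, on_exit):
        super().__init__(daemon=True)
        self.user_ctx = user_ctx
        self.timeout = timeout
        self.timestamp = time.time()  # last_activity_time
        self.mailbox = queue.Queue()
        self.on_exit = on_exit

    def run(self):
        while True:
            try:
                # Rough analogue of `receive ... after Timeout`.
                msg = self.mailbox.get(timeout=self.timeout)
            except queue.Empty:
                break  # no update arrived in time: session expired
            if msg == "logout":
                break  # explicit logout ends the session early
            self.timestamp = time.time()  # "touch": refresh activity time
        self.on_exit(self)


class SessionSupervisor:
    """Stand-in for the supervisor that owns the user_id -> PID table."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.table = {}  # plays the role of the private ets table
        self.lock = threading.Lock()

    def request(self, user_ctx):
        """Called for every authenticated request."""
        user_id = user_ctx["name"]
        with self.lock:
            session = self.table.get(user_id)
            if session is None:
                session = Session(user_ctx, self.timeout, self._remove)
                self.table[user_id] = session
                session.start()
            else:
                session.mailbox.put("touch")

    def logout(self, user_id):
        with self.lock:
            session = self.table.get(user_id)
        if session is not None:
            session.mailbox.put("logout")

    def _remove(self, session):
        # Runs when a session ends; drop its record from the table.
        user_id = session.user_ctx["name"]
        with self.lock:
            if self.table.get(user_id) is session:
                del self.table[user_id]

    def active_sessions(self):
        """What an /_active_sessions-like resource could return."""
        with self.lock:
            return [
                {"userCtx": s.user_ctx, "timestamp": s.timestamp}
                for s in self.table.values()
            ]
```

One gap in this sketch: a "touch" that arrives just as a session times
out is lost, and the next request simply starts a fresh session.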

Caveats
=======

Such a session list isn't persistent. If the supervisor dies, all its
children (sessions) die too. Well, that is not a big problem, if it is
one at all. We're not going to make it solid anyway, because we cannot
promise that this information is precise given the stateless auth
methods.

A session process could die while a client with an authenticated user
still holds an open connection (for example, by listening to the
changes feed in continuous mode). Here we need to think a little more
about how to link a continuous connection with the session's Erlang
process, so that it doesn't die before the connection gets closed.

If CouchDB serves a billion users, this solution might not scale well.
However, if that is your case, you are probably riding the wave already
and this is not the biggest problem you care about (:

Work for CouchDB 2.0
====================

For clustered CouchDB this feature is extended so that the
/_active_sessions resource additionally returns the name of the node
where the user session was registered. There could be multiple nodes.
Some aggregation might be needed here to reduce duplication.
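
One possible shape for that aggregation, as a small Python sketch. The
input format (a mapping of node name to the list of sessions that node
reports) and the function name aggregate_sessions are assumptions made
for illustration; nothing like this exists in CouchDB today.

```python
def aggregate_sessions(per_node):
    """Merge per-node session lists into one entry per user.

    `per_node` maps a node name to the sessions that node reports,
    each item being {"name": user_id, "timestamp": last_activity_time}.
    """
    merged = {}
    for node, sessions in per_node.items():
        for s in sessions:
            entry = merged.setdefault(
                s["name"],
                {"name": s["name"], "nodes": [], "timestamp": s["timestamp"]},
            )
            entry["nodes"].append(node)
            # Keep the most recent activity seen on any node.
            entry["timestamp"] = max(entry["timestamp"], s["timestamp"])
    return sorted(merged.values(), key=lambda e: e["name"])
```

Each user then shows up once, with the list of nodes holding a session
and the latest activity time across them.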

Possible Extension
==================

We could also make a session more sensitive to the source from which
the user authenticated (IP address, HTTP headers, etc.), and warn about
ETOOMANY different authentication locations or limit their number.
That could greatly improve security.
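
A rough Python sketch of the "limit the number of locations" variant.
Treating a source as an (IP address, User-Agent) pair is just one
assumption about what "source" could mean, and SourceTracker and
register are hypothetical names.

```python
class SourceTracker:
    """Tracks the distinct sources each user has authenticated from."""

    def __init__(self, max_sources):
        self.max_sources = max_sources
        self.sources = {}  # user_id -> set of (ip, user_agent)

    def register(self, user_id, ip, user_agent):
        """Record a source; return False if it would exceed the limit."""
        seen = self.sources.setdefault(user_id, set())
        source = (ip, user_agent)
        if source in seen:
            return True   # already known source
        if len(seen) >= self.max_sources:
            return False  # the ETOOMANY case: warn or reject
        seen.add(source)
        return True
```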

We could go further and not just record the fact that a user has
logged in, but also hold the last N actions they made: which documents
they read, which they updated, etc. This information would also be
stored in the session process. Such a feature would greatly help in
auditing the current server state.
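
Holding only the last N actions is cheap with a bounded queue. A small
Python sketch, assuming an action is recorded as a (method, path)
pair; ActionLog and its methods are illustrative names only.

```python
from collections import deque


class ActionLog:
    """Keeps only the most recent `max_actions` actions of a session."""

    def __init__(self, max_actions):
        # A deque with maxlen drops the oldest item automatically.
        self.actions = deque(maxlen=max_actions)

    def record(self, method, path):
        self.actions.append((method, path))

    def recent(self):
        """Return the retained actions, oldest first."""
        return list(self.actions)
```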

And so on...

Epilog
======

That's how I see it. Thoughts? Criticism?

--
,,,^..^,,,
