tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Konstantin Preißer <kpreis...@apache.org>
Subject RE: Special requirements on session id generator
Date Fri, 14 Feb 2014 16:19:32 GMT
Hi Rainer & Mark,

> -----Original Message-----
> From: Rainer Jung [mailto:rainer.jung@kippdata.de]
> Sent: Friday, February 14, 2014 4:32 PM
> To: Tomcat Developers List
> Subject: Re: Special requirements on session id generator
> 
> On 14.02.2014 14:27, Mark Thomas wrote:
> > On 14/02/2014 13:15, Rainer Jung wrote:
> >> I ran into a special requirement for the session id generator
> >> (uniqueness of session id for a longer time interval). While I think
> >> that the requirement needed isn't a useful general purpose extension, I
> >> stumbled over the fact, that the SessionIdGenerator class is not
> pluggable.
> >>
> >> Would it make sense to introduce a new interface for the session Id
> >> generation, probably including setJvmRoute(), setSessionIdLength() and
> >> generateSessionId(), letting the current implementation be the base
> >> implementation, probably switching getRandomBytes() to protected, and
> >> allowing to set the implementation class in ManagerBase - or the Manager
> >> interface (trunk)?
> >>
> >> I haven't worked it out in detail, but it looks easy to do and useful.
> >>
> >> WDYT?
> >
> > I'm struggling to understand the use case. Are you saying the current
> > implementation generates collisions?
> 
> Yes. But only over time, not concurrently. So some session ID used some
> time were again generated on later days. The rate was low, about one out
> of 100.000 IDs were generated again during the next 30 days. I don't see
> a problem here per se, but the application required the IDs to be unique
> over an extended period of time.
> 
> > That would be bad given that
> > SecureRandom is being used.
> 
> Not sure, whether the above rate is totally unexpected. It could have
> used /dev/urandom, so might depend on total system behavior.

I don't think that this is unexpected, even with a "secure" Random generator. My understanding
is that a secure random generator is guaranteed to produce non-predictable output (so it is
OK for security purposes like generating Session IDs), but that shouldn't mean anything about
the "randomness" of the generated bytes, as e.g. the Javadoc says, most implementations are
still pseudo-random, but use a true random seed.

However, even with a true random generator, one can expect to get collisions over time, even
with a 16 byte number (Session-ID). I think this is an application of the birthday problem
/ birthday paradox:
If you chose 23 random people, then the likehood that two of them have the same birthday is
> 50 % (when not considering the birth year).

The Wikipedia article also talks about the cast as a collision problem [2]:
"The birthday problem can be generalized as follows: given n random integers drawn from a
discrete uniform distribution with range [1,d], what is the probability p(n;d) that at least
two numbers are the same? (d=365 gives the usual birthday problem.)"

In the case of a 16 byte session id, "d" would be 256^16 and "n" would be the number of generated
Session-IDs which one wants to monitor for collisions (e.g. if the application produces 10
million session-IDs over 30 days, one could take 10 million for "n" to get the likehood of
a collision of the session ID in 30 days).


> > Ignoring the previous question, why can't the requirement be met with a
> > custom implementation of SecureRandom?
> 
> Just because I haven't thought of that :) Will check. One thing that
> comes to my mind is that the session id generator encodes the random
> bytes. Making IDs more unique for an extended period of time will end up
> replacing some of the random bytes by other data. That limits the
> randomness. I was hoping to compensate for that by using an extended
> setof characters to encode the ID instead of simply hex like it is now.
> So at the end, I can squeeze the additional info plus enough random
> bytes in it. AFAIR that changed encoding would mean I have to change the
> session id generator.

>From my understanding, if you don't watch for collisions, then the number of possible
values for a session-ID (consisting of 16 random bytes) would be 256^16 at any time. However,
if you would like to prevent collisions, then you would need to memorize each previously generated
session-ID, and remove them from the pool of available session-IDs that can be generated.
E.g., for the first Session-ID, the amount of possible values is 256^16, for the second it's
(256^16-1), for the third it's (256^16-2) and so on, until you include some older session-IDs
back into the pool of available session-IDs.

Can you explain what you mean with "using an extended setof characters to encode the ID instead
of simply hex"? AFAIK, a Random Number generator produces random bytes (as the simplest random
value is a bit, so it can produce a series of 2^n random bits). Using hex characters is just
another way to describe those generated bytes.

A simple way that I would use to prevent collisions between a number of generated session-IDs
would be to remember the already generated session-ID, and when a new one is generated that
is the same as an already generated session-ID, just create a new one until it is a really
new one.
That does not quite fit into the mathematical way of generating random numbers without "putting
it back" (I don't know the correct English term for this), as you would have to reduce the
number of possible Session-ID values that will be generated by the RNG, so that instead of
256^16, you can only get 256^16-2000 different session-IDs where the already generated IDs
are not included, but I would think this is far more expensive than just re-generating a session-ID
if the previous was a collision.


Regards,
Konstantin Preißer

> 
> > If the requirement is to be certain that a duplicate session ID is not
> > generated I'd use a custom Manager and check the returned ID against
> > those currently in use request a new one in the highly unlikely event of
> > a duplicate being returned.
> 
> That's already done by our StandardManager. It is not about the ones
> currently in use but about ones that had been used during the last say
> 30 days. There's a database I could check against, but since very few
> "duplicates" seem acceptable I prefer to reduce the number of duplicates
> by adding time information to the custom session id shrinking the window
> of possible duplicates.
> 
> Regards,
> 
> Rainer
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org


[1] http://en.wikipedia.org/wiki/Birthday_problem
[2] http://en.wikipedia.org/wiki/Birthday_problem#Cast_as_a_collision_problem


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Mime
View raw message