perl-modperl mailing list archives

From André Warnier ...@ice-sa.com>
Subject Re: random token re-used in subsequent requests
Date Tue, 17 May 2016 13:10:37 GMT
On 17.05.2016 14:11, Vincent Veyron wrote:
> On Tue, 17 May 2016 10:16:43 +0200
> André Warnier <aw@ice-sa.com> wrote:
>>
>> I don't see above any significant difference in configuration between the servers, apart
>> from the fact that the "faulty" server runs a 64-bit version of perl.
>
> Sorry: slightly digressive rant about the fact that every time I compare my configs,
> I find some subtle differences. Should be getting into config management tools, but that
> takes time too.
>
>>
>> Now I also found this :
>>     http://rabexc.org/posts/randomizing-should-be-easy-right-oh
>>
>> I am not sure that I really understand this all the way down, but would this not be a
>> suspect in a case where the behaviour seems different between one 64-bit machine and a
>> bunch of 32-bit ones?
>
> Nope; same results on both types when running the script
>
>>
>> This being said, it still looks to me as if the current code is flawed on *all* machines,
>> and *will* repeat keys quite often. It just depends again on the exact sequence of
>> requests hitting a specific Apache, and the other parameters I mentioned before.
>> I still believe that the fact that it does not *seem* to happen, is just due to the
>> inherent randomness of these other factors on the production machines.
>>
>
> Well, I already posted a test with ab and 12 000 requests, so not sure about the 'quite
> often' part?
>
> This is on the faulty one :
>
> xxxx@arsene:~$ perl -le '%h=();for (1..10_000_000) {my $session_id = join "", map +(0..9,"a".."z","A".."Z")[rand(10+26*2)], 1..32;$h{$session_id}=1};$v=keys %h; print $v'
> 10000000
>
>

Yes, but this is *one* process. Within a single process, successive calls to rand()
return different values, so that process generates different keys.
But if n different processes all started with the same initial seed, they would all
generate the *same* sequence of rand() values, and thus the same sequence of keys.
And that is what I am saying: each of your Apache prefork children is a separate
process, but they all start with the same random seed.
So they will all, ultimately, generate the same sequence of keys (but not necessarily at
the same time).
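
To make the argument above concrete, here is a minimal sketch. It is written in Python, not Perl, and it does not involve Apache or mod_perl at all: two independent generator objects stand in for two child processes, and the seed value 12345 is a made-up placeholder for whatever seed the children would share. It only illustrates the claim that identical seeds give identical key sequences.

```python
import random
import string

# Same 62-character alphabet as the one-liner quoted above: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def make_key(rng, length=32):
    """Build one key from `length` draws of the given generator."""
    return "".join(ALPHABET[int(rng.random() * len(ALPHABET))] for _ in range(length))


SHARED_SEED = 12345  # hypothetical value: stands for the seed every child starts with

# Two generators with the same seed model two children with the same seed.
child_a = random.Random(SHARED_SEED)
child_b = random.Random(SHARED_SEED)

keys_a = [make_key(child_a) for _ in range(3)]
keys_b = [make_key(child_b) for _ in range(3)]

# Within one "process" the keys differ from each other ...
print("distinct keys in child A:", len(set(keys_a)))
# ... but the two "processes" produce exactly the same sequence.
print("child A and child B agree:", keys_a == keys_b)
```

This is also why the single-process test quoted above cannot detect the problem: it only ever looks at one sequence.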

Let's say that there are initially 5 Apache children, and that Apache never starts more
than 5.
Now you start bombarding the server with hundreds of requests, all of them triggering the
key-generation mechanism.
And let's say that it takes your module 1 s to respond to a request (just to make things
simpler below).

T0 :
Request #1 comes in.
The main Apache looks for a free child, and finds child #1.
It passes request #1 to child #1.
This child will be busy until T0 + 1s.

T0 + 0.1s :
Request #2 comes in.
The main Apache looks for a free child.
Child #1 is still busy, so it finds child #2.
It passes request #2 to child #2.
This child will be busy until T0 + 1.1s.

and so on...
(child #5 is now busy until T0 + 1.4s.)

Request #6 (at T0 + 0.5s):
Now all 5 children are busy, and Apache has to hold the request until
one child becomes free (*).
In this very simplified case, it will be child #1, at T0 + 1s.

At T0 + 1s, child #1 becomes free again. So child #1 now gets request #6, which is
only the *second* request that it processes.
So it generates *its* key #2 (which globally is generated key #6).

In this very simplified example, the first 5 keys generated globally by Apache will be
identical, because each child starts with the same seed, and they are all called neatly in
a regular sequence.
And then the next 5 keys will be identical, because for each child it is now the second
request.
And so on.
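
The simplified schedule above can be replayed as a small simulation. Again this is a Python sketch, not Perl and not real Apache: five generator objects with the same made-up seed (12345) stand in for the five children, and a plain round-robin loop stands in for the dispatching described above.

```python
import random
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def make_key(rng, length=32):
    """Build one key from `length` draws of the given generator."""
    return "".join(ALPHABET[int(rng.random() * len(ALPHABET))] for _ in range(length))


SHARED_SEED = 12345   # hypothetical value: the seed all children have in common
NUM_CHILDREN = 5

# One generator per simulated child, all seeded identically.
children = [random.Random(SHARED_SEED) for _ in range(NUM_CHILDREN)]

# Requests 1..10 are handed out in strict rotation: child 1..5, then child 1..5 again.
issued = []
for request in range(10):
    child = children[request % NUM_CHILDREN]
    issued.append(make_key(child))

first_round, second_round = issued[:5], issued[5:]
print("distinct keys in requests 1-5: ", len(set(first_round)))   # 1
print("distinct keys in requests 6-10:", len(set(second_round)))  # 1
print("distinct keys overall:         ", len(set(issued)))        # 2
```

Ten requests yield only two distinct keys: every child's key #1, then every child's key #2.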

But in a real situation :
- not all requests come in so neatly at regular intervals
(so for example child #1 may become free, before child #5 is even called once).
- not all requests take the same time to serve (other things happen on the server)
- not all requests generate a key (so if child #4 is called but does not call rand(), it
does not count; or if it calls rand() only 5 times instead of 32, that shifts its whole
sequence, and it will now start generating keys that are different from all the others)
- the number of children will vary over time. New ones will be created as needed, and
some older ones will die and be replaced by brand-new ones. Each time that happens, the
new child will start with key #1 again, because it just got a brand-new perl. While at the
same time, there may still be an older child alive, for which the next key is already
number 5000 in its own sequence.
- etc..
And this "disorder" will tend to be larger the more heavily loaded the server is.
So over any given period of time, each child will tend to be at a different stage in its
rand() calls. And the risk of the same key being returned to 2 clients at about the
same time is relatively low.
But if the keys are stored somewhere in a persistent way, you are increasing the risk 
greatly, because key #13 generated by a new child today, may conflict with the key #13 
generated by another child yesterday.

(*) or start an additional child
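
The "real situation" points can be sketched the same way. This is a Python simulation under stated assumptions, not Perl and not a measurement of any server: the children share a made-up seed (12345), requests are dispatched to a randomly chosen child (a second, separately seeded generator models the unpredictable arrival order), and a set stands in for persistent key storage.

```python
import random
import string

ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def make_key(rng, length=32):
    """Build one key from `length` draws of the given generator."""
    return "".join(ALPHABET[int(rng.random() * len(ALPHABET))] for _ in range(length))


SHARED_SEED = 12345   # hypothetical value: the seed all children have in common
NUM_CHILDREN = 5
NUM_REQUESTS = 1000

dispatcher = random.Random(2016)  # hypothetical: models irregular request dispatch
children = [random.Random(SHARED_SEED) for _ in range(NUM_CHILDREN)]
served = [0] * NUM_CHILDREN       # how many keys each child has generated

stored = set()                    # stands in for persistent key storage
collisions = 0
for _ in range(NUM_REQUESTS):
    i = dispatcher.randrange(NUM_CHILDREN)
    key = make_key(children[i])
    served[i] += 1
    if key in stored:             # this key was already handed out earlier
        collisions += 1
    stored.add(key)

print("requests:", NUM_REQUESTS, "distinct keys stored:", len(stored),
      "collisions:", collisions)

# A replacement child starts over at key #1, which is already in the store.
new_child = random.Random(SHARED_SEED)
print("new child's first key already stored:", make_key(new_child) in stored)

# A child that made one extra rand() call is shifted and no longer matches.
skewed = random.Random(SHARED_SEED)
skewed.random()                   # the single extra draw
print("shifted child's key already stored:", make_key(skewed) in stored)
```

In this model the number of distinct keys equals the number of requests served by the busiest child, so most stored keys are duplicates even though two clients rarely receive the same key at the same moment.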

