tomcat-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Costin Manolache <cos...@eng.sun.com>
Subject Re: SimpleMapper2
Date Thu, 29 Jun 2000 16:38:17 GMT
> (*) About Craig McClanahan reply: I had no time to look the SimpleMapper1 code after I
read your email, but I think it isn't matching that way now. I agree about matching the largest
context path altough I'm not sure the servlet spec says something about that. See my analysis
of 1).

I think SimpleMapper1 does match the right way ( at least it does the "longest" match )

> I will start with an analysis for 1):
>
>     SimpleMapper1 will try to match the requestURI against the big hash. Because the
big hash could be considered not being so big to scare that hash function, I think this would
essentially be a hash function evaluation plus a string comparison. This could result in a
match (the path info is probably short or inexistent so we will have a match in a few steps)
but we don't know if we already have the largest context path. A situation in which there
is a number of contexts with only 1 path element, and one or two contexts with two or three
path elements doesn't seem very uncommon. So we will probably have a one path element context
at this point and we will have to continue with the search just to be sure we wouldn't have
had a match with a larger context. Then, the last path element of the requestURI is truncated
and we restart the process. We will have to continue the search until we have a requestURI
that is as large as the largest context path we already found. In most cases this will be
something like n-1 hash function evaluations plus string comparisons. The average length of
the strings will be something like length(requestURI)/2. This affects the hash function calculation
and the string comparison.

I don't think so - we know we have the largest context path as soon as we have the first match
!
The key of the hashtable is ContextPath + ServletPath.
A certain path /my/ctx/path/servlet/path ( with /my/ctx/path == context path ) can only be
mapped to
a certain combination context/servlet - and that's known _before_ mapping, when the structure
is created.

That mean you pre-compute longest context path / longest prefix path ( which is the result
of the match ).

For a request with no PathInfo you'll do exactly 1 hashtable lookup.
If you have PathInfo you'll have N lookups, with N==number of components of the
PathInfo.

( if the code is different - it's a bug, at least that was the goal in SimpleMapper1)

> I will continue with 2):
>
>     Suppose we know c, the largest context path length in this container (this is a cheap
number). We extract the first path element of the request URI, then we match it against the
contexts hash. Consider the same situation that before: there is a number of contexts with
only 1 path element, and one or two contexts with two or three path elements. If we have a
first match, we will extract the second path element of the request URI and concatenate it
to the first one (or something equivalent). We will try to match this new path. If we fail,
the first was the right context. If we success, it is likely that we have a context path as
large as c (c is 2 or 3, and I think I'm being prudent here), so we can stop the search. 
Then, we will try to match the remaining requestURI against the found context's servlet mappings
hash. We proceed in the same way as in 1). Again, the path info is probably short or inexistent
so we will have a match in a few steps and we could stop then because we won't have the context
path problem. From this I would say that we are doing less comparisons with shorter strings
(this means cheaper hash functions evaluations and string comparisons). I think that the hash
sizes aren't relevant because the same in 1) than in 2) they aren't big enough to scare the
String hash function and it's very likely that we will have a one-string bucket after its
application (it's only the "idea", I'm not depending in hashes being open or closed) in both
situations.

First, what you describe as context matching is in fact a Tree search.

The problem is that you'll do extra hashtable lookups to match the context - (1) has only
one lookup if no pathinfo is present.

BTW, a nice optimization in SimpleMapper1 is that the result is cached - so first it
looks up in a match  mapCache, so even if you have path info you still have 1
lookup after the first match.  ( and of course, for a large site we need to implement
a LRU and remove mapping entries that are not used).

LRU and real caching is one of my biggest goals for next release of tomcat
( not only in the mapping cache, but also removing unused servlets, saving on
disk whatever can be saved to free memory, etc)

Costin




Mime
View raw message