Return-Path:
Delivered-To: apmail-jakarta-tomcat-dev-archive@jakarta.apache.org
Received: (qmail 91447 invoked by uid 500); 30 Sep 2001 06:18:06 -0000
Mailing-List: contact tomcat-dev-help@jakarta.apache.org; run by ezmlm
Precedence: bulk
list-help:
list-unsubscribe:
list-post:
Reply-To: tomcat-dev@jakarta.apache.org
Delivered-To: mailing list tomcat-dev@jakarta.apache.org
Received: (qmail 91436 invoked from network); 30 Sep 2001 06:18:05 -0000
Date: Sun, 30 Sep 2001 01:23:24 -0700 (PDT)
From:
X-X-Sender:
To: , Bill Barker
Subject: Re: Volunteers for: - RE: TC 3.3: getRequestURI()
In-Reply-To: <008b01c14953$d45d44c0$5a66a8c0@wilshire.com>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N

On Sat, 29 Sep 2001, Bill Barker wrote:

> It seems that I must have been bad in a past life, since my Karma isn't high
> enough.:)
>
> I've added the code to re-encode the URL to DecodeInterceptor on my machine.
> If you want it right away, I can post a diff.

Hi,

Could you send the diff? I'll have to merge it with my changes anyway...
( I hope you found UEncoder and used it, because that's what I did ).

I am now thinking about how to encode the context path - which is more
difficult than I thought. The problem is of course that we don't know the
charset in many cases, and Context.getPath() returns the UTF version. If we
encode this - it may be inconsistent with the original request encoding.

So I'll try to count the '/' and return a substring of the uri - I can't
think of any better way.

Of course, I have no idea why the contextPath has to be encoded - poor
people using the contextPath as a key will have a bad surprise since you can
have multiple representations for the same context ( based on the charset of
the request ), but as usual we ( 8859_1 users ) are ok.
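
To make the '/'-counting idea concrete, here is a rough sketch in Java. It is only an illustration under assumptions: the class and method names are made up ( they are not Tomcat APIs ), the raw URI is assumed to still carry its original %xx escapes, and the context path is assumed to contain no encoded '/' ( %2F ) and the raw URI no '//' or '/./' segments.

```java
// Hypothetical helper, not part of Tomcat: recover the still-encoded context
// path from the raw request URI by counting path segments.
final class ContextPathSketch {

    /**
     * Returns the prefix of rawUri that covers as many '/'-separated segments
     * as decodedContextPath has. ASSUMPTION: both strings have the same
     * segment structure (no %2F in the context path, URI not yet normalized
     * differently from the context path).
     */
    static String encodedContextPath(String rawUri, String decodedContextPath) {
        // The root context has an empty path, so there is nothing to extract.
        if (decodedContextPath.isEmpty()) {
            return "";
        }
        // Count the '/' characters in the decoded context path.
        int slashes = 0;
        for (int i = 0; i < decodedContextPath.length(); i++) {
            if (decodedContextPath.charAt(i) == '/') {
                slashes++;
            }
        }
        // Walk the raw URI and cut just before the first '/' beyond that count,
        // or at the start of path parameters / the query string.
        int seen = 0;
        for (int i = 0; i < rawUri.length(); i++) {
            char c = rawUri.charAt(i);
            if (c == '?' || c == ';') {
                return rawUri.substring(0, i);
            }
            if (c == '/' && ++seen > slashes) {
                return rawUri.substring(0, i);
            }
        }
        // The request targets the context root itself.
        return rawUri;
    }

    public static void main(String[] args) {
        // The context is mapped at "/my app"; the client sent it as "/my%20app".
        System.out.println(encodedContextPath("/my%20app/servlet/x", "/my app"));
    }
}
```

The point of returning a substring instead of re-encoding is that the result always matches whatever encoding the client actually sent, so no guess about the charset is needed.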
I can leave the context path the way it is, as it makes more sense ( and 2.2
doesn't seem to require the context path to be encoded ) - and wait until
4.0 fixes that ( being consistent with 2.3 doesn't seem a good idea in this
case ). I'm not sure; I need to look deeper at the specs and impl.

Costin

> ----- Original Message -----
> From:
> To:
> Sent: Friday, September 28, 2001 11:17 AM
> Subject: Volunteers for: - RE: TC 3.3: getRequestURI()
>
>
> >
> > It seems most agree on using 'decoded' URI in mod_jk. Making the change
> > is not easy; there are a few places where we need to coordinate and make
> > sure we're on the same page.
> >
> > I don't think I can do this alone ( if it sounded like I volunteer to fix
> > it - well, I need help ).
> >
> > Problems:
> > - Someone with IIS must cut&paste the decoding stuff from Apache (
> > probably in jk/common ), make sure the uri sent is decoded ( so consistent
> > with Apache and NES ). That should happen in j-t and j-t-c ( on this
> > occasion we'll help Marc a bit :-)
> >
> > - One piece is to implement the java side of the decoding. I can do that
> > if nobody else wants to ( I have a few other bugs in the works, so I'll
> > probably do it tomorrow ).
> >
> > - I'll fix DecodeInterceptor to avoid double decoding ( I'm already fixing
> > the normalization for JNI ).
> >
> > - Someone should check 4.0. Strange, even if this is a 2.3 requirement I
> > didn't see any comment so far... Well, they have cool features and jars to
> > add, so I can do that if nobody else does.
> >
> > - Revert jk/apache to use uri, remove the encode call ( again, j-t and
> > j-t-c - one more week to do that, after that we'll be j-t-c only ). Henri
> > - could you do this and the next one ?
> >
> > - Build and make some jars available - so we can test.
> >
> > - Test.
> >
> > Yes, it's a long list - but at the end we might solve one of the trickiest
> > problems.
> >
> > Costin
> >
> >
> >
> > > >2. mod_jk will send the 'decoded' URI ( %xx replaced with the real
> > > >char ).
> > > >
> > > >On IIS - we need to decode the URI, Apache+NES - nothing to do.
> > > >On java side - we do a 'canonical' encoding in the facade. All
> > > >the code will operate on the decoded request ( this is what
> > > >Apache and NES are doing ). We also need to prevent DecodeInterceptor
> > > >from re-decoding the URIs from jk. ( that's trivial, just a flag )
> > > >
> > > >Benefits:
> > > >- consistency with Apache, all processing on decoded uris.
> > > >- easier to maintain ( java :-)
> > > >- important - servlets will get a consistently encoded uri,
> > > >thus preventing many security problems. With the current code
> > > >many tricks can be played ( see recent security problems in
> > > >tomcat ) using encoding - if we were vulnerable to that,
> > > >I suspect most servlet authors will be as well.
> > > >
> > > >Problems:
> > > >- a bit more work to do.
> > > >- the 'original' uri will not be preserved in any servers (
> > > >the first solution allows that for IIS and standalone ).
> > > >
> > > >
> > > >Your votes please, I'm ok with any of them ( with a slight
> > > >preference for 2 )
> > > >
> > > >Costin
> > > >
> > > >
> > > >On Thu, 27 Sep 2001, Larry Isaacs wrote:
> > > >
> > > >>
> > > >>
> > > >> > -----Original Message-----
> > > >> > From: cmanolache@yahoo.com [mailto:cmanolache@yahoo.com]
> > > >> > Sent: Thursday, September 27, 2001 3:10 AM
> > > >> > To: tomcat-dev@jakarta.apache.org
> > > >> > Subject: RE: TC 3.3: getRequestURI()
> > > >> >
> > > >> >
> > > >> > Given this is an important change - and something will be broken
> > > >> > regardless of what we do - I think we need to coordinate and make
> > > >> > sure we're doing it right.
> > > >> >
> > > >> > - First: Larry - what do you think ? We just had RC1, and we
> > > >> > already have a simple patch ( changing SessionId to hide the
> > > >> > problem ).
> > > >> > My proposal is simple to implement ( just encode the URI on the
> > > >> > facade, and use only decoded URIs internally ), but it is breaking
> > > >> > some of the 2.3 clarifications ( not mandatory for 2.2, of course,
> > > >> > but important )
> > > >>
> > > >> I'm leaning towards your encode in facade solution. I'm curious about
> > > >> the 2.3 clarifications you are referring to beyond the URI being the
> > > >> "original".
> > > >>
> > > >> >
> > > >> > - Someone with access to NES and/or IIS, could you please verify
> > > >> > if the requestUri variable in NSAPI/ISAPI is encoded or not ?
> > > >> > Neither of them seems to provide the choice between unencoded_uri
> > > >> > and uri, so whatever they provide is the only thing we can use.
> > > >>
> > > >> I can try IIS easily enough. I'll also try to get NES running and
> > > >> see if I can determine this one too. I'll need to do this at home,
> > > >> so I'll report my results tonight.
> > > >>
> > > >> Larry
> > > >>
> > > >> >
> > > >> > I think the result of the test with IIS/NES is essential to
> > > >> > resolving this problem once and for all. If the URI they provide
> > > >> > is the 'original/or encoded' - that's what we should use on
> > > >> > Apache side.
> > > >> >
> > > >> > If not ( and the URI is decoded ) - that means the 'original uri'
> > > >> > is un-implementable, and we shouldn't worry about it anymore, and
> > > >> > using the decoded URI consistently is the best solution.
> > > >> >
> > > >> >
> > > >> > Please, (I know there aren't too many Windows users around :-),
> > > >> > could someone check this ?
> > > >> >
> > > >> > Costin
> > > >> >
> > > >> >
> > > >
> >
>
> *----*
>
> This message is intended only for the use of the person(s) listed above
> as the intended recipient(s), and may contain information that is
> PRIVILEGED and CONFIDENTIAL.
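
For reference, option 2 from the quoted thread ( work on the decoded URI internally, decode at most once, and hand servlets a canonical re-encoding from the facade ) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the class names, the `decoded` flag, the choice of UTF-8 and the set of characters left unescaped are hypothetical and are not taken from DecodeInterceptor, UEncoder or mod_jk.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of "decode once, canonical encode in the facade".
final class UriCodecSketch {
    // ASSUMPTION: UTF-8. The thread points out the real charset is often unknown.
    static final Charset CHARSET = StandardCharsets.UTF_8;
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    /** Replaces each valid %xx escape with its byte; malformed escapes are kept as-is. */
    static String decode(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()) {
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                if (hi >= 0 && lo >= 0) {
                    out.write((hi << 4) | lo);
                    i += 2;
                    continue;
                }
            }
            out.write(c); // ASSUMPTION: the URI on the wire is plain ASCII
        }
        return new String(out.toByteArray(), CHARSET);
    }

    /** Canonical encoding: letters, digits, "-_.~" and '/' stay; all else becomes %XX. */
    static String encode(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(CHARSET)) {
            int v = b & 0xFF;
            boolean safe = (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z')
                    || (v >= '0' && v <= '9') || "-_.~/".indexOf(v) >= 0;
            if (safe) {
                sb.append((char) v);
            } else {
                sb.append('%').append(HEX[v >> 4]).append(HEX[v & 0xF]);
            }
        }
        return sb.toString();
    }

    /** Minimal request holder with the "already decoded" flag from the thread. */
    static final class Request {
        private String uri;
        private boolean decoded;

        Request(String uri, boolean alreadyDecoded) {
            this.uri = uri;
            this.decoded = alreadyDecoded; // true when the connector sent a decoded URI
        }

        /** Decodes at most once, no matter how many interceptors call it. */
        void decodeOnce() {
            if (!decoded) {
                uri = decode(uri);
                decoded = true;
            }
        }

        /** What internal code (mapping, security checks) would operate on. */
        String internalUri() {
            return uri;
        }

        /** What the facade would return to servlets: one canonical spelling. */
        String getRequestURI() {
            return encode(uri);
        }
    }

    public static void main(String[] args) {
        Request r = new Request("/%61pp/my%20file", false);
        r.decodeOnce();
        r.decodeOnce(); // second call is a no-op because of the flag
        System.out.println(r.internalUri());
        System.out.println(r.getRequestURI());
    }
}
```

With this shape, "/%61pp" and "/app" reach servlets as the same string, which is the consistency benefit listed in the thread; the cost, as also noted there, is that the client's original spelling is lost.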