Return-Path: Delivered-To: apmail-xml-cocoon-dev-archive@xml.apache.org Received: (qmail 77801 invoked by uid 500); 17 Oct 2002 19:23:50 -0000 Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm Precedence: bulk list-help: list-unsubscribe: list-post: Reply-To: cocoon-dev@xml.apache.org Delivered-To: mailing list cocoon-dev@xml.apache.org Received: (qmail 77785 invoked from network); 17 Oct 2002 19:23:50 -0000 Message-ID: <20021017192355.62973.qmail@web40804.mail.yahoo.com> Date: Thu, 17 Oct 2002 12:23:55 -0700 (PDT) From: Geoff Howard Subject: RE: file upload follow up To: cocoon-dev@xml.apache.org In-Reply-To: <484A6CA492BE654395D208B1D8D53939626C6E@SOMEXEVS001.ex.ordersx.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N Alfred, --- Nathaniel Alfred wrote: ... > >1) There is the idea that choosing behaviour > >per-pipeline would be the best. Looking into this > >some more, I'm not sure it's feasible to do so. ... > >I think this same issue makes it impractical to > move > >these configurations into cocoon.xconf, which I had > >hoped to do. Some of the suggestions below make me > >more comfortable with this though. > > Making it per-pipeline is not possible because the > pipeline selection should to be able to assume that > all request parameters are available. Right. > The HttpServletRequest coming from the servlet > engine > has already decoded the URL and parameters from the > query string parameters. That is enough information > to search cocoon.xconf for a suitable request > factory. I haven't had a chance to take a detailed look at how that would work, but it sounds like it would mean parsing cocoon.xconf before its normal processing just to get those parameters. I'd like to know how it'd be perceived by the committers before investing any effort - is there any precedent? I don't feel very strongly about the need for it - especially since it's configuring behind the scenes behaviour that probably will not need to be modified in the vast majority of cases. > For example, in a virtual hosting setup one could > then > configure different max-upload-size depending on the > port number or URL prefix. I'm not sure I see the need here. If the different vhosts have different webapps, they each have their own web.xml. If they are handled by one cocoon instance, they can configure their own acceptible size in the FileUploadAction, with the upper limit constrained by the global setting in web.xml, which allows the "master host" to set a reasonable limit for the vhosts. I see the web.xml setting primarily as a safety measure to keep me from constructing an app-killer maliciously large file upload. > Sure it is not nice to put stuff into cocoon.xconf > which > conceptually belongs into the sitemap. Maybe we > should > have a mechanism to override cocoon.xconf defaults > with > vhost specific configurations. I think this kind of idea is worth coming back to at a later iteration. Most of what is needed will be available via actions as it stands now. Given that sitemap parameters/modules are a work in progress and should have some standard mechanisms for that kind of concept, it seems wise to wait to see where those chips fall. > >2) There was a suggestion to make the uploaded file > >saved to disk temporary - that is have the > >FilePartFile removed at the end of the request. I > >think this is a good idea if it's a configurable > >option via web.xml. What do others think? > > Why should it be configurable to shoot yourself into > the > foot? If the framework is not cleaning up, who will > before your disk becomes full? Well, _I_ obviously think it's a bad idea to leave those files around, but I recommend configuration for that behavior for two reasons: 1) It's been the default behavior now for some time and who knows what applications depend on it (without rewrite). It doesn't need to be the default going forward. 2) People are used to having "enough rope to hang themselves", at least in the *nix world - I don't see a major reason not to give it to them in this case. Just like allow-reload is a very bad idea on a production system, but useful on dev, I can imagine the same thing here. ... > We should have some SoC here. It only costs one > additional > file copy operation for the special use case of a > public > upload directory. ... I understand all that (including some snipped for length), but still feel the dev situation especially makes configuration of the auto-delete option wise. By the way, I don't see adding another init-param. I'd propose adding an additional option to auto-upload: true, false, and "temp" or something like that. > > >3) The FileUploadAction is a good idea (IMHO, and > it > >exists but not in CVS). This would enable users to > >choose an authenticated pipeline to deal with file > >uploads, specifying via sitemap parameters where to > >store the file, what to name it, and a maximum > >accepted file size. A similar action could store > the > >file as a BLOB if desired. The name of this action > is > >deceiving since the file will already be on disk > >(temporarily if the second point above is followed) > or > >in memory. > > FileUploadAction is mainly just the replacement for > the > functionality lost by using tempory files in > FilePartFile. Exactly. > >4) Multiple file uploads though mentioned in the > RFC, > >don't seem to be supported in browsers. Am I > wrong? > >Constructing an HTTP request by hand using the > >proposed structure results in no file upload and no > >error logged, the same as other malformed requests > >that I tried. In any case, the current > functionality > >works fine with multiple file fields: > > > > > >I'm happy with this behavior. > > Me too. OK. > >5) The maximum file size default in web.xml works > for > >"honest" requests, but I only did cursory tests for > >requests constructed to deceive. The logic in > >MultipartParser relies on > request.getContentLength(). > >There may be logic elsewhere to ensure that > >getContentLength is accurate. Can someone check > that? > > The client has to tell in the Content-Length header, > how many bytes follow in the request body. If the > client > lies or the server does not read out the socket > properly, > the requests following in the HTTP/1.1 will not be > recognizable. Right. What I'm worried about is how that case is handled. I'm pretty sure that this is the exact type of vulnerability that was used to win OpenHack1 (http://www.eweek.com/article2/0,3959,600435,00.asp). Granted, Java probably has far fewer exploitable holes by its nature, but it's certainly not bullet proof. > That is no bother if the requests come all from the > same user. > But what if an upstream proxy channels requests from > different > users through the same connection? (I don't know if > mod_proxy > or mod_jk do that.) Don't know either - anyone know or up to finding out? ... > > > >Anyone still interested in all this? > > > >Geoff > > Sure, I hope your fixes make it into 2.1 or even > 2.0.x. > > Cheers, Alfred. Thanks, me too. :) I'm pretty sure the same changes work against 2.0.3, but haven't checked it out and haven't created a separate patch for that (don't know if it'll be needed). If the changes do make it in, and the "temp" file option also is accepted, I'll volunteer to rework the file upload example, and submit a little how-to on handling file uploads explaining the configuration, security issues, etc. Geoff __________________________________________________ Do you Yahoo!? Faith Hill - Exclusive Performances, Videos & More http://faith.yahoo.com --------------------------------------------------------------------- To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org For additional commands, email: cocoon-dev-help@xml.apache.org