Mailing-List: contact cocoon-dev-help@xml.apache.org; run by ezmlm
Precedence: bulk
Reply-To: cocoon-dev@xml.apache.org
Message-ID: <20021017192355.62973.qmail@web40804.mail.yahoo.com>
Date: Thu, 17 Oct 2002 12:23:55 -0700 (PDT)
From: Geoff Howard <cocoongeoff@yahoo.com>
Subject: RE: file upload follow up
To: cocoon-dev@xml.apache.org
In-Reply-To: 
 <484A6CA492BE654395D208B1D8D53939626C6E@SOMEXEVS001.ex.ordersx.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

Alfred, 

--- Nathaniel Alfred <Alfred.Nathaniel@swx.com> wrote:
...
 
> >1) There is the idea that choosing behaviour
> >per-pipeline would be the best.  Looking into this
> >some more, I'm not sure it's feasible to do so. 
...
> >I think this same issue makes it impractical to
> move
> >these configurations into cocoon.xconf, which I had
> >hoped to do.  Some of the suggestions below make me
> >more comfortable with this though.
> 
> Making it per-pipeline is not possible because the
> pipeline selection should to be able to assume that 
> all request parameters are available.

Right.

> The HttpServletRequest coming from the servlet
> engine
> has already decoded the URL and parameters from the
> query string parameters.  That is enough information
> to search cocoon.xconf for a suitable request
> factory.
I haven't had a chance to take a detailed look at how
that would work, but it sounds like it would mean
parsing cocoon.xconf before its normal processing just
to get those parameters.  I'd like to know how it'd be
perceived by the committers before investing any
effort - is there any precedent?  I don't feel very
strongly about the need for it - especially since it's
configuring behind the scenes behaviour that probably
will not need to be modified in the vast majority of
cases.
 
> For example, in a virtual hosting setup one could
> then
> configure different max-upload-size depending on the
> port number or URL prefix.
I'm not sure I see the need here.  If the different
vhosts have different webapps, they each have their
own web.xml.  If they are handled by one cocoon
instance, they can configure their own acceptible size
in the FileUploadAction, with the upper limit
constrained by the global setting in web.xml, which
allows the "master host" to set a reasonable limit for
the vhosts.  I see the web.xml setting primarily as a
safety measure to keep me from constructing an
app-killer maliciously large file upload.

> Sure it is not nice to put stuff into cocoon.xconf
> which
> conceptually belongs into the sitemap.  Maybe we
> should
> have a mechanism to override cocoon.xconf defaults
> with 
> vhost specific configurations.
I think this kind of idea is worth coming back to at a
later iteration.  Most of what is needed will be
available via actions as it stands now.  Given that
sitemap parameters/modules are a work in progress and
should have some standard mechanisms for that kind of
concept, it seems wise to wait to see where those
chips fall.
 
> >2) There was a suggestion to make the uploaded file
> >saved to disk temporary - that is have the
> >FilePartFile removed at the end of the request.  I
> >think this is a good idea if it's a configurable
> >option via web.xml.  What do others think?
> 
> Why should it be configurable to shoot yourself into
> the
> foot?  If the framework is not cleaning up, who will
> before your disk becomes full?
Well, _I_ obviously think it's a bad idea to leave
those files around, but I recommend configuration for
that behavior for two reasons:
1) It's been the default behavior now for some time
and who knows what applications depend on it (without
rewrite).  It doesn't need to be the default going
forward.
2) People are used to having "enough rope to hang
themselves", at least in the *nix world - I don't see
a major reason not to give it to them in this case. 
Just like allow-reload is a very bad idea on a
production system, but useful on dev, I can imagine
the same thing here.

...
> We should have some SoC here.  It only costs one
> additional
> file copy operation for the special use case of a
> public
> upload directory.
... 
I understand all that (including some snipped for
length), but still feel the dev situation especially
makes configuration of the auto-delete option wise. 
By the way, I don't see adding another init-param. 
I'd propose adding an additional option to
auto-upload: true, false, and "temp" or something like
that.

> 
> >3) The FileUploadAction is a good idea (IMHO, and
> it
> >exists but not in CVS).  This would enable users to
> >choose an authenticated pipeline to deal with file
> >uploads, specifying via sitemap parameters where to
> >store the file, what to name it, and a maximum
> >accepted file size.  A similar action could store
> the
> >file as a BLOB if desired.  The name of this action
> is
> >deceiving since the file will already be on disk
> >(temporarily if the second point above is followed)
> or
> >in memory.  
> 
> FileUploadAction is mainly just the replacement for
> the 
> functionality lost by using tempory files in
> FilePartFile. 

Exactly.

> >4) Multiple file uploads though mentioned in the
> RFC,
> >don't seem to be supported in browsers.  Am I
> wrong?  
> >Constructing an HTTP request by hand using the
> >proposed structure results in no file upload and no
> >error logged, the same as other malformed requests
> >that I tried.  In any case, the current
> functionality
> >works fine with multiple file fields: 
> ><input type="file" name="file1">
> ><input type="file" name="file2">
> >I'm happy with this behavior.
> 
> Me too.

OK. 

> >5) The maximum file size default in web.xml works
> for
> >"honest" requests, but I only did cursory tests for
> >requests constructed to deceive.  The logic in
> >MultipartParser relies on
> request.getContentLength(). 
> >There may be logic elsewhere to ensure that
> >getContentLength is accurate.  Can someone check
> that?
> 
> The client has to tell in the Content-Length header,
> how many bytes follow in the request body.  If the
> client
> lies or the server does not read out the socket
> properly,
> the requests following in the HTTP/1.1 will not be
> recognizable.

Right.  What I'm worried about is how that case is
handled.  I'm pretty sure that this is the exact type
of vulnerability that was used to win OpenHack1
(http://www.eweek.com/article2/0,3959,600435,00.asp). 
Granted, Java probably has far fewer exploitable holes
by its nature, but it's certainly not bullet proof.

> That is no bother if the requests come all from the
> same user.
> But what if an upstream proxy channels requests from
> different
> users through the same connection?  (I don't know if
> mod_proxy 
> or mod_jk do that.)
Don't know either - anyone know or up to finding out?

...

> >
> >Anyone still interested in all this?
> >
> >Geoff
> 
> Sure, I hope your fixes make it into 2.1 or even
> 2.0.x.
> 
> Cheers, Alfred.

Thanks, me too. :) I'm pretty sure the same changes
work against 2.0.3, but haven't checked it out and
haven't created a separate patch for that (don't know
if it'll be needed).

If the changes do make it in, and the "temp" file
option also is accepted, I'll volunteer to rework the
file upload example, and submit a little how-to on
handling file uploads explaining the configuration,
security issues, etc.  

Geoff

__________________________________________________
Do you Yahoo!?
Faith Hill - Exclusive Performances, Videos & More
http://faith.yahoo.com

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org