From: Gianugo Rabellino
Date: Thu, 31 Jul 2003 17:49:08 +0200
To: dev@cocoon.apache.org
Subject: Re: Flow's processPipelineTo() and FileSource
Message-ID: <3F293A74.9070007@apache.org>
In-Reply-To: <3F2937E3.3040408@anyware-tech.com>

Sylvain Wallez wrote:

>> I'm having a hell of a time using flow with processPipelineTo() and
>> OutputStreams coming out from FileSource(s).
>>
>> The problem is that FileSource#getOutputStream() creates a temporary
>> file (... to be discussed later ...) and such file gets renamed to the
>> original one only upon OutputStream.close(). Now, AbstractInterpreter,
>> line 201, actually calls flush() but *never* close. As a result,
>> everything is kinda ... well... screwed up.
>>
>> Patch is trivial, but I'm wondering if adding out.close() in
>> AbstractInterpreter.java might break something: any flow experts around?
>
> I don't see why there should be some consequences on the flow itself...
> Just replace flush() by close() !

Just did it, but I didn't replace flush(), I just added close() afterwards:
it's better to be sure that there are no leftovers...

>> Now for the FileSource: I do understand *some* of the reasoning behind
>> using a temporary file, but I have to disagree on the implementation:
>> naming it [filename].tmp is a bit of a bet, since someone might
>> legitimately have such a filename around. While I understand that
>> there might be memory issues with large files, I guess that either:
>>
>> 1. keeping a ByteArrayOutputStream;
>> 2. forget about it and just write the file;
>> 3. use a more "clever" name that doesn't risk conflicts this much
>
> I would avoid 2. The reason why I used a temporary file is because of
> the streamed nature of Cocoon pipelines. If an error occurs within the
> processing, the original content is not partially overwritten. My
> preference would go to 3.

I see and understand. Yet temporary files, besides being somewhat
inconvenient, can be a major security hole in general. I'd rather go for
1, then, accumulating bytes as they come in a ByteArrayOutputStream and
writing them out upon close() (and maybe flush() too?). True, this is in
turn a possible security hole, since someone might DoS the machine by
processing gigabyte-sized files, but all in all I tend to think that it's
a better solution... and yes, doing transactions on a filesystem is a
PITA. :-)

Ciao,

>> are all better options.
>>
>> Is that OK to you if I work on it? I don't know if I have access to
>> the Excalibur CVS though...
>
> As a Cocoon committer, you should.

I understand that I am authorized in principle, I just don't know if I
need to be explicitly enabled. Anyway, I'll check it out. :-)

Thanks for everything,

-- 
Gianugo Rabellino
Pro-netics s.r.l. - http://www.pro-netics.com
Orixo, the XML business alliance - http://www.orixo.com
(Now blogging at: http://blogs.cocoondev.org/gianugo/)
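
To make option 1 from the list above concrete, here is a rough sketch of a
stream that accumulates bytes in a ByteArrayOutputStream and only touches
the target file on close(). This is not the Excalibur FileSource code: the
class name BufferedFileOutputStream is made up for illustration, and it
assumes the java.nio.file API from later Java versions. Whether flush()
should also write is left open above, so the sketch treats it as a no-op.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical sketch (not the actual FileSource implementation):
 * bytes are kept in memory and the target file is written only on close().
 */
class BufferedFileOutputStream extends OutputStream {
    private final Path target;
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private boolean closed = false;

    BufferedFileOutputStream(Path target) {
        this.target = target;
    }

    @Override
    public void write(int b) throws IOException {
        ensureOpen();
        buffer.write(b);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        ensureOpen();
        buffer.write(b, off, len);
    }

    @Override
    public void flush() {
        // Assumption: flush() does nothing, so a pipeline that fails
        // half-way leaves the original file content untouched.
    }

    @Override
    public void close() throws IOException {
        if (closed) {
            return; // closing twice is harmless
        }
        closed = true;
        // The whole content is written in one go, replacing the old file.
        Files.write(target, buffer.toByteArray());
    }

    private void ensureOpen() throws IOException {
        if (closed) {
            throw new IOException("Stream already closed");
        }
    }
}
```

With a stream like this, a caller that only calls flush() and never close()
leaves the file unchanged, which is the same symptom as with the temporary
file approach. The whole content also sits in memory until close(), so the
concern about very large files still applies.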