From: Thomas Ernest
Reply-To: thomas.ernest@gmail.com
To: users@cocoon.apache.org
Date: Sun, 15 Aug 2010 12:33:52 +0200
Subject: Re: can't find files with spaces in their names
Message-ID: <4C67C290.40507@gmail.com>

Hello Stephen,

> I'm using cocoon 2.2.
>
> Background:
> I'm trying to build a cocoon presentation layer on top of an
> subversion repository. A user makes a request and the cocoon
> application checks to see if a directory index needs to be added, adds
> menus and navigation trails, handles mobile formatting etc but does a
> ci:includexml to get the content of the page from the subversion
> repository.

It seems very interesting :-)

> Problem:
> Some of the files in the subversion repository have spaces in the file
> names (and it's not feasible to have a repository without any URL
> encoding required characters) which are decoded (by the matcher?) and
> then passed off to the reader which makes a request of the source
> resolver without re-encoding the spaces (I think).
>
> I assume this is what is happening because I get a FileNotFound error
> when using <ci:includexml> with the src set stuff from the request
> generator but it works when I use xslt to replace the spaces with
> %20s.  I can fix the problem when constructing pages using xslt but
> not for images and other stuff that needs to be <map:read>.
>
> Am I going to have to write my own reader that does URL encoding?

In my view, you don't have to write a reader to handle your data or links. The right way to communicate with a server is through URLs, and that implies URL encoding. It seems you want to talk to your server as if it were a standard file system. Since it isn't one, you have to adapt to the requirements of the web and of web servers, i.e. URL encoding.

I guess you use svn list to find out what your svn repository contains. (Please give us more details about this part.) Its output must pass through a shell such as Bash at some point, so you could use that same shell to turn the listing into a URL-encoded listing. I have seen that sed with a script file urlencode.sed [1] can be very useful for this:

    svn list | sed -f urlencode.sed
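
As a rough sketch of that pipeline, assuming a POSIX shell and sed: encode_listing is a hypothetical helper name, and the printf line only stands in for the real svn list output. It uses a reduced rule set and leaves '/' untouched so that directory separators survive, which the full script in [1] does not do.

```shell
#!/bin/sh
# encode_listing (hypothetical helper): read names on stdin, one per line,
# and percent-encode a few characters that are unsafe in URLs.
# '%' is handled first so that escapes produced later are not re-encoded.
# '/' is deliberately left alone to keep directory separators intact.
encode_listing() {
  sed -e 's/%/%25/g' \
      -e 's/ /%20/g' \
      -e 's/#/%23/g' \
      -e 's/?/%3f/g'
}

# Stand-in for `svn list` output; replace with the real command.
printf '%s\n' 'docs/user guide.xml' 'images/logo #2.png' | encode_listing
```

The rule set here is only illustrative; which characters need encoding depends on what the repository actually contains.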

Finally, you should find a way to encode the names before using <ci:includexml>, so that you provide your server with data that respects web rules. There are several ways:
1/ Encode the URLs in your shell with the sed command.
2/ Use an XSLT style sheet to replace every character that is forbidden in a URL string.
3/ Use EncodeURLTransformer. I'm not sure it does what you want, but its name looks promising ;-) [2]
4/ Use POST data, because it likely doesn't require URL encoding.
5/ ... ?

There are many ways to fix your problem and I haven't given you an exact fix, sorry. Whatever the details, I hope it is clear that you have to provide your server with URL-encoded URLs. I mean the solution isn't to develop a reader, but rather to use or develop a transformer that encodes your URLs, or even to feed your system URL-encoded data from the beginning.

Good luck.

Thomas.

[1] urlencode.sed
s/%/%25/g
s/ /%20/g
s/\t/%09/g
s/!/%21/g
s/"/%22/g
s/#/%23/g
s/\$/%24/g
s/\&/%26/g
s/'\''/%27/g
s/(/%28/g
s/)/%29/g
s/\*/%2a/g
s/+/%2b/g
s/,/%2c/g
s/-/%2d/g
s/\./%2e/g
s/\//%2f/g
s/:/%3a/g
s/;/%3b/g
s/</%3c/g
s/=/%3d/g
s/>/%3e/g
s/?/%3f/g
s/@/%40/g
s/\[/%5b/g
s/\\/%5c/g
s/\]/%5d/g
s/\^/%5e/g
s/_/%5f/g
s/`/%60/g
s/{/%7b/g
s/|/%7c/g
s/}/%7d/g
s/~/%7e/g
s/\t/%09/g
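
A note on the order of the rules in the script above: the '%' rule comes first for a reason. The small sketch below, which assumes a POSIX shell and sed and uses a made-up file name, shows what happens when '%' is escaped last instead.

```shell
#!/bin/sh
# Hypothetical file name containing both a '%' and a space.
name='50% off.txt'

# Correct order: escape '%' first, then the space.
right=$(printf '%s\n' "$name" | sed -e 's/%/%25/g' -e 's/ /%20/g')

# Wrong order: the '%' of the freshly produced '%20' is escaped again.
wrong=$(printf '%s\n' "$name" | sed -e 's/ /%20/g' -e 's/%/%25/g')

printf 'right: %s\nwrong: %s\n' "$right" "$wrong"
# right: 50%25%20off.txt
# wrong: 50%25%2520off.txt
```

The same double-encoding appears if a name that is already URL-encoded is passed through the script a second time, so it is probably safest to encode at exactly one point in the chain.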

[2] http://cocoon.apache.org/2.1/userdocs/encodeurl-transformer.html