jspwiki-user mailing list archives

From Simon Kitching ...@ops.co.at>
Subject Re: non-ascii characters in file names
Date Thu, 04 Sep 2008 09:21:25 GMT
Hi Florian,

Thanks for testing that. I should have thought of trying this out on the 
jspwiki site.

Your examples on the jspwiki site work fine for me. This is not too 
surprising; there are enough non-English JSPWiki installations that 
people would have raised this before if it were a bug everywhere. So it's 
something to do with my local setup, either:
* I've somehow installed jspwiki wrong, or
* something to do with my OS, or
* something to do with my jvm

After more investigation, I've found that the page containing the links 
seems fine; the links do have the correct encoding in them. But by the 
time the pageName is passed to the FileSystemProvider class it has been 
corrupted. As a result, when I create the page it gets saved using a 
filename based on the corrupted pageName.
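
A minimal sketch of one way this kind of corruption can arise, offered as a hypothesis rather than a diagnosis: if the UTF-8 bytes of the page name are decoded as ISO-8859-1 somewhere between the browser and the provider, both a-umlaut and u-umlaut turn into a two-character string starting with uppercase-a-with-tilde, and percent-encoding that first character as UTF-8 yields "%C3%83", which matches the symptoms quoted further down. The class and method names are made up for illustration and are not part of JSPWiki; the sketch does not explain why the second garbled character goes missing.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; nothing here is JSPWiki code.
class MojibakeDemo {
    /** Decode the UTF-8 bytes of s as ISO-8859-1 (the suspected wrong step). */
    static String misdecode(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    /** Percent-encode a string using UTF-8, roughly as a filename mangler might. */
    static String percentEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    public static void main(String[] args) {
        // a-umlaut and u-umlaut, written as escapes so the source file stays ASCII
        for (String umlaut : new String[] {"\u00e4", "\u00fc"}) {
            String garbled = misdecode(umlaut);        // two chars, the first is U+00C3
            String first = garbled.substring(0, 1);    // uppercase-a-with-tilde
            System.out.println(percentEncode(first));  // prints %C3%83 for both inputs
        }
    }
}
```

If this is what is happening, the place to look would be wherever the request parameter or path is first turned into a Java String, not the FileSystemProvider itself.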

And it is not browser-specific; the problem also occurs on my site with 
IE6 and Opera.

BTW, I am using tomcat 6.0.16, standalone.

I have run "ant tests", and get a number of suspicious-looking failures 
(1005 tests, 25 failures, 4 errors).

The problems seem to fall into two categories:

(1) unit tests that are just language-sensitive:

expected:<...Attempt to output javascript...> but was:<...Versuch, 
Javascript auszugeben...>
expected:<...Create "HyperLink"...> but was:<...Erstelle HyperLink...>
etc

(2) problems with pageNames

These could well be symptoms of the same issue I'm having. And if they 
are, then it points more strongly to some kind of OS/JVM issue rather 
than an incorrectly-installed wiki.

testCollectingLinksAttachment: Parent page does not exist
testMassiveRepository1: Right number of pages expected:<1000> but was:<1001>
testBug85_case1: Page does not exist anymore
testMaxReferences: expected:<5> but was:<6>
testSpacedNames1: lowercase expected:<puppaa> but was:<>

I'll debug these tests and see what is happening.

Just as a note, I do get a bunch of warnings like these:
[javac] /home/sk/projects/jspwiki/tests/com/ecyrd/jspwiki/providers/VersioningFileProviderTest.java:176: warning: unmappable character for encoding UTF8
[javac] String text2 = "barbar??\r\n";
This suggests to me that these java files have been saved in some 
non-utf8 character encoding, and the build.xml file does not tell the 
javac compiler what character encoding these files are in.
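
To illustrate the suspicion, here is a small sketch of what happens when bytes saved in ISO-8859-1 are read as UTF-8. The literal content ("barbar" followed by a-umlaut and o-umlaut) is an assumption for the demo, since the warning only shows "??"; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the string content is assumed, not taken from the test file.
class SourceEncodingDemo {
    /** Bytes of a string as an editor using ISO-8859-1 would write them to disk. */
    static byte[] savedAsLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Decode bytes the way a reader that assumes UTF-8 would. */
    static String readAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] onDisk = savedAsLatin1("barbar\u00e4\u00f6");

        // Lone 0xE4 / 0xF6 bytes are not valid UTF-8, so the decoder
        // substitutes the replacement character U+FFFD for them.
        String seenByUtf8Reader = readAsUtf8(onDisk);
        System.out.println(seenByUtf8Reader.contains("\uFFFD"));

        // Reading the same bytes with the encoding they were saved in is lossless.
        String seenByLatin1Reader = new String(onDisk, StandardCharsets.ISO_8859_1);
        System.out.println(seenByLatin1Reader.equals("barbar\u00e4\u00f6"));
    }
}
```

If the files really are in a non-UTF-8 encoding, passing the matching value via javac's -encoding option (the encoding attribute of Ant's javac task) should silence the warning; which encoding the files actually use is something I have not verified.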

Cheers
Simon

Florian Holeczek wrote:
> Hello Simon,
>
> I don't know whether I've properly understood your concern.
> I've just tested the following pages without any errors in JSPWiki
> v2.7.0-alpha-34 (sandbox.jspwiki.org) and v2.6.4 (www.jspwiki.org):
>
> [FHTestÄ]
> [fhtestö]
>
> Both are working as expected. See
> http://www.jspwiki.org/wiki/FlorianHoleczek (end of the page) and
> http://sandbox.jspwiki.org/Wiki.jsp?page=Main (beginning of the page,
> only today).
>
> Again, which container are you using? We recently found an issue with
> the OC4J container and umlauts - have a look at the bug tracker.
>
> Regards
>  Florian
>
> Original message from 04.09.2008 at 09:41:
>
>> Thanks for the suggestion. Unfortunately it made no difference at all.
>>
>> As before, the pages still can *contain* umlaut characters fine. But 
>> using such a character in a page name causes:
>> * bad filename encoding (all umlaut chars encoded as %C3%83)
>> * bad pagename display: all umlaut chars display as a-with-tilde
>>
>> I'll have a look at the source. Any suggestions for classes to start 
>> with will be welcome.
>>
>> Regards,
>> Simon
>>
>> Florian Holeczek wrote:
>>
>>> Hello Simon,
>>>
>>> which servlet container are you using?
>>> Did you already have a look at
>>> http://www.jspwiki.org/wiki/TomcatAndUTF8 ?
>>>
>>> Regards
>>>  Florian
>>>
>>> Original message from 03.09.2008 at 17:14:
>>>
>>>> Hi,
>>>>
>>>> I'm having trouble with JSPWiki 2.6.3 and unicode characters. I would 
>>>> appreciate some help.
>>>>
>>>> I've installed jspwiki 2.6.3 on SuSe linux, which is UTF-8 by default:
>>>>
>>>> > locale
>>>> LANG=de_DE.UTF-8
>>>> LC_CTYPE="de_DE.UTF-8"
>>>>
>>>> And I've left the jspwiki.properties setting of "jspwiki.encoding = 
>>>> UTF-8" alone.
>>>>
>>>> I then create a page "sktest1", with a link to a page that has a 
>>>> lowercase German a-umlaut char in it.
>>>> The page (and the link text) look fine; the a-umlaut is displayed correctly.
>>>>
>>>> Clicking on the link brings up the "edit" window, but the page name is
>>>> corrupted: it shows uppercase-a-with-tilde, not lowercase-a-with-umlaut.
>>>>
>>>> The filename created on disk is "Sktest1%C3%83.txt".
>>>>
>>>> If I create a page with u-umlaut, then that character also gets encoded
>>>> as "%C3%83", i.e. it is not possible to have files "Sktestä" and 
>>>> "Sktestü", as they result in the same filename.
>>>>
>>>> Interestingly, the first char of the filename appears to be forced to 
>>>> uppercase, but I don't really care here. However any character following
>>>> a non-ascii char appears to also be forced to uppercase:
>>>>   blätter  (that's an a-umlaut)
>>>> becomes
>>>>   bl%C3%83Tter
>>>> (note that first t has become a T).
>>>>
>>>> BTW; I'm testing with Firefox 3.x.
>>>>
>>>> Hopefully I've just made some minor config mistake, but I can't see what
>>>> at the moment. Any suggestions gratefully received!
>>>>
>>>> Regards,
>>>> Simon

