Message-ID: <4AB74B3E.8070106@ice-sa.com>
Date: Mon, 21 Sep 2009 11:45:34 +0200
From: André Warnier
Reply-To: Tomcat Users List
To: Tomcat Users List
Subject: Re: Create FileInputStream in servlet from remote file with accentuated character name

Sylvie Perrin wrote:
> Christopher,
>
> Here is the stack trace of the FileNotFoundException:
>
> java.io.FileNotFoundException: /home/me/mountDir/fichi��.txt (No such file or
> directory)

Sylvie,

maybe what appears above shows the origin of the problem, and explains what I was previously trying to tell you. It is difficult to be sure, because (again) there are several layers of encoding/decoding between your logfile and how it may show up in this email.

The problem is not your problem per se. You are not necessarily doing anything wrong. The problem is basically the lack of a common standard, between different OSes and filesystem types, for how to represent filenames containing non-US-ASCII characters.

Below, I am trying to explain the root of the problem, concisely but fully. It *is* a complex matter, which is why it is confusing. But you are not alone in being confused or puzzled. Unless one has had to deal with such issues many times, it is really easy to get confused, because in this case, what one sees is not necessarily what one gets.

Assuming that what I see above is also what you see in the logfile ("fichi" + 2 strange characters + ".txt"):
- Java is trying to open a file named "fichi" + 2 strange characters + ".txt"
- these two characters *may* be the Unicode/UTF-8 encoding of the character "é" (e with acute accent)
- but Java is not finding that file (obviously)

Furthermore:
The file is really located on a Windows server.
The Windows directory where the file is located is "mounted", through the CIFS filesystem, onto a local mountpoint on your (Linux) Java and Tomcat host. On your Java/Tomcat host, Java is seeing the contents of this directory *through* this CIFS filesystem mount.

In principle (but that is only an assumption here), the CIFS filesystem code (running on the local host) shows this (remote) directory content to a local application "as is", without making any character set translation.

Now Java (on your local system) is trying to find this file "fichiXX.txt", and not finding it (XX being the two unknown bytes). That means that, on the remote system, a file named exactly "fichiXX.txt" does not exist.

If you connect to that remote system via, for instance, a Remote Desktop or a VNC console (or even, from your local station, just browse this "share" through Windows Explorer), and examine the content of that directory, you will probably see a file named "fichié.txt". But that is only what you *see*, through whatever interface you use. In reality, the "é" in this filename may (or may not) be encoded, in the Windows directory entry, as 2 bytes. Or it may be encoded with (for instance) a Windows 8-bit codepage, as a single byte. If so, that is why Java, which is trying to find this "é" as 2 bytes, does not find it.

Now comes the difficult part:
To solve your problem, you have to make sure that when Java is looking for a filename which, from the Java point of view, contains an "é" character, this Java "é" *character* (whatever its representation as bytes in Java) matches the byte representation of the "é" character in the filesystem of the remote host where the file actually resides. And the problem is that these systems (Java and your current platform on one side, the remote OS on the other) do not necessarily agree on what this byte representation of an "é" character is.

For example, suppose you find the right set of measures that make your Java program find the file in the end.
Then you replace the Windows fileserver with a Linux server, sharing its files through Samba. Well, the problem may then show up again, because the encoding may be different again.

That is why I was recommending to stick to US-ASCII names. It was not a joke.
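
To make the "2 bytes versus 1 byte" point concrete, here is a minimal Java sketch. It is only an illustration under assumptions: ISO-8859-1 stands in for "a Windows 8-bit codepage" (the actual codepage on the server is unknown), and the class and method names (AccentBytes, hexOf, isUsAscii) are made up for this example. It shows that the same Java character "é" becomes different byte sequences depending on the charset, and how one might check that a name sticks to US-ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from the original thread.
class AccentBytes {

    // Encodes s with the given charset and returns the bytes as
    // space-separated uppercase hex, e.g. "C3 A9".
    static String hexOf(String s, Charset cs) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(cs)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // True if every character of the name can be encoded as US-ASCII.
    static boolean isUsAscii(String name) {
        return StandardCharsets.US_ASCII.newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        // "\u00e9" is "é"; the escape avoids depending on the source file encoding.
        String e = "\u00e9";

        // UTF-8 uses two bytes for this character.
        System.out.println("UTF-8:      " + hexOf(e, StandardCharsets.UTF_8));
        // An 8-bit charset (ISO-8859-1 assumed here) uses a single byte.
        System.out.println("ISO-8859-1: " + hexOf(e, StandardCharsets.ISO_8859_1));

        // What the local JVM uses by default when no charset is given.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        System.out.println("fichie.txt is US-ASCII:      " + isUsAscii("fichie.txt"));
        System.out.println("fichi\\u00e9.txt is US-ASCII: " + isUsAscii("fichi" + e + ".txt"));
    }
}
```

If the two hex lines differ (they do: two bytes against one), then a lookup by one byte sequence cannot match a directory entry stored as the other, which is the mismatch described above. A name that passes isUsAscii has the same bytes in all of these charsets, which is the practical reason for the US-ASCII recommendation.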