camel-issues mailing list archives

From "Volodymyr Sobotovych (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CAMEL-8191) Charset is ignored for SFTP producer endpoints
Date Sun, 25 Jan 2015 17:28:34 GMT

    [ https://issues.apache.org/jira/browse/CAMEL-8191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14291186#comment-14291186 ]

Volodymyr Sobotovych commented on CAMEL-8191:
---------------------------------------------

I also noticed an inaccuracy in the description of the "charset" option in the documentation (http://camel.apache.org/file2.html):

Camel 2.9.3: this option is used to specify the encoding of the file, _and camel will set
the Exchange property with Exchange.CHARSET_NAME with the value of this option_. You can use
this on the consumer, to specify the encodings of the files, which allow Camel to know the
charset it should load the file content in case the file content is being accessed. Likewise
when writing a file, you can use this option to specify which charset to write the file as
well. See further below for a examples and more important details.

The inaccurate part is highlighted in _italics_ above. None of the endpoints (file, ftp, sftp) sets Exchange.CHARSET_NAME,
as illustrated by the output of this test, which logs the CamelCharsetName header before and after the file endpoint:
{code}
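import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class FileEncodingTest extends CamelTestSupport {
    @Test
    public void testFileEncoding() {
        // Send a single message through the route; the result is inspected in the log output.
        template.sendBody("direct:in", "Hi there");
    }

    @Override
    protected RouteBuilder createRouteBuilder() throws Exception {
        return new RouteBuilder() {

            @Override
            public void configure() throws Exception {
                from("direct:in")
                        // (1) before the file endpoint
                        .log("Charset name header (1): ${header.CamelCharsetName}")
                        .to("file://output.txt?charset=iso-8859-1")
                        // (2) after the file endpoint that was configured with a charset
                        .log("Charset name header (2): ${header.CamelCharsetName}")
                        // (3) after setting the header explicitly, for comparison
                        .setHeader(Exchange.CHARSET_NAME, constant("iso-8859-1"))
                        .log("Charset name header (3): ${header.CamelCharsetName}");
            }
        };
    }
}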
{code}

{code}
[                          main] route1                         INFO  Charset name header (1): 
[                          main] SendProcessor                  DEBUG >>>> Endpoint[file://output.txt?charset=iso-8859-1] Exchange[Message: Hi there]
[                          main] FileOperations                 DEBUG Using Reader to write file: output.txt/ID-wheleph-Lenovo-G570-42931-1422203242220-0-1 with charset: iso-8859-1
[                          main] GenericFileProducer            DEBUG Wrote [output.txt/ID-wheleph-Lenovo-G570-42931-1422203242220-0-1] to [Endpoint[file://output.txt?charset=iso-8859-1]]
[                          main] route1                         INFO  Charset name header (2): 
[                          main] route1                         INFO  Charset name header (3): iso-8859-1
{code}

> Charset is ignored for SFTP producer endpoints
> ----------------------------------------------
>
>                 Key: CAMEL-8191
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8191
>             Project: Camel
>          Issue Type: Improvement
>          Components: camel-ftp
>    Affects Versions: 2.12.3, 2.14.1
>         Environment: vso@vso-desktop:/tmp$ uname -a
> Linux vso-desktop 3.13.0-43-generic #72~precise1-Ubuntu SMP Tue Dec 9 12:14:18 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> vso@vso-desktop:/tmp$ java -version
> java version "1.7.0_65"
> OpenJDK Runtime Environment (IcedTea 2.5.3) (7u71-2.5.3-0ubuntu0.12.04.1)
> OpenJDK 64-Bit Server VM (build 24.65-b04, mixed mode)
> vso@vso-desktop:/tmp$ ssh -v
> OpenSSH_5.9p1 Debian-5ubuntu1.4, OpenSSL 1.0.1 14 Mar 2012
>            Reporter: Volodymyr Sobotovych
>              Labels: charset, sftp
>             Fix For: 2.14.2, 2.15.0
>
>         Attachments: CAMEL-8191.patch
>
>
> For SFTP producer endpoints, the "charset" option is ignored and the output file is created using the platform-default charset (usually UTF-8).
> This simple Spring context illustrates the issue:
> {code}
> <beans xmlns="http://www.springframework.org/schema/beans"
>        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>        xsi:schemaLocation="
>        http://www.springframework.org/schema/beans 
>        http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
>        http://camel.apache.org/schema/spring 
>        http://camel.apache.org/schema/spring/camel-spring.xsd">
>   <camelContext xmlns="http://camel.apache.org/schema/spring">
>     <route>
>       <from uri="stream:in?promptMessage=Enter something:" />
>       <to uri="sftp://localhost:22/vso/sandbox?charset=ISO-8859-1&amp;username=fake_sftp_user&amp;password=qwerty"/>
>     </route>
>   </camelContext>
> </beans>
> {code}
> This context defines a route that transfers a string entered by the user via SFTP. If the user enters "Müller", I see a 7-byte message in the output directory (because "ü" is represented using 2 bytes in UTF-8), whereas it should be a 6-byte message if the file were encoded in ISO-8859-1.
> This problem affects only SFTP endpoints. File and FTP endpoints handle the "charset" option correctly.
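
The byte counts quoted in the issue description can be checked with plain JDK code, independent of Camel and SFTP. The following is only a minimal sketch: the class name CharsetByteCount and the method encodedLength are illustrative and are not part of Camel or of the attached patch.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative helper (hypothetical name, not part of Camel): counts how many
// bytes a string occupies when encoded with a given charset.
final class CharsetByteCount {

    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // "Müller", written with a Unicode escape so the result does not
        // depend on the encoding of this source file.
        String text = "M\u00fcller";

        // "ü" takes 2 bytes in UTF-8, so the whole string takes 7 bytes.
        System.out.println("UTF-8:      " + encodedLength(text, StandardCharsets.UTF_8));

        // Every character of this string takes 1 byte in ISO-8859-1, so 6 bytes.
        System.out.println("ISO-8859-1: " + encodedLength(text, StandardCharsets.ISO_8859_1));
    }
}
```

A file of 7 bytes on the SFTP server therefore suggests the content was written as UTF-8, while 6 bytes would be consistent with ISO-8859-1.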



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
