camel-issues mailing list archives

From "Sergey Sidashov (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CAMEL-8356) IOConverter.toInputStream(file, charset) returns strange behaving stream
Date Tue, 02 Jun 2015 05:03:17 GMT

    [ https://issues.apache.org/jira/browse/CAMEL-8356?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568526#comment-14568526 ]

Sergey Sidashov commented on CAMEL-8356:
----------------------------------------

It seems the encoding problem with IOConverter still exists. I tried to load a text file in
cp1251 encoding using the file component (for example, uri=file:C:\addr\in\?charset=cp1251).
Then I wrote a bean with the following method:

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public static String convertStreamToString(InputStream inputStream) throws IOException {
    if (inputStream == null) return null;
    StringBuilder sb = new StringBuilder(2048); // initial capacity; adjust if the expected size is known
    char[] read = new char[128]; // read buffer
    // Decode as cp1251; I/O errors propagate instead of being silently swallowed.
    try (InputStreamReader ir = new InputStreamReader(inputStream, "cp1251")) {
        int n;
        while ((n = ir.read(read)) != -1) {
            sb.append(read, 0, n);
        }
    }
    return sb.toString();
}
to test the conversion from File to InputStream. For some files this stream reads all content
successfully, but for others it clips the file contents. Reading appears to end at certain
characters (for example, in cp1251 encoding, at the characters 'яя').
Camel version 2.15.2, Java version 1.8.0_45.
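
One possible explanation, not confirmed anywhere in this thread: in cp1251 the letter 'я' is
the byte 0xFF, and a stream whose read() returns the raw signed byte hands that value back as
-1, which callers treat as end of stream. The sketch below only illustrates that failure mode.
SignedByteDemo and UnmaskedStream are hypothetical names and are not Camel classes; whether
IOConverter behaves this way is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

class SignedByteDemo {

    /** Hypothetical wrapper (not Camel code): read() returns the signed byte unmasked. */
    static final class UnmaskedStream extends InputStream {
        private final byte[] data;
        private int pos;

        UnmaskedStream(byte[] data) {
            this.data = data;
        }

        @Override
        public int read() {
            if (pos >= data.length) {
                return -1;
            }
            // Bug being illustrated: byte 0xFF widens to int -1, the end-of-stream marker.
            // A correct implementation would return data[pos++] & 0xFF.
            return data[pos++];
        }
    }

    /** Reads the whole stream as text in the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[128];
        try (Reader reader = new InputStreamReader(in, charset)) {
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        Charset cp1251 = Charset.forName("windows-1251");
        // "abc" + two Cyrillic 'я' (0xFF 0xFF in cp1251) + "def"
        byte[] bytes = "abc\u044F\u044Fdef".getBytes(cp1251);

        String clipped = readAll(new UnmaskedStream(bytes), cp1251);
        String complete = readAll(new ByteArrayInputStream(bytes), cp1251);

        System.out.println("unmasked read(): " + clipped.length() + " chars");
        System.out.println("correct stream:  " + complete.length() + " chars");
    }
}
```

With this wrapper the text is cut off at the first 'я', while other Cyrillic letters survive
because only 0xFF collides with the -1 marker. Checking the bytes of an affected file around
the point where reading stops would show whether the same thing happens here.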

> IOConverter.toInputStream(file, charset) returns strange behaving stream
> ------------------------------------------------------------------------
>
>                 Key: CAMEL-8356
>                 URL: https://issues.apache.org/jira/browse/CAMEL-8356
>             Project: Camel
>          Issue Type: Bug
>          Components: camel-core
>    Affects Versions: 2.14.1, 2.15.0
>            Reporter: Stefan Mandel
>            Assignee: Willem Jiang
>             Fix For: 2.14.2, 2.15.0
>
>         Attachments: CAMEL8356-repaired-Test-and-adjusted-converter-imple.patch, IOConverterCharsetTest.java, german.iso-8859-1.txt, german.utf-8.txt
>
>
> Calling IOConverter.toInputStream with either UTF-8 or ISO-8859-1 returns a stream that behaves strangely on non-ASCII characters:
> - wrapping this stream in an InputStreamReader returns incorrectly encoded characters
> - a naive new BufferedReader(new InputStreamReader(new FileInputStream(file), charset)) returns the correctly encoded characters.
> I will attach some unit tests for this case.
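
For reference, the "naive" approach mentioned in the description can be sketched as a
standalone program. This is only an illustration: the class name NaiveCharsetRead, the temp
file, and the sample text are assumptions and are not taken from the attached unit tests.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class NaiveCharsetRead {

    /** Reads a file as text, decoding its bytes with the given charset. */
    static String readFile(File file, Charset charset) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(file), charset))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Sample text with non-ASCII characters: "Grüße aus Köln"
        String text = "Gr\u00FC\u00DFe aus K\u00F6ln";

        File tmp = File.createTempFile("german", ".txt");
        tmp.deleteOnExit();
        Files.write(tmp.toPath(), text.getBytes(StandardCharsets.ISO_8859_1));

        String roundTrip = readFile(tmp, StandardCharsets.ISO_8859_1);
        System.out.println("round trip matches: " + text.equals(roundTrip));
    }
}
```

A stream returned by IOConverter.toInputStream(file, charset) could be compared against this
baseline by decoding both and checking that the resulting strings are equal.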



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
