manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CONNECTORS-1251) Confluence umlauts broken
Date Tue, 01 Dec 2015 15:40:11 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033888#comment-15033888 ]

Karl Wright commented on CONNECTORS-1251:
-----------------------------------------

Well, with this change, I can find nothing specifically wrong with the implementation.  The
only possibility would be if Confluence mis-sets its Content-Type header when it responds
to REST requests, and then of course that confuses HttpClient.

Can you turn on httpclient wire logging and verify what the headers are for the request/response
to a page that has umlauts on it?  There's a wiki page on how to do that with MCF; let me
know if you have trouble.  Thanks!
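
To illustrate the Content-Type scenario described above, here is a minimal Java
sketch of how a missing or wrong charset can garble an umlaut. It is not ManifoldCF,
connector, or HttpClient code: the class and the helpers charsetFor and decode are
hypothetical, and the ISO-8859-1 fallback is an assumption chosen to mimic a client
that uses that default when the response header declares no charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

public class CharsetMismatchDemo {

    // Hypothetical helper: return the charset named in a Content-Type value,
    // or ISO-8859-1 (assumed fallback) when no charset parameter is present.
    static Charset charsetFor(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    return Charset.forName(name);
                }
            }
        }
        return StandardCharsets.ISO_8859_1;
    }

    // Decode a response body using whatever charset the header implies.
    static String decode(byte[] body, String contentType) {
        return new String(body, charsetFor(contentType));
    }

    public static void main(String[] args) {
        // The server actually sends UTF-8 bytes for U+00FC (u with diaeresis).
        byte[] body = "\u00fc".getBytes(StandardCharsets.UTF_8);

        // Header without a charset: the two UTF-8 bytes are read as two Latin-1 characters.
        String garbled = decode(body, "application/json");
        // Header with the correct charset: the umlaut survives.
        String intact = decode(body, "application/json; charset=UTF-8");

        System.out.println(garbled.equals("\u00c3\u00bc")); // prints true
        System.out.println(intact.equals("\u00fc"));        // prints true
    }
}
```

If the wire log shows a response Content-Type without charset=UTF-8 for a page
containing umlauts, this kind of mismatch would be a plausible explanation.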



> Confluence umlauts broken
> -------------------------
>
>                 Key: CONNECTORS-1251
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1251
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Confluence connector
>    Affects Versions: ManifoldCF 2.2
>         Environment: Ubuntu Linux 14.04
> Java 1.8.0_51-b16
> Tomcat 7.0.52
>            Reporter: Jens Grassel
>            Assignee: Antonio David Pérez Morales
>              Labels: umlauts, unicode
>             Fix For: ManifoldCF 2.3
>
>
> Hi,
> I've noticed that the Confluence connector seems to be unable to cope with special characters
> like umlauts (ä, ö, ü, etc.). In our index they are broken; for example, {{ü}} becomes {{Ã¼}}.
> I tried to pipe the extracted content through the Tika extractor, but the result was the
> same.
> Regards,
> Jens



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
