manifoldcf-dev mailing list archives

From "Karl Wright (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CONNECTORS-1251) Confluence umlauts broken
Date Wed, 09 Dec 2015 08:38:11 GMT

    [ https://issues.apache.org/jira/browse/CONNECTORS-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033888#comment-15033888 ]

Karl Wright edited comment on CONNECTORS-1251 at 12/9/15 8:37 AM:
------------------------------------------------------------------

[~jan0sch] Well, with this change, I can find nothing specifically wrong with the implementation.
The only remaining possibility is that Confluence mis-sets its Content-Type header when it responds
to REST requests, which would of course confuse HttpClient.
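
To illustrate the kind of breakage a wrong or missing charset would cause, here is a minimal Java sketch. It is not connector code: the class and method names are made up for illustration, and it assumes the receiving side decodes the body as ISO-8859-1 because it was not told the body is UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of ManifoldCF or HttpClient.
class MojibakeDemo {
    /**
     * Encodes the text as UTF-8 (what the server is assumed to send) and
     * decodes the bytes with the charset the client believes is in use.
     */
    static String roundTrip(String text, Charset assumedByClient) {
        byte[] wire = text.getBytes(StandardCharsets.UTF_8);
        return new String(wire, assumedByClient);
    }

    public static void main(String[] args) {
        String original = "\u00FC"; // the letter u with umlaut

        // Wrong assumption: each UTF-8 byte becomes its own character.
        String wrong = roundTrip(original, StandardCharsets.ISO_8859_1);
        // Right assumption: the two bytes decode back to one character.
        String right = roundTrip(original, StandardCharsets.UTF_8);

        System.out.println("wrong length = " + wrong.length()); // 2
        System.out.println("right length = " + right.length()); // 1
        System.out.println("wrong is U+00C3 U+00BC: "
                + wrong.equals("\u00C3\u00BC"));
        System.out.println("right equals original: "
                + right.equals(original));
    }
}
```

If the symptom in the index matches the two-character form above, that would point at a charset mismatch somewhere between the server response and the decoding step, though it would not say which side is at fault.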

Can you turn on httpclient wire logging and verify what the headers are for the request/response
to a page that has umlauts on it?  There's a wiki page on how to do that with MCF; let me
know if you have trouble.
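
When reading the wire log, the thing to look for is the charset parameter on the response's Content-Type header. A rough sketch of that check in plain Java follows. The class name, method name and sample header values are hypothetical, and the fallback charset is passed in by the caller; this is not a description of what HttpClient or the connector does internally.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper for inspecting a Content-Type header value by hand.
class ContentTypeCharset {
    /**
     * Returns the charset declared in a Content-Type header value, or the
     * given fallback if none is declared or the declared name is unusable.
     */
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (!name.equals("charset")) {
                continue;
            }
            String value = param.substring(eq + 1).trim();
            // Parameter values may be quoted.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // Illegal or unsupported charset name.
                return fallback;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1; // assumed fallback
        System.out.println(charsetOf("application/json;charset=UTF-8", fallback));
        System.out.println(charsetOf("application/json", fallback));
    }
}
```

A header without a charset parameter, or with one that does not match the actual encoding of the body, would be the interesting case to report back.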

Also, FWIW, we're trying to start the release process for 2.3 this weekend, so it would be
good to chase this problem down before then if possible.

Thanks!




was (Author: kwright@metacarta.com):
Well, with this change, I can find nothing specifically wrong with the implementation.  The
only possibility would be if Confluence mis-sets its Content-Type header when it responds
to REST requests, and then of course that confuses HttpClient.

Can you turn on httpclient wire logging and verify what the headers are for the request/response
to a page that has umlauts on it?  There's a wiki page on how to do that with MCF; let me
know if you have trouble.  Thanks!



> Confluence umlauts broken
> -------------------------
>
>                 Key: CONNECTORS-1251
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1251
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: Confluence connector
>    Affects Versions: ManifoldCF 2.2
>         Environment: Ubuntu Linux 14.04
> Java 1.8.0_51-b16
> Tomcat 7.0.52
>            Reporter: Jens Grassel
>            Assignee: Antonio David Pérez Morales
>              Labels: umlauts, unicode
>             Fix For: ManifoldCF 2.3
>
>
> Hi,
> I've noticed that the Confluence connector seems to be unable to cope with special characters like umlauts (ä, ö, ü, etc.). In our index they are broken; for example, {{ü}} becomes {{ü}}.
> I tried to pipe the extracted content through the Tika extractor, but the result was the same.
> Regards,
> Jens



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)