jackrabbit-users mailing list archives

From Woonsan Ko <woon...@apache.org>
Subject Re: Node name with non-ASCII characters when using davex remoting
Date Fri, 08 Jun 2018 16:17:57 GMT
On Fri, Jun 8, 2018 at 5:18 AM, Mathieu Baudier <mbaudier@argeo.org> wrote:
> Hello,
>
> I am facing a problem when trying to create a node (of type nt:file) via
> davex remoting when the node name contains non-ASCII characters (e.g. Zones
> Géographiques.pdf in my case). We are using Jackrabbit v2.16.0.
>
> I made sure that both the server and the client use UTF-8 as default
> charset and I have debugged both sides in depth. It seems to me that the
> problem is on the client side when the diff is sent as multipart form data.
>
> In org.apache.jackrabbit.spi2davex.RepositoryServiceImpl, the StringBuilder
> 'buf' looks good (line ~610):

I have neither experience with nor knowledge of the JCR remoting module.
But as you have already looked into the code ;-), perhaps you can try
this simple one-liner change:

            MultipartEntityBuilder b =
                    MultipartEntityBuilder.create().setCharset(Charset.forName("UTF-8"));

Even though Utils.addPart(...) specifies the charset for each MIME part,
MultipartEntityBuilder seems to ignore it in the end unless the charset
is also set on the builder itself, as above.
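
As a minimal sketch of why a US-ASCII multipart charset would produce the
'e?' seen in the quoted field names: the JDK replaces every character it
cannot encode in US-ASCII with '?'. The example uses only the JDK, not
HttpClient, and it assumes (this is a guess, not something confirmed in
this thread) that the file name reached the client in decomposed (NFD)
form, i.e. 'e' followed by a combining acute accent. The class and method
names are illustrative and are not part of Jackrabbit.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Illustrative only: shows what happens to a non-ASCII name when it is
// encoded with US-ASCII versus UTF-8.
class FieldNameCharsetDemo {

    // Encode the string to bytes with the given charset and decode it again.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for US-ASCII.
    static String roundTrip(String s, Charset cs) {
        return new String(s.getBytes(cs), cs);
    }

    public static void main(String[] args) {
        // Precomposed form: 'é' is the single code point U+00E9.
        String nfc = "Zones G\u00e9ographiques.pdf";
        // Decomposed form (assumption): 'e' followed by U+0301.
        String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);

        // US-ASCII loses the accent; only the combining mark becomes '?'.
        System.out.println(roundTrip(nfd, StandardCharsets.US_ASCII));
        // With the precomposed form the whole 'é' becomes '?'.
        System.out.println(roundTrip(nfc, StandardCharsets.US_ASCII));
        // UTF-8 preserves the name unchanged.
        System.out.println(roundTrip(nfd, StandardCharsets.UTF_8).equals(nfd));
    }
}
```

If the assumption holds, the first line printed matches the mangled name
the server reports, which is consistent with the multipart headers having
been written as US-ASCII.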

Regards,

Woonsan

>
> +/home/root/Documents/old/Zones Géographiques.pdf :
> {"jcr:primaryType":"nt:file"}
> ^/home/root/Documents/old/Zones Géographiques.pdf/jcr:mixinTypes : []
> +/home/root/Documents/old/Zones Géographiques.pdf/jcr:content :
> {"jcr:primaryType":"nt:resource"}
> ^/home/root/Documents/old/Zones Géographiques.pdf/jcr:content/jcr:data :
>
> But the HttpPost 'request' to which it is added (via a
> MultipartEntityBuilder) shows that request.entity.multipart.charset is
> US_ASCII.
>
> When it is received on the server side, the FileItem(s) created in
> org.apache.jackrabbit.server.util.HttpMultipartPost are shown as having
> (line ~78):
> fileItem.contentType is 'jcr-value/name; charset=UTF-8'
> fieldName is '/home/root/Documents/old/Zones Ge?ographiques.pdf/jcr:mixinTypes'
> (note the 'e?' instead of 'é')
>
> Further fieldName values:
> /home/root/Documents/old/Zones Ge?ographiques.pdf/jcr:mixinTypes
> /home/root/Documents/old/Zones Ge?ographiques.pdf/jcr:content/jcr:data
> diff:
>
> It then fails with:
>
> javax.jcr.nodetype.ConstraintViolationException: /home/root/Documents/old/Zones Géographiques.pdf/jcr:content: mandatory property {http://www.jcp.org/jcr/1.0}data does not exist
>     at org.apache.jackrabbit.core.ItemSaveOperation.validateTransientItems(ItemSaveOperation.java:537)
>     at org.apache.jackrabbit.core.ItemSaveOperation.perform(ItemSaveOperation.java:216)
>     at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:216)
>     at org.apache.jackrabbit.core.ItemImpl.perform(ItemImpl.java:91)
>     at org.apache.jackrabbit.core.ItemImpl.save(ItemImpl.java:329)
>     at org.apache.jackrabbit.core.session.SessionSaveOperation.perform(SessionSaveOperation.java:65)
>     at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:216)
>     at org.apache.jackrabbit.core.SessionImpl.perform(SessionImpl.java:363)
>     at org.apache.jackrabbit.core.SessionImpl.save(SessionImpl.java:852)
>     at org.apache.jackrabbit.server.remoting.davex.JcrRemotingServlet.processDiff(JcrRemotingServlet.java:562)
>     at org.apache.jackrabbit.server.remoting.davex.JcrRemotingServlet.doPost(JcrRemotingServlet.java:427)
>     at org.apache.jackrabbit.webdav.server.AbstractWebdavServlet.execute(AbstractWebdavServlet.java:368)
>     at org.apache.jackrabbit.webdav.server.AbstractWebdavServlet.service(AbstractWebdavServlet.java:305)
>     ...
>
> The same file renamed without non-ASCII characters is properly imported
> without any problem.
>
> I have searched the web and JIRA but did not find anything. Is it a known
> issue?
> Am I missing something in the configuration?
> Would anyone have an idea of a workaround which does not require patching,
> like forcing the default charset for the Apache Http Client via System
> properties or equivalent?
>
> I wanted to first check with the mailing list before filing a bug, but I
> am of course happy to prepare one with some simple code reproducing the
> issue (I am here in the middle of a non-trivial system, and I am wondering
> whether this could be a side-effect from other components).
>
> Thanks in advance for your advice,
>
> Mathieu
