chemistry-dev mailing list archives

From Florent Guillaume ...@nuxeo.com>
Subject Re: [jira] [Assigned] (CMIS-344) Query parser should not use UTF-8 encoding
Date Thu, 31 Mar 2011 12:17:22 GMT
Note though that SELECT * FROM cmis:document WHERE CONTAINS
('\u4E2D\u6587') isn't actually legal CMISQL, as currently CMISQL has
no notion of Unicode escaping. The query would have to contain actual
Unicode characters.
NB: Unicode escaping is only specified in SQL-2008, not SQL-92. See
this for a summary:
http://hsqldb.org/doc/2.0/guide/dataaccess-chapt.html#N11E65
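
To illustrate the distinction, here is a minimal Java sketch. The class and
member names are made up for the example and are not part of OpenCMIS. It only
shows what the statement string contains by the time a parser would see it:
either the two real characters, or the twelve ASCII characters of the escape
text, which CMISQL would not interpret.

```java
public class CmisqlEscapeDemo {
    // In Java source the compiler translates the Unicode escapes, so this
    // statement already contains the two actual CJK characters.
    static final String ACTUAL =
            "SELECT * FROM cmis:document WHERE CONTAINS ('\u4E2D\u6587')";

    // With doubled backslashes the statement carries the escape text itself
    // (backslash, 'u', four hex digits, twice), i.e. twelve ASCII characters.
    static final String ESCAPE_TEXT =
            "SELECT * FROM cmis:document WHERE CONTAINS ('\\u4E2D\\u6587')";

    // True if the statement contains the first of the two real characters.
    static boolean containsRealCharacter(String statement) {
        return statement.indexOf('\u4E2D') >= 0;
    }

    public static void main(String[] args) {
        System.out.println(containsRealCharacter(ACTUAL));
        System.out.println(containsRealCharacter(ESCAPE_TEXT));
        System.out.println(ESCAPE_TEXT.length() - ACTUAL.length());
    }
}
```

Only the first form is a statement a CMISQL parser can match against the
intended text; the second would search for the literal backslash sequence.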

Florent

On Thu, Mar 31, 2011 at 2:00 PM, Florent Guillaume <fg@nuxeo.com> wrote:
> No objection, I probably wasn't aware of ANTLRStringStream when I
> wrote that code.
>
> Florent
>
> On Thu, Mar 31, 2011 at 12:47 PM, Jens Hübel <jhuebel@opentext.com> wrote:
>> Florent,
>>
>> as far as I remember this code came originally from your side. Would you have any
>> objections to apply the proposed patch? Would this break something on your side?
>>
>> Jens
>>
>>
>>
>> -----Original Message-----
>> From: Jens Hübel (JIRA) [mailto:jira@apache.org]
>> Sent: Thursday, 31 March 2011 12:42
>> To: dev@chemistry.apache.org
>> Subject: [jira] [Assigned] (CMIS-344) Query parser should not use UTF-8 encoding
>>
>>
>>     [ https://issues.apache.org/jira/browse/CMIS-344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>>
>> Jens Hübel reassigned CMIS-344:
>> -------------------------------
>>
>>    Assignee: Jens Hübel
>>
>>> Query parser should not use UTF-8 encoding
>>> ------------------------------------------
>>>
>>>                 Key: CMIS-344
>>>                 URL: https://issues.apache.org/jira/browse/CMIS-344
>>>             Project: Chemistry
>>>          Issue Type: Bug
>>>          Components: opencmis-server
>>>    Affects Versions: OpenCMIS 0.4.0
>>>            Reporter: Michael Dürig
>>>            Assignee: Jens Hübel
>>>         Attachments: CMIS-344.patch
>>>
>>>
>>> QueryUtil converts the query statement to a UTF-8 encoded byte array which is
>>> used as input to the lexer instead of using the string directly.
>>> Instead of
>>>     CharStream input = new ANTLRInputStream(new ByteArrayInputStream(statement.getBytes("UTF-8")));
>>> the input stream should be obtained like this:
>>>     CharStream input = new ANTLRStringStream(statement);
>>> The former method transforms the characters in the contains clause of the query
>>>     SELECT * FROM cmis:document WHERE CONTAINS ('\u4E2D\u6587')
>>> in an incorrect way.
>>
>> --
>> This message is automatically generated by JIRA.
>> For more information on JIRA, see: http://www.atlassian.com/software/jira
>>
>
>
>
> --
> Florent Guillaume, Director of R&D, Nuxeo
> Open Source, Java EE based, Enterprise Content Management (ECM)
> http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87
>



-- 
Florent Guillaume, Director of R&D, Nuxeo
Open Source, Java EE based, Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87
