esme-dev mailing list archives

From Xuefeng Wu <ben...@gmail.com>
Subject Re: ESME-26 The message parser should ignore # in urls
Date Thu, 15 Oct 2009 09:59:41 GMT
Why is it named *escape*? Could anyone explain?

On Thu, Oct 15, 2009 at 5:51 PM, Xuefeng Wu <benewu@gmail.com> wrote:

> I think I found where the problem is.
> My earlier guess was wrong: '%' is not special in Scala itself, only
> in MsgParser.
>
> I found this code:
>   lazy val hex: Parser[Elem] = elem("Hex", c => (c >= '0' && c <= '9') ||
>                                     (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
>
>   lazy val escape: Parser[Elem] = '%' ~> hex ~ hex ^^ {
>     case high ~ low => Integer.parseInt(high.toString + low.toString, 16).toChar
>   }
>
> I guess escape tries to turn a hex code into a character.
>
> I think hex should be:
>   lazy val hex: Parser[Elem] = elem("Hex", c => (c >= '0' && c <= '9') ||
>                                     (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F'))
> The highest hex letter is 'f', not 'z'.
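To illustrate the point about the character range, here is a minimal sketch in plain Scala, without parser combinators. The object name HexCheck and both method names are hypothetical and not part of MsgParser; the two predicates only mirror the original range and the proposed one.

```scala
// Hypothetical helper object, for illustration only.
object HexCheck {
  // Range from the quoted code: also accepts g-z and G-Z.
  def isHexDigitOld(c: Char): Boolean =
    (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')

  // Proposed range: only 0-9, a-f and A-F.
  def isHexDigit(c: Char): Boolean =
    (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')
}
```

With the wider range, a pair such as "kk" passes the predicate but is then rejected by Integer.parseInt(..., 16) with a NumberFormatException, so the narrower range presumably lets the parser fail cleanly instead.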
>
> And I modified hsegment:
>    lazy val hsegment: Parser[String] = rep(uchar | ';' | ':' | '@' | '&' |
>      '=' | '#' | '~' | '%') ^^ {_.mkString}
>
> ESME can accept '%' in a URL now.
>
> But this is not a complete fix.
>
> 1. It still cannot accept Chinese characters in a URL. This is not very
> important, because few URLs include raw Chinese characters; they are usually
> encoded as UTF-8 or another encoding.
>
> The second problem is the important one.
> 2. escape cannot tell when to decode, and it does not decode correctly.
> Sorry for my poor English. For example, after the change:
> 1. ESME accepts: http://www.google.com/%kk
>
> 2. When it gets http://www.google.com/%E8%AE%BE%E8%AE%A1,
> escape tries to translate each hex pair into a character, but it fails because
> a Chinese character is encoded as more than one byte.
> The result is a totally wrong URL: http://www.google.com/è®¾è®¡
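The garbled result can be reproduced outside the parser. The sketch below is an assumption-laden illustration, not the ESME code: PercentDecodeSketch and its methods are hypothetical names, and the input is assumed to consist only of well-formed %XX triplets. It contrasts turning each pair into its own Char, as the quoted escape rule does, with collecting the bytes first and decoding them together as UTF-8.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical helper object, for illustration only.
object PercentDecodeSketch {
  // Extract the raw byte values from a string made only of %XX triplets.
  def escapedBytes(s: String): Array[Byte] =
    s.split('%').filter(_.nonEmpty).map(h => Integer.parseInt(h, 16).toByte)

  // One Char per %XX pair, which is what the quoted escape rule produces.
  def perByte(s: String): String =
    escapedBytes(s).map(b => (b & 0xff).toChar).mkString

  // Gather all bytes, then decode the whole sequence as UTF-8.
  def asUtf8(s: String): String =
    new String(escapedBytes(s), StandardCharsets.UTF_8)
}

// PercentDecodeSketch.perByte("%E8%AE%BE%E8%AE%A1")  gives "è®¾è®¡"
// PercentDecodeSketch.asUtf8("%E8%AE%BE%E8%AE%A1")   gives "设计"
```

Each of the two Chinese characters occupies three bytes in UTF-8, so any rule that maps one %XX pair to one character cannot reconstruct them.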
>
> 3. When it gets:
>
> https://issues.apache.org/jira/browse/ESME-26?focusedCommentId=12764422&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12764422
>
> escape tries to decode "%3A". We know that is wrong here, but escape does not.
>
> I think escape is too weak to decode these codes; I suggest making it more
> powerful and robust.
> Maybe we need an independent utility object, or maybe Lift already handles this
> task.
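One possible shape for such a utility object is sketched below. It is only an assumption about how the idea could look, not what ESME or Lift does: UrlDisplay and readable are hypothetical names, and the sketch relies on the standard java.net.URLDecoder, which performs form decoding and therefore also turns '+' into a space.

```scala
import java.net.URLDecoder
import scala.util.Try

// Hypothetical utility object; the name and method are illustrative only.
object UrlDisplay {
  // Returns a readable form of the URL for display purposes.
  // A malformed escape such as %kk makes URLDecoder throw
  // IllegalArgumentException, in which case the text is returned unchanged.
  def readable(url: String): String =
    Try(URLDecoder.decode(url, "UTF-8")).getOrElse(url)
}
```

Under this design the parser would keep the raw URL text, including sequences like %3A, as the link target and use the decoded form only when showing the URL to a reader.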
>
>
> I submitted this patch although it's not perfect.
>
>
> On Thu, Oct 15, 2009 at 3:56 PM, Xuefeng Wu <benewu@gmail.com> wrote:
>
>> I did a little more testing. I entered this message:
>> http://www.google.com/search?&q=设计
>> The message parser did not treat '设计' as part of the URL.
>>
>>
>> On Thu, Oct 15, 2009 at 1:49 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>>
>>> Obviously, we need to look at the message parsing in more detail.
>>> There appear to be a variety of problems.
>>>
>>> @Xuefeng I'm glad you are on the team and can test using Chinese
>>> characters.
>>>
>>> On Thu, Oct 15, 2009 at 6:41 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>>> > The big problem is that this is related to encoding.
>>> > When I paste http://www.google.com/search?&q=设计 in the message box,
>>> > I get: http://www.google.com/search?&q=%E8%AE%BE%E8%AE%A1
>>> > After I post this message, ESME shows:
>>> > http://www.google.com/search?&q=è®¾è®¡ <http://localhost:8080/u/CRHNGPZKN5C12WPN>
>>> >
>>> > As you can see, it is completely garbled.
>>> >
>>> > "设计" is a Chinese word meaning "design"; the meaning does not matter here.
>>> >
>>> > P.S. Gmail cannot parse
>>> > http://www.google.com/search?&q=%E8%AE%BE%E8%AE%A1 into a URL.
>>> > On Wed, Oct 14, 2009 at 5:21 PM, Xuefeng Wu <benewu@gmail.com> wrote:
>>> >
>>> >> Scala tries to parse it.
>>> >>
>>> >>
>>> >> On Wed, Oct 14, 2009 at 5:18 PM, Xuefeng Wu <benewu@gmail.com> wrote:
>>> >>
>>> >>> I'm having big trouble with '%'. % is very special in Scala.
>>> >>> %55 is another character.
>>> >>>
>>> >>>
>>> >>> On Wed, Oct 14, 2009 at 5:03 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>>> >>>
>>> >>>> Yes, but the "%" is a special character that is present in more places
>>> >>>> in the Scala file.
>>> >>>>
>>> >>>> On Wed, Oct 14, 2009 at 10:56 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>>> >>>> > Maybe it doesn't support %
>>> >>>> >
>>> >>>> > On Wed, Oct 14, 2009 at 4:54 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>>> >>>> >
>>> >>>> >> I tried some URLs and they work, but others still have problems.
>>> >>>> >>
>>> >>>> >> For example, this URL still causes problems:
>>> >>>> >>
>>> >>>> >> https://issues.apache.org/jira/browse/ESME-26?focusedCommentId=12764422&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12764422
>>> >>>> >>
>>> >>>> >> This URL is OK:
>>> >>>> >>
>>> >>>> >> https://issues.apache.org:443/jira/browse/ESME-26#Action_12765458
>>> >>>> >>
>>> >>>> >> D.
>>> >>>> >>
>>> >>>> >> On Wed, Oct 14, 2009 at 10:40 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>>> >>>> >> > Thank you
>>> >>>> >> >
>>> >>>> >> > On Wed, Oct 14, 2009 at 4:36 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>>> >>>> >> >
>>> >>>> >> >> Just the code itself at this point - sorry.
>>> >>>> >> >>
>>> >>>> >> >> I'm trying out your patch right now.
>>> >>>> >> >>
>>> >>>> >> >> D.
>>> >>>> >> >>
>>> >>>> >> >> On Wed, Oct 14, 2009 at 10:28 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>>> >>>> >> >> > Hi,
>>> >>>> >> >> > I think I found how to resolve this issue and added a patch.
>>> >>>> >> >> >
>>> >>>> >> >> > https://issues.apache.org/jira/browse/ESME-26
>>> >>>> >> >> >
>>> >>>> >> >> > But I'm not sure how it works; I don't understand parser combinators.
>>> >>>> >> >> > Are there any resources for learning them?
>>> >>>> >> >> >
>>> >>>> >> >> > --
>>> >>>> >> >> > Global R&D Center,Shanghai China,Carestream Health, Inc.
>>> >>>> >> >> > Tel:(86-21)3852 6101



-- 
Global R&D Center,Shanghai China,Carestream Health, Inc.
Tel:(86-21)3852 6101
