incubator-esme-dev mailing list archives

From Xuefeng Wu <ben...@gmail.com>
Subject Re: ESME-26 The message parser should ignore # in urls
Date Thu, 15 Oct 2009 09:51:40 GMT
I think I was looking in the wrong place.
My earlier guess was wrong: '%' is not special in Scala itself, but it is
handled specially in MsgParser.

I found this code:
  lazy val hex: Parser[Elem] = elem("Hex", c => (c >= '0' && c <= '9') ||
                                    (c >= 'a' && c <= 'z') ||
                                    (c >= 'A' && c <= 'Z'))

  lazy val escape: Parser[Elem] = '%' ~> hex ~ hex ^^ {
    case high ~ low =>
      Integer.parseInt(high.toString + low.toString, 16).toChar
  }

I guess escape tries to turn a percent-encoded sequence into a character.

I think hex should be:
  lazy val hex: Parser[Elem] = elem("Hex", c => (c >= '0' && c <= '9') ||
                                    (c >= 'a' && c <= 'f') ||
                                    (c >= 'A' && c <= 'F'))
Hex digits only go up to 'f', not 'z'.
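
To illustrate why the range matters: with the wider a-z range, a sequence such as %kk passes the hex check and then reaches Integer.parseInt, which rejects it. This is only a minimal sketch. The names HexCheck, isHexDigit and decodePair are made up for the example and are not part of MsgParser.

```scala
import scala.util.Try

object HexCheck {
  // Hypothetical helper: true only for 0-9, a-f and A-F.
  def isHexDigit(c: Char): Boolean =
    (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F')

  // Mirrors what escape does with two characters that passed the hex check.
  // Wrapped in Try because parseInt throws NumberFormatException on non-hex input.
  def decodePair(high: Char, low: Char): Try[Char] =
    Try(Integer.parseInt(high.toString + low.toString, 16).toChar)
}
```

Here decodePair('k', 'k') is a Failure, while decodePair('3', 'A') succeeds with ':'.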

And I modified hsegment:
  lazy val hsegment: Parser[String] =
    rep(uchar | ';' | ':' | '@' | '&' | '=' | '#' | '~' | '%') ^^ {_.mkString}

With this change, ESME accepts '%' in URLs.

But this is not the final fix.

1. It still cannot accept Chinese characters in a URL. That is not very
important, because few URLs include raw Chinese characters; they are normally
encoded as UTF-8 or another encoding.

Point 2 is the important one.
2. escape cannot tell when it should decode, and it does not decode correctly.
Sorry for my poor English. For example, after the change:
1. ESME accepts: http://www.google.com/%kk

2. When it gets http://www.google.com/%E8%AE%BE%E8%AE%A1,
escape tries to decode each %XX pair into a character on its own, but that
fails because a Chinese character is encoded as several bytes.
The result is a completely wrong URL: http://www.google.com/è®¾è®¡

3. When it gets:
https://issues.apache.org/jira/browse/ESME-26?focusedCommentId=12764422&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12764422
escape tries to decode "%3A". We know that is wrong here, but the parser
does not.

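The second example above can be reproduced outside the parser. The sketch below is not the MsgParser code. Utf8Escapes, decodePerEscape and decodeUtf8 are hypothetical names, and it assumes the escapes are UTF-8, which holds for the Google URL in this thread but not for every URL.

```scala
import java.nio.charset.StandardCharsets
import scala.util.matching.Regex

object Utf8Escapes {
  private val SingleEscape = "%([0-9a-fA-F]{2})".r
  private val EscapeRun = "(?:%[0-9a-fA-F]{2})+".r

  // What a per-escape `.toChar` produces: one character per %XX pair.
  def decodePerEscape(s: String): String =
    SingleEscape.replaceAllIn(s, m =>
      Regex.quoteReplacement(Integer.parseInt(m.group(1), 16).toChar.toString))

  // Hypothetical alternative: collect a whole run of %XX pairs as bytes
  // and decode them together as UTF-8. Invalid byte sequences come out
  // as U+FFFD instead of throwing.
  def decodeUtf8(s: String): String =
    EscapeRun.replaceAllIn(s, m => {
      val bytes = m.matched.split('%').filter(_.nonEmpty)
        .map(hex => Integer.parseInt(hex, 16).toByte)
      Regex.quoteReplacement(new String(bytes, StandardCharsets.UTF_8))
    })
}
```

For http://www.google.com/%E8%AE%BE%E8%AE%A1, decodePerEscape gives the garbled six-character form, while decodeUtf8 gives http://www.google.com/设计.
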
I think escape is too weak to handle encoded characters. My suggestion is to
make it more powerful and robust.
Maybe we need an independent utility object, or maybe Lift already has
something for this task.
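
One possible shape for such a utility object, assuming the goal is to keep the URL text exactly as typed and only reject malformed escapes such as %kk. UrlEscapes and hasOnlyValidEscapes are hypothetical names, not existing ESME or Lift APIs.

```scala
object UrlEscapes {
  // Plain text, then any number of (%XX followed by plain text).
  private val WellFormed = "[^%]*(?:%[0-9a-fA-F]{2}[^%]*)*".r

  // True when every '%' is followed by exactly two hex digits.
  // Nothing is decoded, so "%3A" stays "%3A" in the stored URL.
  def hasOnlyValidEscapes(url: String): Boolean =
    WellFormed.pattern.matcher(url).matches()
}
```

With this check, the %kk URL is rejected, and the JIRA URL containing %3A is accepted without being changed.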


I submitted this patch although it's not perfect.


On Thu, Oct 15, 2009 at 3:56 PM, Xuefeng Wu <benewu@gmail.com> wrote:

> I did a little more testing. I entered this message:
> http://www.google.com/search?&q=设计
> The parser did not treat '设计' as part of the URL.
>
>
> On Thu, Oct 15, 2009 at 1:49 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>
>> Obviously, we need to look at the message parsing in more detail.
>> There appear to be a variety of problems.
>>
>> @Xuefeng I'm glad you are on the team and can test using Chinese
>> characters.
>>
>> On Thu, Oct 15, 2009 at 6:41 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>> > The big trouble is that it is related to encoding.
>> > When I paste http://www.google.com/search?&q=设计 in the message box,
>> > I get: http://www.google.com/search?&q=%E8%AE%BE%E8%AE%A1
>> > After updating this message, ESME shows:
>> > http://www.google.com/search?&q=è®¾è®¡ <http://localhost:8080/u/CRHNGPZKN5C12WPN>
>> >
>> > As you see, it's totally garbled.
>> >
>> > "设计" is Chinese and means "design"; please ignore the meaning.
>> >
>> > P.S. Gmail cannot parse
>> > http://www.google.com/search?&q=%E8%AE%BE%E8%AE%A1 into a URL.
>> >
>> > On Wed, Oct 14, 2009 at 5:21 PM, Xuefeng Wu <benewu@gmail.com> wrote:
>> >
>> >> Scala tries to parse it.
>> >>
>> >>
>> >> On Wed, Oct 14, 2009 at 5:18 PM, Xuefeng Wu <benewu@gmail.com> wrote:
>> >>
>> >>> I have big trouble with '%'. % is very special in Scala.
>> >>> %55 is another character.
>> >>>
>> >>>
>> >>> On Wed, Oct 14, 2009 at 5:03 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>> >>>
>> >>>> Yes but the "%" is a special character that is present in more places
>> >>>> in the scala file.
>> >>>>
>> >>>> On Wed, Oct 14, 2009 at 10:56 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>> >>>> > Maybe it doesn't support %
>> >>>> >
>> >>>> > On Wed, Oct 14, 2009 at 4:54 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>> >>>> >
>> >>>> >> I tried some urls and they work but others still have problems.
>> >>>> >>
>> >>>> >> For example, this URL still causes problems:
>> >>>> >>
>> >>>> >>
>> >>>> >>
>> >>>> >> https://issues.apache.org/jira/browse/ESME-26?focusedCommentId=12764422&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12764422
>> >>>> >>
>> >>>> >> This URL is OK:
>> >>>> >>
>> >>>> >> https://issues.apache.org:443/jira/browse/ESME-26#Action_12765458
>> >>>> >>
>> >>>> >> D.
>> >>>> >>
>> >>>> >> On Wed, Oct 14, 2009 at 10:40 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>> >>>> >> > Thank you
>> >>>> >> >
>> >>>> >> > On Wed, Oct 14, 2009 at 4:36 PM, Richard Hirsch <hirsch.dick@gmail.com> wrote:
>> >>>> >> >
>> >>>> >> >> Just the code itself at this point - sorry.
>> >>>> >> >>
>> >>>> >> >> I'm trying out your patch right now.
>> >>>> >> >>
>> >>>> >> >> D.
>> >>>> >> >>
>> >>>> >> >> On Wed, Oct 14, 2009 at 10:28 AM, Xuefeng Wu <benewu@gmail.com> wrote:
>> >>>> >> >> > Hi,
>> >>>> >> >> > I think I found how to resolve this issue and added a patch.
>> >>>> >> >> >
>> >>>> >> >> > https://issues.apache.org/jira/browse/ESME-26
>> >>>> >> >> >
>> >>>> >> >> > But I'm not sure how it works; I don't understand parser combinators.
>> >>>> >> >> > Are there any resources for learning them?
>> >>>> >> >> >
>> >>>> >> >> > --
>> >>>> >> >> > Global R&D Center,Shanghai China,Carestream Health, Inc.
>> >>>> >> >> > Tel:(86-21)3852 6101
>> >>>> >> >> >
>> >>>> >> >>
>> >>>> >> >
>> >>>> >> >
>> >>>> >> >
>> >>>> >> >
>> >>>> >>
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>> >
>> >>>>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >>
>> >>
>> >
>> >
>> >
>> >
>>
>
>
>
>



-- 
Global R&D Center,Shanghai China,Carestream Health, Inc.
Tel:(86-21)3852 6101
