harmony-dev mailing list archives

From "Stepan Mishura" <stepan.mish...@gmail.com>
Subject Re: [classlib][html] Please evaluate proposed ASN.1 notation for HTML DTD
Date Wed, 23 Aug 2006 10:10:55 GMT
Hi Miguel,

I've looked through the proposed ASN.1 notation and it looks OK to me. I have
only a few comments.
(However, I don't know all the details of the DTD, i.e. I have not checked
whether your notation represents the DTD correctly, so I'll comment only on
the proposed ASN.1 notation.)

BTW, I've changed the subject; I hope you don't mind.

General remark: the name of a component of a SEQUENCE (OF) or SET (OF) should
start with a lower-case letter (e.g. "name UTF8String" rather than
"Name UTF8String").

My other comments are below.

On 8/23/06, Miguel Montes wrote:

> Hi:
> We are working on the html parser, and need to have working DTD. The current
> implementation of DTD.read(), based on serialization, has some problems, and
> I think we should have a well defined binary format. I suggest the following
> ASN.1 format, and if there is consensus on it, we could contribute the code
> to read and write it.
> I would like to hear the opinion of Stepan and anyone who has worked with
> ASN.1 before.
>
> BDTD ::= SEQUENCE {
>       Name UTF8String,
>       Entity SET OF HTMLEntity,
>       Element SET OF HTMLElement
> }
>
> HTMLEntity ::= SEQUENCE {
>       Name UTF8String,
>       Value INTEGER,
>       General BOOLEAN DEFAULT FALSE,
>       Parameter BOOLEAN DEFAULT FALSE,
>       Data UTF8String
> }


This won't work. I'll try to explain. We have 2 DEFAULT components here. If
a component is declared as DEFAULT then it is also optional and may be
omitted from the encoding. A decoder can detect which component is missing
only if, within a block of OPTIONAL/DEFAULT components plus the next mandatory
component, all elements have distinct tags.
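
The rule above can be sketched as a small check. This is only an illustration
in Python, not Harmony code: the function name and the (name, tag, optional)
tuple layout are assumptions made for the example. Tags are written as
(class, number) pairs, e.g. BOOLEAN is UNIVERSAL 1 and UTF8String is
UNIVERSAL 12.

```python
def find_tag_clashes(components):
    """Return pairs of component names whose tags collide.

    components: list of (name, tag, optional) in declaration order, where
    optional is True for OPTIONAL or DEFAULT components. Tags must be distinct
    within each run of optional components plus the next mandatory component.
    """
    clashes = []
    run = []  # optional components seen since the last mandatory one
    for name, tag, optional in components:
        for prev_name, prev_tag in run:
            if prev_tag == tag:
                clashes.append((prev_name, name))
        if optional:
            run.append((name, tag))
        else:
            run = []  # a mandatory component closes the run
    return clashes


# HTMLEntity as proposed: two untagged BOOLEAN DEFAULT components in a row.
html_entity = [
    ("name", ("UNIVERSAL", 12), False),
    ("value", ("UNIVERSAL", 2), False),
    ("general", ("UNIVERSAL", 1), True),
    ("parameter", ("UNIVERSAL", 1), True),
    ("data", ("UNIVERSAL", 12), False),
]
print(find_tag_clashes(html_entity))
```

For the definition as proposed, the check reports the pair
("general", "parameter").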

We have the following block:
general      BOOLEAN    DEFAULT FALSE
parameter    BOOLEAN    DEFAULT FALSE
data         UTF8String

So the 1st and 2nd elements are not distinct: both are BOOLEAN and therefore
carry the same tag. This can be fixed by tagging some elements. I'd use
implicit tagging, for example:

general                      BOOLEAN    DEFAULT FALSE
parameter    [0] IMPLICIT    BOOLEAN    DEFAULT FALSE

or

general      [0] IMPLICIT    BOOLEAN    DEFAULT FALSE
parameter    [1] IMPLICIT    BOOLEAN    DEFAULT FALSE
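
To make the problem concrete, here is a minimal Python sketch of how the two
flags would look on the wire in DER. It is not Harmony code: the helper names
are hypothetical, only the two flag components are encoded, and it relies on
two DER facts: a component equal to its DEFAULT value is omitted, and BOOLEAN
TRUE is encoded as the single content octet 0xFF.

```python
UNIVERSAL_BOOLEAN = 0x01  # tag octet of an untagged BOOLEAN
CONTEXT_0 = 0x80          # tag octet of [0] IMPLICIT on a primitive type
CONTEXT_1 = 0x81          # tag octet of [1] IMPLICIT on a primitive type


def encode_flags(general, parameter, general_tag, parameter_tag):
    """Encode both flags; a flag equal to its DEFAULT (FALSE) is omitted."""
    out = b""
    if general:
        out += bytes([general_tag, 0x01, 0xFF])
    if parameter:
        out += bytes([parameter_tag, 0x01, 0xFF])
    return out


def decode_flags(data, general_tag, parameter_tag):
    """Match tags in declaration order; an absent flag takes its DEFAULT."""
    general = parameter = False
    pos = 0
    if pos < len(data) and data[pos] == general_tag:
        general = data[pos + 2] != 0
        pos += 3
    if pos < len(data) and data[pos] == parameter_tag:
        parameter = data[pos + 2] != 0
        pos += 3
    return general, parameter


# Untagged: "only parameter set" and "only general set" encode identically,
# so the decoder cannot tell which component was omitted.
untagged = (UNIVERSAL_BOOLEAN, UNIVERSAL_BOOLEAN)
print(encode_flags(False, True, *untagged) == encode_flags(True, False, *untagged))

# With [0] and [1] IMPLICIT tags the value survives a round trip.
tagged = (CONTEXT_0, CONTEXT_1)
print(decode_flags(encode_flags(False, True, *tagged), *tagged))
```

With the untagged definition the decoder reads the single BOOLEAN as general,
whichever flag was actually set; with either tagging variant the tag octet
identifies the component.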

Thanks,
Stepan.

P.S. I'll let you know if I have more corrections.





> HTMLElement ::= SEQUENCE {
>       Index INTEGER,
>       Name UTF8String,
>       Type INTEGER,
>       OStart BOOLEAN,
>       OEnd BOOLEAN,
>       Exclusions SET OF INTEGER,
>       Inclusions SET OF INTEGER,
>       Attributes SET OF HTMLElementAttributes OPTIONAL,
>       ContentModel HTMLContentModel,
> }
>
> HTMLContentModel ::= SEQUENCE OF SEQUENCE {
>       Type INTEGER,
>       Index INTEGER
> }
>
> HTMLElementAttributes ::= SEQUENCE {
>       Name UTF8String,
>       Type INTEGER,
>       Modifier INTEGER,
>       DefaultValue UTF8String OPTIONAL,
>       PossibleValues SET OF UTF8String OPTIONAL
> }
> --
> Miguel Montes
>
>

