directory-commits mailing list archives

From directory-...@incubator.apache.org
Subject [Apache Directory Project Wiki] Updated: TLVPageInfo
Date Wed, 16 Feb 2005 11:24:24 GMT
   Date: 2005-02-16T03:24:24
   Editor: EmmanuelLecharny
   Wiki: Apache Directory Project Wiki
   Page: TLVPageInfo
   URL: http://wiki.apache.org/directory/TLVPageInfo

   no comment

Change Log:

------------------------------------------------------------------------------
@@ -8,14 +8,37 @@
  * The '''Length''' part may not give the '''Value''' length : it is called an indefinite
'''Length'''. Whatever, in this - not so frequent - case, the '''Value''' must end with a
specific terminator.
 
 === A quick sample ===
-Let's begin with a simple example, without too many explanations :
+Let's begin with a simple example, without too many explanations. This is the '''PDU''' ('''P'''rotocol
'''D'''ata '''U'''nit) of a '''BindRequest''' :
 
 attachment:TLVs.png
 
-We can see in this picture that you have what I called a first level TLV. It encapsulates
other TLVs. 
+We can see in this picture that you have what I called a first level TLV. It encapsulates
other TLVs. It's basically a stream of bytes.
 
 ==== Tag ====
-The '''Tag''' element value is 0x30, we will see later its meaning. 
+Each '''Tag''' contains information about the '''Value''' part of the '''TLV'''. It tells
whether the '''Value''' is primitive or constructed, gives the type of a '''primitive'''
value, and carries some contextual information. A '''Tag''' can be coded on more than one byte. The
first 3 bits give some contextual information about the tag, and the following 5 bits are
either a label or the beginning of a multi-byte label.
+
+Labels are numbers used to identify elements in a '''SET''' (see Asn.1 grammar), for instance.
Generally, we don't have to deal with labels above 30, which can be encoded in 5 bits (so this
kind of '''Tag''' will be 1 byte long), and never above 1024. In the LDAP ASN.1 grammar of RFC 2251, no
label exceeds 24 (in ''LdapMessage'', the ''ExtendedResponse'' label is 24), so we can focus
on 1 byte tags. Still, it could be interesting to accept longer labels to be able to support
any LDAP evolution (or other protocols, as this '''Tag''' decoder is not specifically written
for LDAP).
+
+Decoding a Tag has to follow the finite state automaton shown in this picture :
+
+attachment:TagStateAutomaton.png
+
+(Thanks to Poseidon [http://www.gentleware.com/] or Argo UML [http://argouml.tigris.org/])
+
+In this diagram, ''bb'' stands for ''ByteBuffer''. It contains the stream of bytes to be
decoded.
+
+The other interesting information that we need to grab from a '''Tag''' is stored in the
first two bits (bits 7 and 6) and in the third bit (bit 5). The first two describe the class,
the third tells if the '''TLV''' is a ''primitive'' (b5 = 0) or a ''constructed'' '''TLV'''
(b5 = 1).
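
The bit layout described above can be sketched in a few lines of Java. This is only an illustration: the class name `TagByte` and its methods are hypothetical and are not part of the project's code. The sample bytes 0x30, 0x61, 0x0A and 0x04 are the tag values discussed on this page.

```java
/** Splits a one-byte tag into its three fields (hypothetical helper, for illustration only). */
final class TagByte {
    static final int CLASS_MASK = 0xC0;       // bits 7 and 6: the class
    static final int CONSTRUCTED_MASK = 0x20; // bit 5: primitive (0) or constructed (1)
    static final int LABEL_MASK = 0x1F;       // bits 4 to 0: the label

    /** 0 = universal, 1 = application, 2 = context-specific, 3 = private. */
    static int tagClass(int tag) {
        return (tag & CLASS_MASK) >>> 6;
    }

    static boolean isConstructed(int tag) {
        return (tag & CONSTRUCTED_MASK) != 0;
    }

    /** The label; the value 31 means the label continues in the following bytes. */
    static int label(int tag) {
        return tag & LABEL_MASK;
    }

    static String describe(int tag) {
        String[] classes = { "universal", "application", "context-specific", "private" };
        return String.format("0x%02X: %s, %s, label %d", tag & 0xFF, classes[tagClass(tag)],
                isConstructed(tag) ? "constructed" : "primitive", label(tag));
    }

    public static void main(String[] args) {
        for (int tag : new int[] { 0x30, 0x61, 0x0A, 0x04 }) {
            System.out.println(describe(tag));
        }
    }
}
```

Read this way, 0x30 is a universal constructed tag with label 16, while 0x61 is an application-class constructed tag with label 1: the difference between the two lies in the class bits.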
+
+As we can see, we have to deal with the special case where the stream does not contain enough
bytes to decode a multi-byte '''Tag'''. In this case, the automaton will exit with a state
''TAG_PENDING''. So the state automaton has two different start states : ''TAG_START'' and
''TAG_PENDING''. Until ''TAG_DONE'' is reached, we have to keep the '''Tag''' data somewhere.
There are many ways to fulfill this requirement.
+ 1. the '''Tag''' decoder can be instantiated each time a new '''Tag''' is to be decoded,
and it will store the current state
+ 2. a session can be stored within the decoder, and will be returned to the caller if
a ''TAG_PENDING'' state is reached. The caller will have to give this session back to the
decoder in order to finish the decoding.
+ 3. the caller may have to create a container and pass it as a parameter to the decoder.
The decoder will store the current state in this container.
+
+The second option is of no help in this simple case. It's too complicated, and will be much
slower than either of the other two options. We have to keep in mind that 99% of the '''Tags'''
will be contained in one byte, and the probability that the stream stops just in the middle
of a '''Tag''', even if not equal to zero, is very low. So we have to keep the decoding process
simple (KISS : http://digital-web.com/articles/keep_it_simple_stupid). 
+
+I don't like the idea of instantiating a new decoder each time a new '''Tag''' arrives. We have
to separate action and data. 
+
+So it leads to the third solution : calling the unique decoder with a container. It's quite
easy to implement.
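
A minimal sketch of this third solution, assuming the usual multi-byte rule (a label of 31 in the first byte announces continuation bytes, each carrying 7 bits, the last one having bit 7 cleared). The names `TagState`, `TagContainer` and `TagDecoder` are hypothetical and only the three states come from the text; the real automaton is the one in the picture.

```java
import java.nio.ByteBuffer;

/** The three states named in the text. */
enum TagState { TAG_START, TAG_PENDING, TAG_DONE }

/** Hypothetical container: everything the decoder must remember between two calls. */
final class TagContainer {
    TagState state = TagState.TAG_START;
    int firstByte; // keeps the class bits and the primitive/constructed bit
    int label;     // label decoded so far
}

/** The decoder itself holds no data: action and data are separated. */
final class TagDecoder {
    static TagState decode(ByteBuffer bb, TagContainer c) {
        if (c.state == TagState.TAG_START && bb.hasRemaining()) {
            c.firstByte = bb.get() & 0xFF;
            c.label = c.firstByte & 0x1F;
            if (c.label != 0x1F) {
                c.state = TagState.TAG_DONE;    // the common one-byte case
            } else {
                c.label = 0;                    // the label follows in the next bytes
                c.state = TagState.TAG_PENDING;
            }
        }
        // Resumes here directly when the previous call ran out of bytes.
        while (c.state == TagState.TAG_PENDING && bb.hasRemaining()) {
            int b = bb.get() & 0xFF;
            c.label = (c.label << 7) | (b & 0x7F);
            if ((b & 0x80) == 0) {
                c.state = TagState.TAG_DONE;    // bit 7 cleared: last label byte
            }
        }
        return c.state;
    }

    public static void main(String[] args) {
        TagContainer c = new TagContainer();
        // The stream stops in the middle of a three-byte tag...
        System.out.println(decode(ByteBuffer.wrap(new byte[] { 0x5F, (byte) 0x81 }), c));
        // ...and the same container lets the decoder finish with the next buffer.
        System.out.println(decode(ByteBuffer.wrap(new byte[] { 0x01 }), c) + " label=" + c.label);
    }
}
```

The sketch does not guard against oversized labels; a real decoder would reject anything above the limit it chooses to support.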
 
 ==== Length ====
 The '''Length''' value is 0x0C, which is 12 in base 10. If you count the bytes after this
'''Length''', you can easily see that there are 12 bytes. Ok, so '''Length''' means the number
of bytes that the '''Value''' contains. You can check that the other '''Lengths''' match too.
Good !
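
As an illustration, here is a small sketch of a '''Length''' decoder. The one-byte case is the one shown above (0x0C); the multi-byte form and the 0x80 marker for an indefinite '''Length''' follow X.690 rather than this page. The name `LengthDecoder`, the `INDEFINITE` constant and the 4-byte limit are assumptions made for the example.

```java
import java.nio.ByteBuffer;

/** Hypothetical Length decoder, assuming the whole Length is available in the buffer. */
final class LengthDecoder {
    /** Returned for an indefinite Length: the Value ends with a terminator instead. */
    static final int INDEFINITE = -1;

    static int decode(ByteBuffer bb) {
        int first = bb.get() & 0xFF;
        if (first < 0x80) {
            return first;              // short form: the byte is the length (0x0C -> 12)
        }
        if (first == 0x80) {
            return INDEFINITE;         // indefinite form
        }
        int count = first & 0x7F;      // long form: number of length bytes that follow
        if (count > 4) {
            throw new IllegalArgumentException("Length too large for this sketch");
        }
        int length = 0;
        for (int i = 0; i < count; i++) {
            length = (length << 8) | (bb.get() & 0xFF);
        }
        if (length < 0) {
            throw new IllegalArgumentException("Length does not fit in a positive int");
        }
        return length;
    }

    public static void main(String[] args) {
        System.out.println(decode(ByteBuffer.wrap(new byte[] { 0x0C })));
        System.out.println(decode(ByteBuffer.wrap(new byte[] { (byte) 0x82, 0x01, 0x00 })));
    }
}
```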
@@ -26,7 +49,7 @@
 
 ==== Value ====
 
-What about the '''Values'''? '''Length''' was easy, it was totally context-free. Which kind
of '''Value''' can ve have? How do we know the type of each '''Value'''?
+What about the '''Values'''? '''Length''' was easy, it was totally context-free. Which kind
of '''Value''' can we have? How do we know the type of each '''Value'''?
 
 First, we have seen that some '''Values''' are composed of '''TLVs'''. But we must have
some kind of primitive '''Values''', like ''integer'' or ''string''?
 
@@ -34,16 +57,16 @@
 
 So, let's see other '''Tags''' : 04 codes for an ''Octet String''. Here, we have two empty
strings : ''Octet String'' (04) zero '''Length''' (00) in the two last '''TLVs'''
 
-0A (forth '''TLV''') means ''Enumerated''. This is a way to code a constrained value (i.e
something in a set of values). Here, it's a 0 : ''An enumerated (0A) value which is 1 byte
long (01) and which value is 0 (00)''. It does not give you a lot of information, as you can
see: which kind of value is it suppose to be? 
+0A (fourth '''TLV''') means ''Enumerated''. This is a way to code a constrained value (i.e.
something in a set of values). Here, it's a 0 : ''An enumerated (0A) value which is 1 byte
long (01) and which value is 0 (00)''. It does not give you a lot of information, as you can
see: which kind of value is it supposed to be? 
 
-So far, so good, we have a kind of way to decode simple '''TLV'''. Let's call them '''Primitive'''.
What about '''TLVs''' taht contains other '''TLVs'''? We will call them '''Constructed'''
+So far, so good, we have a way to decode simple '''TLVs'''. Let's call them '''Primitive'''.
What about '''TLVs''' that contain other '''TLVs'''? We will call them '''Constructed'''.
 
 The first '''TLV''' has a '''Tag''' value of 30. This is a ''SEQUENCE'' of '''TLVs'''. A
''SEQUENCE'' is constructed by ordered '''TLVs'''. We can't exchange two '''TLVs''' in a ''SEQUENCE'',
there is another '''Tag''' for that : a ''SET''. 
 
-The last '''TLV''' has a '''Tag''' value of 61. This is specific of a '''CHOICE''', where
you have to choose between different cases, and here it's the first value that has been choosen
(we can read 61 has a '''SEQUENCE''' number 1 of the alternative. Accept the explanation,
it's quite complicated to give the reason why 61 is a '''SEQUENCE''' while 30 is also a '''SEQUENCE''').
+The last '''TLV''' has a '''Tag''' value of 61. This is specific to a '''CHOICE''', where
you have to choose between different cases, and here it's the first value that has been chosen
(we can read 61 as a '''SEQUENCE''' number 1 of the alternative. Accept the explanation,
it's quite complicated to give the reason why 61 is a '''SEQUENCE''' while 30 is also a '''SEQUENCE''').
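
To make the nesting of '''Constructed''' and '''Primitive''' '''TLVs''' concrete, here is a rough sketch that walks a byte stream and indents each '''TLV''' under the one that contains it. The class `TlvDump` is hypothetical and handles only one-byte '''Tags''' and one-byte '''Lengths'''. The sample bytes are an assumption: the picture is not reproduced here, so only 30 0C, 61, 0A 01 00 and the two 04 00 come from the text, while the second '''TLV''' (02 01 01) and the '''Length''' 07 were chosen so that the lengths add up.

```java
import java.nio.ByteBuffer;

/** Hypothetical TLV walker: one-byte Tags and one-byte Lengths only. */
final class TlvDump {
    static void dump(ByteBuffer bb, int depth, StringBuilder out) {
        while (bb.hasRemaining()) {
            int tag = bb.get() & 0xFF;
            int length = bb.get() & 0xFF;
            if (length >= 0x80) {
                throw new IllegalArgumentException("only one-byte Lengths in this sketch");
            }
            // Isolate the Value bytes, then move the outer buffer past them.
            ByteBuffer value = bb.slice();
            value.limit(length);
            bb.position(bb.position() + length);

            boolean constructed = (tag & 0x20) != 0; // bit 5 of the Tag
            for (int i = 0; i < depth; i++) {
                out.append("  ");
            }
            out.append(String.format("%02X %s len=%d\n", tag,
                    constructed ? "constructed" : "primitive", length));
            if (constructed) {
                dump(value, depth + 1, out); // the Value is itself a list of TLVs
            }
        }
    }

    public static void main(String[] args) {
        // Assumed bytes, consistent with the description on this page (see the note above).
        byte[] pdu = { 0x30, 0x0C, 0x02, 0x01, 0x01, 0x61, 0x07,
                       0x0A, 0x01, 0x00, 0x04, 0x00, 0x04, 0x00 };
        StringBuilder out = new StringBuilder();
        dump(ByteBuffer.wrap(pdu), 0, out);
        System.out.print(out);
    }
}
```

With these bytes, 30 and 61 are reported as constructed and everything else as primitive, with 0A and the two 04 nested under 61.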
 
 
-For any further information, one should read [http://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf]
which explain this encoding, but be aware that you also need to read [http://www.itu.int/ITU-T/studygroups/com17/languages/X.680-0207.pdf].
They are availaible for free, which is quite cheap compared to sleeping pills !
+For any further information, one should read [http://www.itu.int/ITU-T/studygroups/com17/languages/X.690-0207.pdf]
which explains this encoding, but be aware that you also need to read [http://www.itu.int/ITU-T/studygroups/com17/languages/X.680-0207.pdf].
They are available for free, which is quite cheap compared to sleeping pills !
 
 === Decoding TLVs ===
 
