directory-dev mailing list archives

From Alex Karasulu <aok...@bellsouth.net>
Subject Re: [asn1] why use TLV objects at all?
Date Thu, 24 Feb 2005 17:49:52 GMT
Emmanuel Lecharny wrote:

>Hi all !
>
>
>On Thursday, 24 February 2005 at 01:55 -0500, Alex Karasulu wrote:
>  
>
>>Alan D. Cabrera wrote:
>>
>>    
>>
>>>Alex Karasulu wrote:
>>>
>>>      
>>>
>>>>Alan D. Cabrera wrote:
>>>>
>>>>        
>>>>
>>>>>Alex Karasulu wrote:
>>>>>
>>>>>          
>>>>>
>>>>>>Emmanuel,
>>>>>>
>>>>>>I was just thinking about your position on object creation.  Namely
>>>>>>the one that is against the creation of Tuple objects that 
>>>>>>represent TLVs.  Your proposal to use pooling of these objects 
>>>>>>worries me a bit.  It just makes me think there would be a lot of 
>>>>>>synchronization overhead.  I may be wrong.
>>>>>>            
>>>>>>
>
>No synchronisation: it's a local pool, each thread has its own pool. You
>won't have 10,000 threads, so it's OK.
>
>  
>
Awesome! That works really well - too bad I did not think of that.
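
Just so I'm sure we picture the same thing, a rough sketch of a per-thread 
pool (all of the names are invented, and Tuple.clear() is assumed to exist 
as a reset method):

    // Rough sketch of a per-thread Tuple pool.  No synchronization is
    // needed because each thread only ever touches its own pool.
    public final class TuplePool {

        private static final ThreadLocal POOLS = new ThreadLocal() {
            protected Object initialValue() {
                return new TuplePool();
            }
        };

        private final java.util.LinkedList free = new java.util.LinkedList();

        public static TuplePool get() {
            return (TuplePool) POOLS.get();
        }

        public Tuple acquire() {
            return free.isEmpty() ? new Tuple() : (Tuple) free.removeFirst();
        }

        public void release(Tuple t) {
            t.clear();               // assumed reset method
            free.addFirst(t);
        }
    }

A decoder thread would then just do TuplePool.get().acquire() and release() 
when it is done with the tuple.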

>>>>>I was also concerned by this as it may require that you keep a rough 
>>>>>factor of 2 more memory, one for the Tuple structure of the message 
>>>>>and one for the POJO that you are creating.
>>>>>          
>>>>>
>
>TLVs are allocated already, so it's not a problem. The value part will be
>copied to stubs as the stub reads it, so this is the only variable part.
>You create the value while reading the PDU, and pass it to the stub. So
>memory consumption is roughly sizeof(stubs) + sizeof(data) +
>sizeof(preallocated TLVs). It's really important that the memory footprint
>is somewhat static, even if it is big at the beginning. We are trading
>initial memory need against stability in the long term.
>
>  
>
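Just to make the static-footprint idea concrete, a purely made-up 
back-of-the-envelope figure (none of these numbers are measured, they only 
show the shape of the budget):

    preallocated TLVs : 100 threads x 32 TLVs x ~64 bytes  ~ 200 KB
    stubs in flight   : 100 PDUs x ~1 KB per stub          ~ 100 KB
    value data        : bounded by the largest PDU we accept

The total barely moves with load, which is the point.
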
>>>>It would be if we were collecting all PDU tuples to form a TLV tree.  
>>>>However the idea is to use and release whatever is allocated to the 
>>>>tuple.  In this case, the only time you have two copies of a datum 
>>>>(tuple value) is when you are holding on to the value long enough to 
>>>>set a stub's property using with the tuple value.
>>>>        
>>>>
>
>We don't have two copies of a datum. When the TLV is a Primitive one, as
>soon as its data has been completely read, we can pass it to the
>POJO/Stub. We just keep a reference to it in the TLV; the only
>duplication that could occur is in the T and L parts, but, again, it's not
>really a duplication: TLVs are allocated from the beginning, and won't be
>released until you stop the server.
>
>  
>
>>>>Furthermore if we implement the strategy of streaming a large value 
>>>>to disk (say a JPEG photo) then the value is just a URI to access the 
>>>>stream later on.  This URI is what is set as the stub property 
>>>>value.  So in this case we don't have the double hit as mentioned 
>>>>above where a value is in memory in a Tuple and duplicated in the 
>>>>value of the stub property.
>>>>        
>>>>
>
>+1 for the URI. It could also be a StreamedTLV subclass, which has
>the same interface. The implementation of its getData method will handle
>the situation.
>
>  
>
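Something like this is what I picture for the streamed case (a sketch only; 
the TLV base class, its constructor, and a stream-returning getData() are 
assumptions, not code that exists):

    // Sketch: a TLV whose value was spooled to disk while decoding.
    // Instead of holding the bytes it holds a URI pointing at the spool.
    public class StreamedTLV extends TLV {

        private final java.net.URI valueUri;

        public StreamedTLV(int tag, int length, java.net.URI valueUri) {
            super(tag, length);              // assumed TLV constructor
            this.valueUri = valueUri;
        }

        // Same interface as a normal TLV, but the data is read lazily
        // from the spool file instead of an in-memory buffer.
        public java.io.InputStream getData() throws java.io.IOException {
            return valueUri.toURL().openStream();
        }
    }
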
>>>So we only keep a stack of tuples?
>>>      
>>>
>>You mean constructed tuples for nesting?  Depends on the stub.  I don't 
>>think even that will be needed.  Don't know for sure yet, though.
>>    
>>
>
>Stub/POJO is just the final representation of the data. Obviously we can
>avoid all that TLV plumbing if we have a compiler that handles it. A
>depth-first decoding strategy is faster than a two-layer
>parser/lexer strategy, but it's much more complicated.
>
>
>  
>
>>>>>>However I started thinking, "why create Tuples at all?" Follow my 
>>>>>>concepts here for a sec even though we have not been discussing 
>>>>>>these constructs: TupleProducers and TupleConsumers.  A producer 
>>>>>>simply emits callbacks to a consumer and they are bound to each 
>>>>>>other.  What if the callbacks did not pass in a Tuple as an 
>>>>>>argument but the components T, L and V of the Tuple instead?  A 
>>>>>>stub, which is like the parser you mentioned, tracks and changes 
>>>>>>state as an automaton to populate its properties appropriately with 
>>>>>>the stream of Tuple events.  The stub can be a TupleConsumer - 
>>>>>>really a tuple event consumer rather.  This would eliminate object 
>>>>>>creation overheads and populate the stub.
>>>>>>            
>>>>>>
>
>If you want to control L, you need to keep track of the Constructed
>TLVs. Primitive TLVs are not very important; we can discard them
>immediately, just keeping their V part. So keeping a stack of Constructed
>TLVs is just a question of checking lengths.
>
>  
>
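To tie the two ideas together, roughly what I have in mind for the 
tuple-event callbacks plus the Constructed-TLV stack Emmanuel describes 
(a pure sketch, invented names, definite-length form only):

    // The producer never builds a Tuple object; it hands the T, L and V
    // pieces straight to the consumer as events.
    public interface TupleConsumer {
        void onPrimitive(int tag, int length, java.nio.ByteBuffer value);
        void onConstructedStart(int tag, int length);
        void onConstructedEnd();
    }

    // Only Constructed TLVs are stacked; every child is charged against
    // the content bytes its parent still expects, so a bad Length shows
    // up as soon as a child overflows its parent.
    class LengthChecker implements TupleConsumer {

        // remaining content bytes for each open Constructed TLV
        private final java.util.LinkedList<Integer> remaining = new java.util.LinkedList<Integer>();

        public void onConstructedStart(int tag, int length) {
            charge(headerSize(tag, length) + length);
            remaining.addFirst(Integer.valueOf(length));
        }

        public void onPrimitive(int tag, int length, java.nio.ByteBuffer value) {
            charge(headerSize(tag, length) + length);
        }

        public void onConstructedEnd() {
            if (remaining.removeFirst().intValue() != 0) {
                throw new IllegalStateException("bad PDU: constructed length mismatch");
            }
        }

        private void charge(int bytes) {
            if (remaining.isEmpty()) {
                return;                      // top-level PDU, nothing above it
            }
            int left = remaining.removeFirst().intValue() - bytes;
            if (left < 0) {
                throw new IllegalStateException("bad PDU: child overflows parent length");
            }
            remaining.addFirst(Integer.valueOf(left));
        }

        private int headerSize(int tag, int length) {
            return 2;                        // placeholder: real size depends on the T and L octets
        }
    }
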
>>>>>Could you not flatten it even further by making a compiler generated 
>>>>>stub act as both the producer and consumer?  This is the tack that I 
>>>>>am taking with my "smart" stubs.
>>>>>          
>>>>>
>
>Yes, for sure. But then it will become difficult to track bad PDUs (I
>mean, PDUs in which the Lengths are not correct).
>
>
>
>  
>
>>>>I highly discourage this approach.  Reason being the nature of the 
>>>>relationship between ASN.1 and encodings.  As you know an ASN.1 spec 
>>>>can use any encoding.  Conventionally a protocol specifies an 
>>>>encoding and sticks to it so it seems to support your approach.  This 
>>>>however is not always the case and ASN.1 is being used in new ways 
>>>>where alternate encodings are being applied to different data 
>>>>structures based on the target: e.g. GSM network clients.  However 
>>>>these are not the strongest cases for why you should avoid this 
>>>>"smart" stub approach IMO.
>>>>        
>>>>
>>>Each stub is specific to a particular encoding.  It is the POJOs that 
>>>are used that are universal to the encodings.
>>>      
>>>
>>Ahh, OK, you mean there's a difference between the stub and a POJO.  I 
>>thought the POJO was the stub.  Or are you referring to some base class or 
>>POJI?
>>    
>>
>
>We should agree on terms, don't you think? In my mind, a POJO is an
>instance of an ASN.1 path through a specific grammar (for example, a
>LdapBindResponse POJO, or a LdapSearch POJO). The stub is the class that
>feeds the POJO with data. So the stub is the POJO producer/consumer (with
>or without TLVs). wdyt?
>
>  
>
Hmmm, I was thinking the POJO and the stub were one and the same, but they 
need not be.  So what Alan was referring to is making sense now.
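
So in code terms the split might look like this (invented classes just to 
illustrate; the stub below consumes the tuple events from the sketch 
further up):

    // The POJO is encoding-agnostic: it only holds the decoded message.
    public class BindResponse {              // invented name, for illustration
        private int resultCode;
        private String matchedDn;

        public void setResultCode(int resultCode) { this.resultCode = resultCode; }
        public void setMatchedDn(String matchedDn) { this.matchedDn = matchedDn; }
    }

    // The stub is what feeds the POJO.  In the TLV-event view it is a
    // TupleConsumer; a different stub could exist per encoding.
    public class BindResponseStub implements TupleConsumer {

        private final BindResponse pojo = new BindResponse();
        private int state;                   // where we are in the grammar

        public void onConstructedStart(int tag, int length) { state++; }
        public void onConstructedEnd() { }

        public void onPrimitive(int tag, int length, java.nio.ByteBuffer value) {
            // purely illustrative state machine; real tags and states
            // would come from the LDAP grammar
            if (state == 1) {
                pojo.setResultCode(value.get() & 0xff);
            } else {
                byte[] bytes = new byte[length];
                value.get(bytes);
                pojo.setMatchedDn(new String(bytes));
            }
        }

        public BindResponse getPojo() { return pojo; }
    }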

>  
>
>>>>The most important reason is to decouple the generation of encoding 
>>>>specific code from the stub compiler.  If you make your stubs 
>>>>"encoding aware" then your adding some serious complexity to the stub 
>>>>compiler IMO.  Why do this when you can avoid it and gain the ability 
>>>>to swap out the encoding at runtime?
>>>>        
>>>>
>>>You have the ability to swap out encodings at runtime, you can just 
>>>switch stubs. 
>>>      
>>>
>
>Both of you are right, it's just a question of "how long will it take to
>write the compiler?" versus "do we really need a compiler at the
>moment?"
>
>  
>
For me, I want a faster, better, easier-to-maintain LDAP runtime while 
consolidating the DER needs for Kerberos.  If we get BER and DER done 
tight, that is what Directory is after.  However, I really want a generic 
stub compiler for the future and want to balance this.  In the end, 
though, Directory concerns will outweigh generic ASN.1.
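
And the decoupling I keep describing would boil down to something like this 
(again invented, nothing that exists in the tree yet):

    // Each encoding supplies its own producer of tuple events, while the
    // compiler-generated stubs only ever implement TupleConsumer.
    public interface TupleProducer {
        void setConsumer(TupleConsumer consumer);
        void decode(java.nio.ByteBuffer pdu);   // pushes T, L, V events to the consumer
    }

    // Usage would then read like:
    //
    //     TupleProducer producer = new BerTupleProducer();   // or a DER one
    //     producer.setConsumer(new BindResponseStub());
    //     producer.decode(buffer);
    //
    // Swapping the encoding at runtime is just a matter of picking a
    // different producer; the generated stub is untouched.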

>>So interface or base class is the same but concrete implementation is 
>>the stub for a particular encoding?
>>
>>    
>>
>>>>The way I like to visualize this is ... there is a common 
>>>>representation the stub compiler needs to work with.  Rather than 
>>>>read bytes from a stream it responds to tuple events as its input at 
>>>>a higher level.  Regardless of the type of encoding at the lowest 
>>>>level the stub compiler and the stubs it generates need not be 
>>>>aware.  It's sort of like the way javac works with the underlying 
>>>>runtime: the compiled code as byte codes are bound at runtime to the 
>>>>underlying native code to do the actual work using native code.  
>>>>Similarly here I'm recommending that the stub compiler generate a 
>>>>stub which deals only with TLVs and at runtime the source/target can 
>>>>be a BER, DER, or PER binary stream, or even an XER-encoded ASCII stream.
>>>>I think perhaps some of your concerns on the stub compiler side 
>>>>revolve around finding a tangible way for the antlr based stub 
>>>>compiler to generate code that deals with a TLV stream rather than a 
>>>>byte/char stream.  I too have this problem - it is not easy.  In this 
>>>>regard the approach of making the stub totally encoding aware may 
>>>>seem easier to do.
>>>>        
>>>>
>>>IIUC, PER does not use TLVs.  You need to know the structure of your 
>>>ASN.1 object to decode the stream.
>>>
>>>Keep it simple.  We may as well remove the layer and generate protocol 
>>>specific stubs.
>>>      
>>>
>>If you're writing the stub compiler then it's your call.  However I'm 
>>still not convinced this is keeping it simple.  Have you already 
>>finished the parts of the compiler that can handle different encodings?
>>    
>>
>
>Alan is perfectly right. PER is really inseparable from the specific
>ASN.1 grammar it encodes. You need to know the semantics of the incoming
>data to decode it, because T and L are optional (I mean, not optional
>in the sense that you are allowed to skip them, but whether they appear
>depends on the grammar). So when writing an ASN.1 decoder for a specific
>grammar using PER, the layered approach is totally useless.  It's much
>more a "give me the next 5 bits, which I know represent the Value I'm
>reading" kind of decoder. Quite complicated to implement, but it's the
>way it works. You need a compiler.
>
>  
>
Ok PER is the odd man out here.
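
Just to spell out why: a grammar-driven read looks roughly like this toy 
(not how any real PER codec works, only the shape of it):

    // Toy illustration: in PER there is no tag or length to dispatch on,
    // so the decoder must already know from the ASN.1 grammar that the
    // next field is, say, an INTEGER constrained to 0..31 and therefore
    // occupies exactly 5 bits.
    class ToyPerReader {

        private final java.util.BitSet bits;
        private int pos;

        ToyPerReader(java.util.BitSet bits) { this.bits = bits; }

        // grammar-driven read: the caller supplies the width, the stream doesn't
        int readConstrainedInt(int widthInBits) {
            int value = 0;
            for (int i = 0; i < widthInBits; i++) {
                value = (value << 1) | (bits.get(pos++) ? 1 : 0);
            }
            return value;
        }
    }

    // e.g. for a hypothetical  Foo ::= SEQUENCE { a INTEGER (0..31), b BOOLEAN }
    //     int a = reader.readConstrainedInt(5);   // 5 bits because of (0..31)
    //     boolean b = reader.readConstrainedInt(1) == 1;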

>If you have this PER-enabled codec compiler, having the same
>BER/CER/DER-enabled codec compiler is a piece of cake. No layers, no TLVs,
>just a compiler.
>
>This could also be a perfect new Apache project: a free Apache ASN.1
>compiler (a GPLized SNACC, in a way!)
>  
>
ehem ASLized :-).

>How far are we from this target? 
>
>  
>
Alan would know, but the last time I asked he still had some way to go.

>I also want to find out what the cost of encoding/decoding data is
>compared with the cost of fetching/storing it in the database. If it's a
>50/50 ratio, we could get a major performance improvement by first
>implementing the layered approach, then the compiler one when it's ready.
>If it's 10/90, forget about layers. Let's focus on the compiler on one
>side, and on other performance issues on the other.
>
>wdyt?
>  
>
I like the staged approach: +1 for that.
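
A crude way to get that ratio (decodePdu and fetchEntry below are invented 
stand-ins for whatever the real decode and backend calls end up being):

    // Crude timing sketch, nothing more.
    void measureRatio(java.nio.ByteBuffer samplePdu, String sampleDn) {
        long decodeNanos = 0, fetchNanos = 0;
        for (int i = 0; i < 10000; i++) {
            long t0 = System.nanoTime();
            decodePdu(samplePdu);            // hypothetical decode call
            long t1 = System.nanoTime();
            fetchEntry(sampleDn);            // hypothetical backend fetch
            long t2 = System.nanoTime();
            decodeNanos += t1 - t0;
            fetchNanos  += t2 - t1;
        }
        System.out.println("decode/fetch ratio: "
            + (double) decodeNanos / (double) fetchNanos);
    }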

Alex

