directory-dev mailing list archives

From <akaras...@locus.apache.org>
Subject Re: Re: [asn1] Stateful Decoder question.
Date Wed, 05 Jan 2005 18:45:42 GMT
BTW Robert and Richard, I intend to change the interfaces for these codecs in the [asn1] package,
but that does not mean something equivalent will be missing from the protocol API.
I intend to take the protocol API interfaces, which are still to be determined, and use them to
wrap/adapt the new interfaces for the asn1 codecs.  I'm thinking these protocol APIs will be more
general, to accommodate all protocol codecs, while the one for ASN.1 may be more specific.

I'm just saying this so you don't get alarmed.  Things will change, but eventually the
protocol provider API should be flexible enough to wrap around your own stuff, which can be
more specific.

Cheers,
Alex 

> 
> From: <akarasulu@locus.apache.org>
> Date: 2005/01/05 Wed PM 01:31:56 EST
> To: "Apache Directory Developers List" <directory-dev@incubator.apache.org>, <directory-dev@incubator.apache.org>
> Subject: Re: [asn1] Stateful Decoder question.
> 
> Hi Robert,
> 
> > 
> > From: Robert Newson <robert.newson@gmail.com>
> > Date: 2005/01/04 Tue PM 11:36:20 EST
> > To: directory-dev@incubator.apache.org
> > Subject: [asn1] Stateful Decoder question.
> > 
> > Hi,
> > 
> <snip/>
> 
>  
> > I started building an IMAP grammar with Antlr which can handle a
> > useful subset of the full IMAP grammar and I'm happy with this
> > approach. The generated parser blocks for I/O if its input is
> > incomplete, so I need to decode in several passes.
> > 
> > My question (finally) is, is the StatefulDecoder work you're doing in
> > the Asn1 project applicable to my problem? I see that there's a basic
> > level that is Asn1-agnostic.
> 
> The codec package in the ASN.1 subproject is actually independent of ASN.1.  I basically
> needed some interfaces to decode data in chunks while it was streamed into the server.  I used
> a callback mechanism, presuming implementations of these interfaces (the actual codecs) would
> decode on a per-chunk basis and even stream large pieces of data to disk rather than keeping
> them in memory.  This way the memory needed stays at a fixed size while handling messages
> of variable size.  For a server this is critical, especially with the potential for DoS attacks.
> Plus this class of non-blocking chunking codecs maintains state between operations (hence
> the name), so they are ideal for non-blocking constructs in NIO: a good fit.
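>
> To make the callback idea concrete, here is a rough sketch of a push-style decoder.  The names
> (DecodeCallback, ChunkDecoder, LineDecoder) are made up for illustration and are not the
> actual interfaces in the asn1 codec package; the CRLF line framing is just an assumed example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback: invoked once for every fully decoded unit.
interface DecodeCallback<T> {
    void decoded(T unit);
}

// Hypothetical push-style decoder: takes whatever bytes have arrived and never blocks.
interface ChunkDecoder<T> {
    void setCallback(DecodeCallback<T> callback);
    void decode(ByteBuffer chunk);
}

// Example implementation: emits one String per CRLF-terminated line.
class LineDecoder implements ChunkDecoder<String> {
    // State kept between decode() calls. A real codec would cap the size of
    // this buffer (or spill to disk) so a hostile peer cannot exhaust memory.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();
    private boolean sawCr = false;
    private DecodeCallback<String> callback = unit -> { };

    @Override
    public void setCallback(DecodeCallback<String> callback) {
        this.callback = callback;
    }

    @Override
    public void decode(ByteBuffer chunk) {
        while (chunk.hasRemaining()) {
            byte b = chunk.get();
            if (sawCr && b == '\n') {
                // CRLF completed, possibly across two chunks: hand the line over.
                callback.decoded(new String(pending.toByteArray(), StandardCharsets.US_ASCII));
                pending.reset();
                sawCr = false;
                continue;
            }
            if (sawCr) {
                // The previous CR was not followed by LF, so it was ordinary data.
                pending.write('\r');
                sawCr = false;
            }
            if (b == '\r') {
                sawCr = true;
            } else {
                pending.write(b);
            }
        }
    }
}

class LineDecoderDemo {
    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        LineDecoder decoder = new LineDecoder();
        decoder.setCallback(lines::add);
        // Feed the input in arbitrary pieces, as a non-blocking read loop would.
        for (String part : new String[] {"a001 LOG", "IN\r", "\na002 NOOP\r\n"}) {
            decoder.decode(ByteBuffer.wrap(part.getBytes(StandardCharsets.US_ASCII)));
        }
        System.out.println(lines);
    }
}
```

> The caller never waits on the decoder: each decode() call consumes the chunk it is given,
> and complete units come back through the callback whenever they happen to finish.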
> 
> These interfaces are rather general and I think I will make them more specific for the
> ASN.1 stuff.  I made them general to try to get the code into jakarta-commons codec.
> However, I have abandoned this notion, at least for now.
> 
> > I'm keen to build a high-performance, non-blocking and elegant
> > solution to this problem, but I'm now thrashing backwards and forwards
> > for the right tool.
> 
> I totally understand where you are coming from.  I too was confronted with this
> problem when coming up with these interfaces.  It's a tough one.  I got to a point where I
> can almost solve the problem gracefully.  I will refactor asn1 aggressively in a few weeks
> to solve various uglies and deficiencies.
> 
> However, the ideal solution here, if I could have a wish, is for a tool like antlr to generate
> stateful parsers (push parsers) that can be fed a chunk of input at a time without blocking.
> How awesome would that be?  The same grammar should generate both types of parsers.  Then
> writing protocol codecs would be a cakewalk.  The codecs usually are half the battle in writing
> a protocol server, regardless of whether the protocol is text based or binary.
> 
> Unfortunately there was nothing like that when I last searched 3 months ago.  I'd love
> to be able to modify antlr to do this and conditionally put a threshold on the input as it
> arrives so antlr can stream decoded results/output to disk.  But time is finite :(.
> 
> Perhaps you might like to carve out your own interfaces for doing this.  Unfortunately
> I will change the stateful stuff to be even more specific to ASN.1 or binary encodings.  However,
> there really is nothing to these APIs: they're a joke and not worth the dependency.  I'm
> sure you can carve out your own callback-based API or do much better than I did here.  There may
> be better producer-consumer models for pushing data into a stateful parser to process your
> email data.  But this does mean you might have to hand code your own parser instead of using
> antlr, which only produces blocking parsers with all contents maintained in memory.
> 
> Now that I think of it, you might be able to do both.  Hmmm, I'm just pulling this out
> of my arse so bear with me.  You can get antlr to generate your lexer/parser pair and break
> apart the generated code.  I'm sure there are small fragments in the message that can be separated
> from larger parts like a message body or attachments (I know little about mail protocols, so
> I'm just guessing).  You may be able to replace the sections that deal with the larger chunks
> using a non-blocking push model with chunking.  This might not be possible due to limitations
> in antlr if the entry point lexer rule blocks, though.  On second thought this is sounding
> like a nightmare where you must swim in antlr internals.  Anyway, a brain dump is a brain dump.
> 
> If the grammar is simple enough, though, I recommend hand rolling your own parser and making
> it non-blocking (storing state in between the chunks fed to it).  This is a painful task and the
> code will be ugly and filled with nastiness.  Sorry, at this point I have no better alternative
> in mind.
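>
> Roughly what I mean by storing state between chunks, as a sketch only.  The class and method
> names (LiteralParser, BodySink, feed) are invented for this example, and the input is assumed
> to be an IMAP-style literal: "{n}" CRLF followed by n octets of body.  The body is pushed to
> a sink piece by piece instead of being held in memory.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sink for body bytes; a real one might append to a temp file.
interface BodySink {
    void bodyChunk(ByteBuffer slice);
    void bodyComplete();
}

// Hand-rolled push parser: all progress lives in fields, so feed() can
// return at any byte boundary and resume when the next chunk arrives.
class LiteralParser {
    private enum State { OPEN_BRACE, LENGTH, CR, LF, BODY, DONE }

    private State state = State.OPEN_BRACE;
    private boolean sawDigit = false;
    private long remaining = 0;   // a real parser would also cap this value
    private final BodySink sink;

    LiteralParser(BodySink sink) {
        this.sink = sink;
    }

    boolean isDone() {
        return state == State.DONE;
    }

    // Consumes bytes until the chunk is empty or the literal is complete.
    // Bytes after the literal are left in the chunk for the caller.
    void feed(ByteBuffer chunk) {
        while (chunk.hasRemaining() && state != State.DONE) {
            switch (state) {
                case OPEN_BRACE:
                    expect(chunk.get(), '{');
                    state = State.LENGTH;
                    break;
                case LENGTH: {
                    byte b = chunk.get();
                    if (b >= '0' && b <= '9') {
                        remaining = remaining * 10 + (b - '0');
                        sawDigit = true;
                    } else {
                        if (!sawDigit) throw new IllegalStateException("missing length");
                        expect(b, '}');
                        state = State.CR;
                    }
                    break;
                }
                case CR:
                    expect(chunk.get(), '\r');
                    state = State.LF;
                    break;
                case LF:
                    expect(chunk.get(), '\n');
                    if (remaining == 0) finish(); else state = State.BODY;
                    break;
                case BODY: {
                    // Pass along only as much as belongs to the body.
                    int n = (int) Math.min(remaining, chunk.remaining());
                    ByteBuffer slice = chunk.slice();
                    slice.limit(n);
                    chunk.position(chunk.position() + n);
                    sink.bodyChunk(slice);
                    remaining -= n;
                    if (remaining == 0) finish();
                    break;
                }
                default:
                    throw new IllegalStateException("unexpected state " + state);
            }
        }
    }

    private void finish() {
        state = State.DONE;
        sink.bodyComplete();
    }

    private static void expect(byte actual, char wanted) {
        if (actual != wanted) {
            throw new IllegalStateException("expected '" + wanted + "' but got byte " + actual);
        }
    }
}

class LiteralParserDemo {
    public static void main(String[] args) {
        StringBuilder body = new StringBuilder();
        LiteralParser parser = new LiteralParser(new BodySink() {
            public void bodyChunk(ByteBuffer slice) {
                body.append(StandardCharsets.US_ASCII.decode(slice));
            }
            public void bodyComplete() {
                System.out.println("body: " + body);
            }
        });
        // The split points are arbitrary, as they would be on a socket.
        for (String part : new String[] {"{1", "1}\r", "\nhello", " wor", "ld"}) {
            parser.feed(ByteBuffer.wrap(part.getBytes(StandardCharsets.US_ASCII)));
        }
    }
}
```

> Even for this tiny grammar the state enum and the per-state cases pile up quickly, which is
> the ugliness I was warning about, but nothing in it ever blocks or buffers the whole body.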
> 
> Hope this helps!
> Alex
> 
> 
> 
> 
> 
> 


