Mailing-List: contact directory-dev-help@incubator.apache.org; run by ezmlm
Reply-To: akarasulu@locus.apache.org
Organization: Solarsis Group
To: "Apache Directory Developers List"
Subject: Re: Re: [asn1] Stateful Decoder question.
Date: Wed, 5 Jan 2005 13:45:42 -0500
Message-Id: <20050105184542.TLNC2064.imf21aec.mail.bellsouth.net@mail.bellsouth.net>

BTW Robert and Richard,

I intend to change the interfaces for these codecs in the [asn1] package, but that does not mean nothing equivalent will be found in the protocol API. I intend to take the protocol API interfaces, which are still to be determined, and use them to wrap/adapt the new interfaces for the asn1 codecs.
I'm thinking these protocol APIs will be more general, to accommodate all protocol codecs, while the one for ASN.1 may be more specific. I'm just saying this so you don't get alarmed. Things will change, but eventually the protocol provider API should be flexible enough to wrap around your own stuff, which can be more specific.

Cheers,
Alex

>
> From:
> Date: 2005/01/05 Wed PM 01:31:56 EST
> To: "Apache Directory Developers List" ,
> Subject: Re: [asn1] Stateful Decoder question.
>
> Hi Robert,
>
> >
> > From: Robert Newson
> > Date: 2005/01/04 Tue PM 11:36:20 EST
> > To: directory-dev@incubator.apache.org
> > Subject: [asn1] Stateful Decoder question.
> >
> > Hi,
> >
> > I started building an IMAP grammar with Antlr which can handle a
> > useful subset of the full IMAP grammar and I'm happy with this
> > approach. The generated parser blocks for I/O if its input is
> > incomplete, so I need to decode in several passes.
> >
> > My question (finally) is, is the StatefulDecoder work you're doing in
> > the Asn1 project applicable to my problem? I see that there's a basic
> > level that is Asn1-agnostic.
>
> The codec package in the ASN.1 subproject is actually independent of ASN.1. I basically needed some interfaces to decode data in chunks while it was streamed into the server. I used a callback mechanism, presuming that implementations of these interfaces (the actual codecs) would decode on a per-chunk basis and even stream large pieces of data to disk rather than keeping them in memory. This way the memory needed stays fixed while handling messages of variable size. For a server this is critical, especially with the potential for DoS attacks. Plus, this class of non-blocking chunking codecs maintains state between operations (hence the name), so they are ideal for non-blocking constructs in NIO: a good fit.
>
> These interfaces are rather general and I think I will make them more specific for the ASN.1 stuff.
> I made them general to try to get the code into the jakarta-commons codec. However, I have abandoned this notion, at least for now.
>
> > I'm keen to build a high-performance, non-blocking and elegant
> > solution to this problem, but I'm now thrashing backwards and forwards
> > for the right tool.
>
> I totally understand where you are coming from. I too was confronted with this problem when coming up with these interfaces. It's a tough one. I got to a point where I can almost solve the problem gracefully. I will refactor asn1 aggressively in a few weeks to solve various uglies and deficiencies.
>
> However, if I could have a wish, the ideal solution here would be for a tool like antlr to generate stateful push parsers that can be fed a chunk of input at a time without blocking. How awesome would that be? The same grammar should generate both types of parsers. Then writing protocol codecs would be a cakewalk. The codecs are usually half the battle in writing a protocol server, regardless of whether the protocol is text based or binary.
>
> Unfortunately there was nothing like that when I last searched 3 months ago. I'd love to be able to modify antlr to do this and conditionally put a threshold on the input as it arrives so antlr can stream decoded results/output to disk. But time is finite :(.
>
> Perhaps you might like to carve out your own interfaces for doing this. Unfortunately I will change the stateful stuff to be even more specific to ASN.1 or binary encodings. However, there really is nothing to these APIs: they're a joke and not worth the dependency. I'm sure you can carve out your own callback-based API or do much better than I did here. There may be better producer-consumer models for pushing data into a stateful parser to process your email data. But this does mean you might have to hand code your own parser instead of using antlr, which only produces blocking parsers with all contents maintained in memory.
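
As a rough illustration of the kind of callback-based, push-style API described above, here is a minimal sketch in Java. The names DecodeListener, PushDecoder and LengthPrefixedDecoder, and the length-prefixed framing itself, are assumptions made up for this example; they are not the interfaces from the [asn1] codec package. The point is only that the decoder keeps its state between calls, never blocks waiting for more input, and caps the memory it will use per frame.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical callback: invoked once for every completely decoded frame. */
interface DecodeListener {
    void frameDecoded(byte[] frame);
}

/** Hypothetical push-style decoder: consumes whatever has arrived and returns. */
interface PushDecoder {
    void decode(ByteBuffer chunk);
}

/** Decodes frames laid out as [4-byte big-endian length][payload]. */
final class LengthPrefixedDecoder implements PushDecoder {
    private final DecodeListener listener;
    private final int maxFrameSize;
    private final ByteBuffer header = ByteBuffer.allocate(4);
    private ByteBuffer payload; // null while the length header is still incomplete

    LengthPrefixedDecoder(DecodeListener listener, int maxFrameSize) {
        this.listener = listener;
        this.maxFrameSize = maxFrameSize;
    }

    @Override
    public void decode(ByteBuffer chunk) {
        while (chunk.hasRemaining()) {
            if (payload == null) {
                // Accumulate header bytes; they may be split across chunks.
                while (header.hasRemaining() && chunk.hasRemaining()) {
                    header.put(chunk.get());
                }
                if (header.hasRemaining()) {
                    return; // header incomplete: keep state, wait for next chunk
                }
                header.flip();
                int length = header.getInt();
                header.clear();
                if (length < 0 || length > maxFrameSize) {
                    // Bound memory use instead of trusting the peer.
                    throw new IllegalStateException("bad frame length: " + length);
                }
                payload = ByteBuffer.allocate(length);
            }
            // Copy as much payload as this chunk provides.
            while (payload.hasRemaining() && chunk.hasRemaining()) {
                payload.put(chunk.get());
            }
            if (!payload.hasRemaining()) {
                listener.frameDecoded(payload.array());
                payload = null; // ready for the next header
            }
        }
    }
}

class PushDecoderDemo {
    public static void main(String[] args) {
        PushDecoder decoder = new LengthPrefixedDecoder(
                frame -> System.out.println(new String(frame, StandardCharsets.US_ASCII)),
                1024);
        byte[] wire = {0, 0, 0, 2, 'h', 'i', 0, 0, 0, 3, 'a', 'b', 'c'};
        // Feed the bytes in awkward pieces, as a non-blocking socket might.
        decoder.decode(ByteBuffer.wrap(wire, 0, 3));
        decoder.decode(ByteBuffer.wrap(wire, 3, 5));
        decoder.decode(ByteBuffer.wrap(wire, 8, 5));
    }
}
```

A real codec would probably stream large payloads to disk rather than allocate them whole, as discussed above; the size cap here is just the simplest way to keep memory bounded in a sketch.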
>
> Now that I think of it, you might be able to do both. Hmmm, I'm just pulling this out of my arse so bear with me. You can get antlr to generate your lexer/parser pair and break apart the generated code. I'm sure there are small fragments in the message that can be separated from larger parts like a message body or attachments (I know little about mail protocols, so I'm just guessing). You may be able to replace the sections that deal with the larger chunks using a non-blocking push model with chunking. This might not be possible, though, due to limitations in antlr if the entry point lexer rule blocks. On second thought, this is sounding like a nightmare where you must swim in antlr internals. Anyway, a brain dump is a brain dump.
>
> If the grammar is simple enough, though, I recommend hand rolling your own parser and making it non-blocking (storing state in between the chunks fed to it). This is a painful task and the code will be ugly and filled with nastiness. Sorry, at this point I have no better alternative in mind.
>
> Hope this helps!
> Alex
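
To make the hand-rolled option a little more concrete, below is a minimal sketch, in Java, of a parser that stores its state between the chunks fed to it. It only splits input into CRLF-terminated lines, which is far less than an IMAP grammar needs. The class name LineStateMachine, the feed method and the line-length limit are all hypothetical choices for this example. What it shows is the general shape: an explicit state field, a buffer for the partial token, and a feed call that returns as soon as the chunk is exhausted.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;

/** Hypothetical hand-rolled push parser: emits CRLF-terminated lines. */
final class LineStateMachine {
    private enum State { TEXT, SAW_CR }

    private State state = State.TEXT;                       // survives between chunks
    private final StringBuilder line = new StringBuilder(); // partial line so far
    private final int maxLineLength;
    private final Consumer<String> onLine;

    LineStateMachine(Consumer<String> onLine, int maxLineLength) {
        this.onLine = onLine;
        this.maxLineLength = maxLineLength;
    }

    /** Consumes the chunk and returns; never waits for more input. */
    void feed(ByteBuffer chunk) {
        while (chunk.hasRemaining()) {
            char c = (char) (chunk.get() & 0xFF); // treat bytes as ISO-8859-1
            switch (state) {
                case TEXT:
                    if (c == '\r') {
                        state = State.SAW_CR;
                    } else {
                        append(c);
                    }
                    break;
                case SAW_CR:
                    if (c == '\n') {
                        onLine.accept(line.toString());
                        line.setLength(0);
                        state = State.TEXT;
                    } else if (c == '\r') {
                        append('\r'); // earlier CR was data; this one may start CRLF
                    } else {
                        append('\r'); // bare CR was data after all
                        append(c);
                        state = State.TEXT;
                    }
                    break;
            }
        }
    }

    private void append(char c) {
        if (line.length() >= maxLineLength) {
            // Keep memory bounded even if the peer never sends CRLF.
            throw new IllegalStateException("line too long");
        }
        line.append(c);
    }
}

class LineStateMachineDemo {
    public static void main(String[] args) {
        LineStateMachine parser = new LineStateMachine(System.out::println, 8192);
        // The CRLF pair and a command word are deliberately split across chunks.
        for (String piece : new String[] {"a001 LOGIN\r", "\na002 NO", "OP\r\n"}) {
            parser.feed(ByteBuffer.wrap(piece.getBytes(StandardCharsets.US_ASCII)));
        }
    }
}
```

Even at this size the awkward part is visible: every point where input can run out needs its own state, which is why a full grammar done this way gets ugly quickly.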