Return-Path:
Delivered-To: apmail-xml-axis-dev-archive@xml.apache.org
Received: (qmail 27738 invoked by uid 500); 23 Oct 2001 19:39:56 -0000
Mailing-List: contact axis-dev-help@xml.apache.org; run by ezmlm
Precedence: bulk
Reply-To: axis-dev@xml.apache.org
list-help:
list-unsubscribe:
list-post:
Delivered-To: mailing list axis-dev@xml.apache.org
Received: (qmail 27638 invoked from network); 23 Oct 2001 19:39:53 -0000
Message-ID: <3BD5C7A8.F074DB32@apache.org>
Date: Tue, 23 Oct 2001 15:40:24 -0400
From: Berin Loritsch
X-Mailer: Mozilla 4.75 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Axis Development
Subject: IRC Meeting Log
Content-Type: multipart/mixed; boundary="------------095A7CC16075F1BDB976B393"
X-Spam-Rating: daedalus.apache.org 1.6.2 0/1000/N

This is a multi-part message in MIME format.
--------------095A7CC16075F1BDB976B393
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Attached is the log of today's IRC chat.

--------------095A7CC16075F1BDB976B393
Content-Type: text/plain; charset=us-ascii; name="20011023.log"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="20011023.log"

Session Start: Tue Oct 23 11:21:49 2001
*** Now talking in #ApacheAxis
*** Topic is 'Free-form conversation for Axis (http://xml.apache.org/axis) developers.'
*** Set by GlenDaniels on Wed May 23 13:55:17
Morning, Berin! I just committed the inout tests as functional tests. I know you want them in Wsdl2javaTestSuite, but since that wasn't part of the functional tests yet and I wanted inout in the functional tests, so I just stuck it there. If you want to rework inout, that's fine. I was just tired of reworking fixes others made that broke the inout test. Without it being a functional test, they couldn't know if they broke it. Now I can finally go on to other things. Sounds good. I am eavesdropping today, but I've got some PKI load testing I need to adapt JMeter for.
JSSE is too buggy, so that means I need to provide support for JCE.
*** RussellButek is now known as Russell-not-really-here
*** DugD has joined #ApacheAxis
*** rickr has joined #ApacheAxis
so, let's kill WSDD 8-)
*** rickr has quit IRC (Client closed connection)
*** rickr has joined #ApacheAxis
*** rickr has quit IRC (Client closed connection)
*** rickr has joined #ApacheAxis
*** rickr has quit IRC (Client closed connection)
*** rickr has joined #ApacheAxis
What brought on this?
*** RobJ_lurking has joined #ApacheAxis
"this" ? AAAARGH! FINALLY the scandisk finishes. Sorry I'm late folks. this== so, let's kill WSDD 8-) (60 GB hard disk problems = bad)
*** rickr has quit IRC (Client closed connection)
I have yet to hear a good reason for switching over to it.
*** RobJ_lurking is now known as RobJ
*** rickr has joined #ApacheAxis
Doug, I would've thought you and James had hashed this out months ago :-) Did we already talk about attachments? &/|, can anyone send me the chat log? Silly rob - same company doesn't always == same mind. 8-)
*** rickr has quit IRC (Client closed connection)
*** rickr has joined #ApacheAxis
Bye Rick! Hi Rick! :-) Well, I'm game to talk about attachments if anyone wants to. Rick & I have been going back and forth and have a pretty good picture now, I think. now I'm scared! you should be. raw gigs of power. So what's new since last week? Rick & I have talked and addressed all kinds of questions such as, what bits of javax.mail to use, how will stream handling work, etc. Rick's been working on a design doc which he said he'd be mailing to axis-dev sometime today. In a nutshell, Rick was way ahead of me :-) He's working mostly on stream management & receiving-side stuff, and I'll be working on integrating it (privately!) with Axis and building tests & send-side stuff. until work sucks me in again. so this next week is crucial! any other questions? :-) What about the SAX style attachments that I remember discussing? We basically can do that.
In the first pass, the way reception works is:
- the receiver passes the input stream to some kind of MessageEncapsulationManager (my name, not Doug's! :-)
- This recognizes whether the stream is a DIME, MIME, etc. stream (only mime/multipart-related at first).
- It forks the stream, and buffers attachment parts up until the root part (which hopefully will be first, though it's not required).
- (Attachment parts are buffered via well-performing DataSource objects Rick is working on, which can efficiently buffer up to gigs of stuff to disk.)
- Once the root part is reached, a stream to it is passed to the AxisEngine.
- Attempts to deserialize or dereference attachment href's result in the AttachmentManager continuing to buffer attachments until the referenced one is reached.
- Once the referenced one is reached, the receiver can obtain a DataHandler to the referenced content.
Eventually, once there is a standard for registering "attachment content handlers", those handlers could be called, SAX-style, as the intervening attachments are streamed in. but that comes later. any other questions? Whew! that's a lot. if it can be less, let us know how :-) rick is very interested in having something working ASAP, as am I the deserialization / serialization part will also come later; first is raw support for adding parts, content-IDs, etc. to a Message No, I am working on some other things right now. no worries. I thought I remember seeing SOAP attachments as only one format (mime/multipart-related) officially supported anyway. yes, but there will be others in future. DIME has a lot going for it. Does anyone know where we are on the Config provider issue we discussed last week? Greg posted something I saw that, and that was what we were discussing last week. I just wanted to know if we were any further along on that. Haven't heard/seen anything new. Haven't even looked at his code yet. Anyone know where Glen is? not me.
I have a feeling we aren't going to get a lot out of this without someone pushing the issues. I can't remember all the issues myself... I got nada Rick, you've been quiet. Anything on your plate? Nothing. I assume that there will be more questions when people have time to digest what we've put out.
*** rickr has quit IRC (Client closed connection)
You all know how averse I am to duplicating code, and I know a few classes out there that can help--but it requires other libraries.
*** rickr has joined #ApacheAxis
I'm back Cocoon has some store components that will let you use the same interface, but either be a Memory store, a Disk store, or a MRUMemoryStore (i.e. cache with disk backup)
*** rickr has quit IRC (Client closed connection)
Avalon has an object/stream caching implementation that works pretty well.
*** rickr has joined #ApacheAxis
The only thing is when I suggest them, everyone *cough* complains. Does it use the java activation framework? I did enjoy your note today (or yesterday) about it - very amusing Not directly. It is more so that you can tune your system. For example, servers that expect only small attachments might want to handle the store in memory while servers that expect large attachments would want to go straight to disk This is what we've proposed. Dug, that was this morning. When I have a solution to common problems, I hate reinventing the wheel. I know/understand - I just enjoyed reading it. 8-)
*** rickr has quit IRC (Client closed connection)
Basically, Avalon formalizes the componentizing of your system. All components have a standard lifecycle (which is very important to comprehending a system), and it allows you to design with separation of concerns (as much as practical), and inversion of control.
*** rickr has joined #ApacheAxis
for rick: [14:19] Basically, Avalon formalizes the componentizing of your system.
All components have a standard lifecycle (which is very important to comprehending a system), and it allows you to design with separation of concerns (as much as practical), and inversion of control. Berin: if Axis in general has rejected integrating Avalon in other contexts, I don't know that this new use case would make much difference. Avalon Excalibur even provides a more complex ComponentManager so that the lifecycle of your components is automatically handled for you. I think one of the problems with frameworks is that generally, you want them to be reusable, which makes them bigger; but the bigger they are, the harder they are to integrate bit by bit. I think what Rick is going for in Axis is a set of fairly minimal stream handling libraries that don't have any major external dependencies. Axis hasn't completely rejected the idea, they just haven't been fully persuaded yet. (perhaps that even could be submitted to Jakarta, though that's just a random thought) I think we have some specific needs. But I'll keep an open mind and take a cursory look. Does the Avalon stuff support this kind of SAX-like stream forking model? i.e. what assumptions does it make about the control flow of parsing a large incoming multipart stream? Mind you I don't want to spend 3 days to find out it is something that doesn't fit at all. But if I can reuse, even at the code level I will steal. Rob: Avalon kept this in mind for the packaging. Avalon framework is the group of interfaces, contracts, and default implementations of key pieces. Avalon Excalibur has a group of components useful in server contexts. etc.
*** rickr has quit IRC (Client closed connection)
The SAX-like stream forking is a relatively new concept.
*** rickr has joined #ApacheAxis
I am checking to see if it has a delimited stream implementation though... I didn't see one, but it does have endian handling streams and classloader object streams. Classloader object streams we don't need.
(yet) I know Cocoon has a _very_ well developed SAX pipeline architecture, but basically, we would need to define the interface for the SAX-like attachment handler. However, Cocoon is a bit heavy for what Axis is. Our requirements are a bit unique. It needs to parse the stream on an arbitrary specified boundary marker and control must be kept so as to not read all parts in at one time. Right. Basically you want random access to the attachments.
*** Russell-not-really-here is now known as RussellButek
Well, sort of. We want random access, but in a format that eventually supports a push model as well. SAX approach means that the handler reacts to being given the handle to the attachment input stream. The JavaMail approach is more direct access. The SOAP envelope part should probably never be streamed to disk IMO. I have doubts that something like this would exist. If you can narrow down for me where I should look, what packages etc. You can basically layer the JavaMail approach on top of a push-based foundation... when you request an attachment that hasn't been loaded, you just push anything in between into a buffer. Maybe you can send me off line? rineholt@us.ibm.com
*** bloritsch is now known as bloritsch_on_phone
*** bloritsch_on_phone is now known as bloritsch
*** rickr has quit IRC (Client closed connection)
Rick, were you talking to me or Rob? Just missed him!
*** rickr has joined #ApacheAxis
Rick, were you talking to me or Rob? Isn't the SOAP envelope part required to be first anyway? nope I thought it was. How does SOAP know which "attachment" is the envelope? No, it's not required to be first. It is required to be the root part. Which if there is no start parameter is the first part. But the start parameter can essentially make it the last part sent. Ah, I see the distinction. So attachments could be sent before the root part? Essentially, we won't be doing that. But we have no control over what someone else might send. I see.
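
The root-part rule described above (the root is the part named by the start parameter, or the first part when no start parameter is given) can be sketched in Java roughly as follows. This is only an illustration, not Axis code: the class and method names are hypothetical, and it assumes the Content-ID of each part and the start value have already been parsed into directly comparable strings.

```java
import java.util.List;

// Hypothetical helper, not part of Axis: picks the root part of a
// multipart/related message from the parts' Content-IDs.
class RootPartFinder {
    /**
     * @param contentIds Content-ID of each part, in the order received
     * @param start      value of the "start" parameter, or null if absent
     * @return index of the root part, or -1 if it cannot be determined
     */
    static int rootIndex(List<String> contentIds, String start) {
        if (contentIds.isEmpty()) {
            return -1; // no parts at all
        }
        if (start == null) {
            return 0; // no start parameter: the first part is the root
        }
        for (int i = 0; i < contentIds.size(); i++) {
            if (start.equals(contentIds.get(i))) {
                return i; // start names this part, wherever it sits in the stream
            }
        }
        return -1; // start names a part that was never sent
    }
}
```

With parts `<att1>`, `<att2>`, `<env>` and a start value of `<env>`, the root is the last part, which is the case where the attachments have to be buffered before the envelope can be handed on.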
So basically we need a specialized stream that can be broken into sub-streams--preferably with markers so that we don't read the stream, but skip to the necessary piece. We would need to support that. It's doable. It would also be a convenience to have a Base64 input/output stream. Hmm, maybe if I developed these for Avalon, it might help persuade everyone ;P always the optimist 8-) I never give up. Ideally, we also don't want to have to take attachment data off of the stream until which time its content is needed.
*** rickr has quit IRC (Client closed connection)
There is an issue here though. If the streams were on a medium with dynamic access (either memory or hard drive), it would be feasible. An inputstream from a socket or tape drive is inherently linear.
*** rickr has joined #ApacheAxis
There is an issue here though. If the streams were on a medium with dynamic access (either memory or hard drive), it would be feasible. An inputstream from a socket or tape drive is inherently linear. We're only talking about reading linearly from the original multipart stream. Any non-linearity is handled by buffering that incoming stream. right. so the issue is?
*** rickr has quit IRC (Client closed connection)
rick keeps losing his connection *-)
*** rickr has joined #ApacheAxis
Seriously, the issue is that we need to split the stream so that we are processing it as we are buffering it. So are you saying that cocoon has something that supports this? Hard drive access is comparatively slow, so it would be good to make the implementation pluggable whether it buffers in memory or on the persistent storage medium Yes, it will be pluggable. The configuring of the plugs will probably come later :-) (i.e. once it's clear how to do it in an Axis-friendly and JAX-RPC friendly way.) No, I am not saying that cocoon has something that supports this. I will be creating a Cocoon Handler sooner or later so that it can handle SOAP requests.
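
The idea of reading the multipart stream strictly linearly while still offering lookup by reference (buffer whatever arrives until the requested attachment is reached) can be sketched in Java roughly as follows. This is a simplified illustration, not the design from Rick's document: the `LazyAttachments` class is hypothetical, parts are assumed to be already split out of the stream, and in-memory byte arrays stand in for whatever pluggable memory or disk buffer would really be used.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: parts arrive in wire order and are only pulled off
// the incoming sequence when a lookup needs them.
class LazyAttachments {
    private final Iterator<Map.Entry<String, byte[]>> incoming; // linear source
    private final Map<String, byte[]> buffered = new LinkedHashMap<>();

    LazyAttachments(Iterator<Map.Entry<String, byte[]>> incoming) {
        this.incoming = incoming;
    }

    /** Returns the content for contentId, buffering intervening parts; null if never sent. */
    byte[] get(String contentId) {
        byte[] hit = buffered.get(contentId);
        while (hit == null && incoming.hasNext()) {
            Map.Entry<String, byte[]> part = incoming.next();
            buffered.put(part.getKey(), part.getValue()); // keep it for later lookups
            if (part.getKey().equals(contentId)) {
                hit = part.getValue(); // reached the referenced part: stop reading
            }
        }
        return hit;
    }

    /** Number of parts taken off the incoming sequence so far. */
    int bufferedCount() {
        return buffered.size();
    }
}
```

Asking for the second of three parts buffers only the first two; the third stays on the incoming stream until something references it or the stream is drained.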
Earlier I said: [10:49] Eventually, once there is a standard for registering "attachment content handlers", those handlers could be called, SAX-style, as the intervening attachments are streamed in. I'm sticking to that :-) Cocoon uses MaybeUpload to handle attachments--which automatically buffers attachments to the disk drive. It does have a "memory only" option though. re: SAX style: +10 Will we have any way of limiting the size of the incoming data? That's another configuration option on the attachment content handlers. It is more useful to simply reject streams that are too large than it is to process each incoming stream. Otherwise you open yourself to DoS attacks. Yes, may I suggest you read through the doc we sent and "I" think some of this will come clear. Or we'll be in a better position to discuss it. for the uninitiated DoS==Denial of Service we're all initiated here. :-(( I mean :-)) oops Cool. I've learned over the years that if I ASSUME something I make an ASS out of U and ME. good point BTW, I just got the doc, I'm looking at it now. OK, I've got to reboot hope I'll be back soon
*** RobJ has quit IRC (Quit: )
This is not just an attachment issue. (DOS) I can write a SOAP client that sends an infinitely long string as long as I can stay connected. And the answer is no we did not address limiting data size. That might be something Axis has to consider as a whole. Basically querying the input stream for size, and if it is above a certain amount, throw a SOAP fault. I think limiting the size could be done. But once again from a DOS point of view I don't see this as an attachment only issue. You are right. It should be a configurable param to the AxisEngine itself. Axis should throw a MessageTooLargeFault. Another thing I've seen in access and not sure how it supports it is HTTP chunking. I've been told that SOME servlet containers handle this, others leave it to the servlet. IF this is correct and Axis needs support for avoiding this type of attack.
I think this goes hand in hand with this work. HTTP chunking? Yup sorry, what is HTTP chunking? no content-length HTTP header yup basically there's an EOF flag (sort of) works well for streaming stuff. I see, so if we read in more than X number of bytes, we then throw the MessageTooLargeFault. correct? This would need to keep track of the amount of data coming in. So if it does keep track, it could throw a fault if too much has come in. This is my thinking anyway. I suppose - but I'm not thrilled with the notion since it could be valid that we have a 10 gig attachment Dug, if it is valid, then Axis would be configured to allow a 10 gig attachment. yup - configurable would be ok. The max bytes param needs to be set according to what the sysadmin desires--not an arbitrary number we come up with. By default, it is not set and no checking is performed (like now). FYI, I work with a group that does customer encounters and they have been real disappointed that attachments in the Gigs don't seem to work. With SOAP 2.x Well, you do run into OS limitations as well. Not all OSes can handle files larger than 2 GB or address memory above 2GB No, they're perfectly happy having DASD farms. I'm just saying that there are so many contributing factors to large file transfer issues--especially on the internet. It could be a client issue, a server issue, a JVM issue, an OS issue, or even a network issue. But should it be an Axis issue? Not by design. It can be designed so that if businesses can handle the GB streams, let them. Although it also should be designed so that if I only allow 500KB in any message, I don't handle anything above it. The byte limiting functionality should be handled before the attachment code is called anyway. That's what I have suggested too. Anyone want to call it for this meeting? Doesn't seem to be any more action here, so I'll be signing out.
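
The byte-limiting idea discussed above (count what comes in, fail once a configured maximum is exceeded, and do no checking when no maximum is set) can be sketched in Java as a plain stream wrapper. The names `LimitedInputStream` and `MessageTooLargeException` are hypothetical; the chat only proposes a "MessageTooLargeFault", and how such a limit would actually be configured on the AxisEngine is not decided here. Because it counts bytes as they are read, the same wrapper also works for chunked requests that carry no Content-Length header.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical exception standing in for the "MessageTooLargeFault" proposed in the chat.
class MessageTooLargeException extends IOException {
    MessageTooLargeException(long limit) {
        super("message exceeds configured limit of " + limit + " bytes");
    }
}

// Hypothetical wrapper: counts bytes as they are read and fails past the limit.
class LimitedInputStream extends FilterInputStream {
    private final long maxBytes; // <= 0 means "not set": no checking is performed
    private long count;

    LimitedInputStream(InputStream in, long maxBytes) {
        super(in);
        this.maxBytes = maxBytes;
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            check(1);
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            check(n);
        }
        return n;
    }

    @Override
    public long skip(long n) throws IOException {
        long skipped = super.skip(n);
        if (skipped > 0) {
            check(skipped); // skipped bytes still came off the wire
        }
        return skipped;
    }

    @Override
    public boolean markSupported() {
        return false; // rewinding would make the running count wrong
    }

    private void check(long justRead) throws IOException {
        count += justRead;
        if (maxBytes > 0 && count > maxBytes) {
            throw new MessageTooLargeException(maxBytes);
        }
    }
}
```

Wrapping the request stream before any attachment code sees it keeps the limit independent of attachments, and a site that really does accept 10 gig attachments simply sets a larger maximum or leaves it unset.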
*** rickr has quit IRC (Quit: )
bye y'all
*** DugD has quit IRC (Quit: )
*** Disconnected
Session Close: Tue Oct 23 15:33:43 2001

--------------095A7CC16075F1BDB976B393--