From: "Berin Loritsch"
To: "'Vadim Gritsenko'", "'Avalon Developers List'"
Subject: RE: [Design] ContainerManager is under fire--let's find the best resolution
Date: Fri, 7 Jun 2002 11:12:05 -0400
Message-ID: <003b01c20e35$afcabc30$ac00a8c0@Gabriel>
In-Reply-To: <00be01c20e31$7fef22c0$0a00a8c0@vgritsenkopc>

> From: Vadim Gritsenko [mailto:vadim.gritsenko@verizon.net]
>
> And work itself will be done in XMLSource/XMLPipeline/XMLSink. I got
> that part. Hope performance will not be sacrificed by this move (you
> will be new()ing this objects all the time)

Modern JVMs have better GC policies and are quicker at handling trivial
objects. You can still do pooling, but it is handled by the
GeneratorManager.

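To make the "pooling handled by the GeneratorManager" idea concrete, here
is a minimal sketch in present-day Java. Every name in it (the XMLSource
interface as written, getSource(), PooledGeneratorManager, idleCount()) is
an assumption made for illustration; it is not the Avalon or Cocoon API.
It only shows a typed lookup that needs no cast, takes the setup criteria
as query arguments, and keeps release() on the manager rather than on the
ComponentManager.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a generator's output side; not the real Cocoon interface. */
interface XMLSource {
    String describe();
}

/** Hypothetical typed manager: the query takes over from setUp(), and release() lives here. */
interface GeneratorManager {
    XMLSource getSource(String type, String uri);
    void release(XMLSource source);
}

/** Minimal pooled implementation; callers never see whether an instance was reused. */
class PooledGeneratorManager implements GeneratorManager {
    private static final class PooledSource implements XMLSource {
        String type;
        String uri;
        public String describe() { return type + ":" + uri; }
    }

    private final Deque<PooledSource> idle = new ArrayDeque<>();

    public synchronized XMLSource getSource(String type, String uri) {
        // Reuse an idle instance if one exists, otherwise create a new one.
        PooledSource source = idle.isEmpty() ? new PooledSource() : idle.pop();
        source.type = type;   // the instance is handed out already initialized
        source.uri = uri;
        return source;
    }

    public synchronized void release(XMLSource source) {
        // Only instances created by this manager go back into the pool, and only once.
        if (source instanceof PooledSource) {
            PooledSource pooled = (PooledSource) source;
            if (!idle.contains(pooled)) {
                pooled.type = null;
                pooled.uri = null;
                idle.push(pooled);
            }
        }
    }

    synchronized int idleCount() { return idle.size(); }
}

class GeneratorManagerDemo {
    public static void main(String[] args) {
        PooledGeneratorManager manager = new PooledGeneratorManager();
        XMLSource first = manager.getSource("file", "index.xml");
        System.out.println(first.describe());
        manager.release(first);
        XMLSource second = manager.getSource("file", "other.xml");
        System.out.println("reused instance: " + (first == second));
    }
}
```

A manager that does not pool could implement release() as a no-op, so the
calling code would look the same either way.
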
> > The new way would probably add a GeneratorManager for this purpose.
> > However, the artifact returned is preinitialized with everything it
> > needs. The GeneratorManager, TransformerManager, and SerializerManager
> > can all take care of usage semantics if it handles pooled items.
>
> How they differ from ComponentSelector?

More focused management policies, type safety (no more casting), and the
"setUp()" method becomes the query method. This allows more specific
criteria for Generator types. Furthermore, a GeneratorManager can declare
its own semantics. If you want the release() method there, then there are
no issues conflicting with the overall CM design.

> I think discussion here was carried away from topic... Architecture of
> future Cocoon should be discussed separately. I was trying only to
> clarify how container will handle absence of the release() method.

The GeneratorManager would handle the release() method, or it would
declare its semantics for use. Component use should not be a function of
component lookup.

> > > > As the ContentHandler.endDocument() is called on each item,
> > > > they are automatically returned to their pools.
> > >
> > > Two issues on this one:
> > >
> > > 1. endDocument might never be called. I can discard
> > > component after evaluating its cache ID or cache validity.
> > >
> > > 2. endDocument does not necessarily indicate that I'm done
> > > with this component.
>
> What about these points?

By implementing the GeneratorManager, et al., the CM doesn't care about
it, and component GC is not necessary. The XMLSource can be released to
the GeneratorManager.

> > > Simple example: you are using serializer
> > > to serialize xml fragment 100 times. It would be logical to
> > > make a loop:
> > >
> > > serializer = lookup();
> > > for(;;){
> > >     serializer.setDestination();
> > >     serializer.startDocument();
> > >     ...
> > >     serializer.endDocument();
> > > }
> > >
> >
> > Wrong application.
> > It is a Transformer's job to modify the XML
> > so that you have an XML fragment repeating 100 times.
>
> This code *is* in transformer. Consider input XML:
>
> ... some xml goes here ...
>

That is another design problem. It is not the Transformer's job. It is
separate from the CM interface issue.

> > The Serializer should only operate on the XML given to it.
>
> It is given with the XML, and operates only on it.
>
> > A serializer should _*never*_ modify the content of the XML. It
> > can only modify the binary stream's representation of it.
>
> It does not. I guess you did not understand my thought. Point is:
> endDocument is no indication to component manager that this
> component is free.

Forget GC for now. Can you see how it can be done with a
GeneratorManager?

> > > > As to timeouts, we can use one policy for the container type. For
> > > > example, Cocoon would benefit from a request based approach.
> > >
> > > What if processing continues after sending response?
> > > I.e., after endDocument() on serializer, some work is done in
> > > transformer? Like invoking other serializer?
> >
> > Then you have broken Cocoon's design. A Transformer does not invoke
> > serializer.
>
> Transformers now invoke: Source, LDAP connections, SQL connections,
> XML:DB collections, files, Loggers... What makes serializer so special?
> Why code, say, XML->PDF code again and not reuse? Or
> SAX->XML-in-a-String?

What about the sitemap handling the separate sinks? You know, the
pipeline multiplexer/demultiplexer concept.

> > Ever. It is the Sitemap's responsibility to manage all
> > pipelines--whether they have branched or not. Once all processing for
> > a request is done--and the sitemap or at least the Cocoon container
> > knows this unequivocally--then it can reclaim the components.
>
> Exactly!
>
> Cocoon *container* knows!
> But this is *not* indicated by some
> endDocument() on some (intermediate) component in the middle of
> processing!

Which was my original point. The endDocument() was an example of another
possibility. If you want to extend the SAX spec to say that a
ContentHandler is done when endDocument() is called and can free its
resources, then that's on you.

> But when and how you collect and return to the pool components used
> during processing? Right now this is done as soon as component is not
> needed. If you to do this only once and only after *whole* processing is
> finished you are bound to hold (critical) resources longer then
> necessary.

That is a price of GC systems. However, you can make critical resources
less prone to being held for extended periods by providing something akin
to the DataSourceComponent, even if you make the release() method part of
the managing component.

> > > > Other containers may have to use a timeout based approach. Its up
> > > > to the container. Are timeouts sufficient? No. Does it add
> > > > additional complexity for the container? Yes. Does it help the
> > > > developer? absolutely.
> > >
> > > There are situations when transaction takes hours to process
> > > (I do not mean DB transaction here). How this will happen?
> >
> > Wow. Hours? Then you need to think of a different way of handling
> > that transaction. That is a deeper design issue that needs serious
> > thought for that application.
>
> Simple example: print invoices at the end of the month. You don't want
> to hold lots of critical resources during, say, 8 hour process in
> top-level component which performs this, right?

Yes, but you wouldn't necessarily have your production (i.e. web) system
doing this either. It would be an offline process kicked off from the
command line (cron daemon) or something else along those lines. It is an
asynchronous process.
Smarter component design will allow you to avoid unnecessary pooling,
causing fewer resources to be used, less resource contention, and
ultimately higher performance.

> > > > > But component state is lost in the "refresh". Meaning
> > > > > that for a SAX
> > > > > transformer or *any other component with state* you have
> > > > > screwed up
> > > > > the processing. (So don't allow components with state,
> > > > > then - well,
> > > > > then they are all ThreadSafe and we do not need
> > > > > pools.)
> > > >
> > > > See above. The Cocoon pipeline component interfaces are really
> > > > screwed up in this respect. A component's state should be
> > > > sufficient per thread.
> > >
> > > Thread can require several components of the same type to do
> > > its work. How this will be handled?
> >
> > Use the ***Manager approach above. If you need a unique instance of
> > a component for each lookup, then there is probably something wrong
> > in your design.
>
> J2EE has REQUIRES_NEW transaction management attribute for the EJB
> method. If you have such methods (is it considered wrong design?), all
> required for this method TxResource-s should be looked up, thus you will
> have more then one instance of a component.

J2EE also allowed you to declare Servlets as single use (not one instance
per thread or sharing an instance among threads)--does that make it
correct design? It was a serious bottleneck that enabled a quick-and-dirty
hack.

> > BTW, The Fortress
> > container has a much shorter release() cycle because it handles
> > the logic asynchronously. It may take a little longer getting the
> > instance into the pool, but it doesn't affect the critical path.
>
> This will have to be benchmarked then.

There is a performance benchmark that uses ECM/Fortress in Fortress's
test code. It has been compared.

> > However, if a Transformer directly uses a Serializer then something
> > is wrong. That was never the intention of the Cocoon component
> > model.
>
> Even if it is wrong about using Serializer from Transformer. How about
> using Serializer from some other component, not Cocoon component?

Design how your system is supposed to interact--then enforce it.
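
On the asynchronous release() mentioned earlier in this thread: the sketch
below shows the general idea of moving recycling off the caller's critical
path. It is only an illustration in present-day Java and is not taken from
Fortress; Recyclable, AsyncReleasePool, and ScratchBuffer are hypothetical
names. release() hands the instance to a background thread and returns at
once, and the instance becomes available again only after it has been
recycled.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical marker for components that can be reset and reused. */
interface Recyclable {
    void recycle();
}

/** Pool whose release() returns immediately; recycling happens on a background thread. */
class AsyncReleasePool<T extends Recyclable> {
    private final BlockingQueue<T> idle = new LinkedBlockingQueue<>();
    private final ExecutorService recycler = Executors.newSingleThreadExecutor();

    /** Returns a recycled instance if one is ready, otherwise asks the factory for a new one. */
    T acquire(Supplier<T> factory) {
        T pooled = idle.poll();
        return pooled != null ? pooled : factory.get();
    }

    /** Queues the instance for recycling; the caller does not wait for it. */
    void release(T instance) {
        recycler.execute(() -> {
            instance.recycle();
            idle.add(instance);
        });
    }

    /** Stops the background thread after pending recycles have finished (waits up to 5 seconds). */
    void shutdown() {
        recycler.shutdown();
        try {
            recycler.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    int idleCount() { return idle.size(); }
}

/** Hypothetical pooled component holding some per-use state. */
class ScratchBuffer implements Recyclable {
    final StringBuilder text = new StringBuilder();
    public void recycle() { text.setLength(0); }
}

class AsyncReleaseDemo {
    public static void main(String[] args) {
        AsyncReleasePool<ScratchBuffer> pool = new AsyncReleasePool<>();
        ScratchBuffer buffer = pool.acquire(ScratchBuffer::new);
        buffer.text.append("<doc/>");
        pool.release(buffer);   // returns without waiting for recycle()
        pool.shutdown();        // waits for the queued recycle to finish
        System.out.println("idle instances: " + pool.idleCount());
    }
}
```

The trade-off is the one described in the thread: an instance may take a
little longer to reappear in the pool, but the caller of release() does
not pay for the cleanup.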