Date: Wed, 26 May 2010 17:30:45 -0700 (PDT)
From: "(David) Ming Xia"
Subject: About weblog view data access
To: user@roller.apache.org, Mailing List Apache Roller Developer

Hi, Dave.

This is still about weblog view data access.

The handles listed in the Roller property rendering.weblogMapper.rollerProtectedUrls are all for the user account console, and they are not going to appear in user-created websites, so they are of no concern. What concerns us are the requests with the URI pattern ‘/roller-ui/rendering/resources’, which are specified in theme.xml as elements of . WeblogRequestMapper validates the handle of an incoming request for a web page (text/html content) and then validates the handle of each further request that the browser sends when it follows the links in that page. The validating function is WeblogRequestMapper.isWeblog(String potentialHandle).

As an example, for a web page that has ten links for CSS, JS and images, we get one request for the page and then ten more for the linked resources, eleven requests in all. For each request Roller does the following:

1. Retrieve a connection instance from the connection pool, or create a new JDBC connection.
2. Retrieve the prepared statement from the server statement cache, or create a prepared statement for the named query.
3. Set the parameter ‘handle’ and execute the SQL query.
4. Get all the data for the specified weblog, including instances of the root category and categories.
5. Recycle the connection, or close and discard it for GC.
6. Create a new weblog object and populate it with the data.

So in this example, for one web page request Roller consumes eleven JDBC connection instances and creates eleven weblog objects just to check whether the weblog exists. If some websites on Roller take a high volume of HTTP requests, the Roller database could easily be overwhelmed and end up deadlocked. With all the later incoming requests queued behind it, memory usage will hit the ceiling. The database is then the single point of failure. Without the database there to validate the weblog handle for each request, and Last-Modified for each text/html request, we are going to see a blank white page that goes nowhere. I believe this is quite possible. Looking at the technical parameters and typical usage of database servers, it is clear that they are not designed for the kind of task Roller is giving them now in validating each HTTP request.

I would suggest that a cache should be used for weblog page views. Put simply, Roller should have a cache for weblogs and weblog entries. Roller users manage their accounts, persist changes to the database and update the cache with those changes. Roller users' passwords are not cached, for security reasons. Roller viewers retrieve web content; everything they see comes from the cache, and they should never touch the database. Something like referrer addresses or hit counts would be cached and persisted to the database at server shutdown, or at an administrator’s command.

The current caching system does not fit the task I described. Current Roller caches are just local hash maps or hash tables. They are not distributed, and they have no synchronization of weblog content, especially the ‘Last-Modified’ value, across multiple server threads. Meanwhile, most production environments nowadays are clustered, composed of multiple JVMs and application server runtimes.

I learned that Ehcache supports a distributed map. I know that the WebSphere cache instance implements the IBM distributed map. The best solution for Roller would be an interface for a third-party distributed cache accessed with a JNDI lookup; otherwise, Roller bundled with Ehcache would also be very good.

Thank you.

David

--- On Wed, 5/26/10, Dave wrote:

From: Dave
Subject: Re: Roller's implementation on conditional Get
To: user@roller.apache.org, david.ming.xia@ibol.biz
Date: Wednesday, May 26, 2010, 7:59 AM

On Wed, May 26, 2010 at 12:11 AM, (David) Ming Xia wrote:
> I took a look into it and I found another place that has very intensive database queries.
>
> RequestMappingFilter.doFilter() --> WeblogRequestMapper.handleRequest().
>
> RequestMappingFilter's URL mapping is /*, so it checks every http request.
>
> WeblogRequestMapper.handleRequest() verifies ALL requests, I mean, including those css, js and image files, with named JPA queries.
>
>
> Actually, both PageServlet and RequestMappingFilter query the weblog by handle. It looks like the database is used as a hashtable in these two functions, while a database is usually used for account data transactions and relational data management.
>
> Now for each web page request there are at least 'eleven' database queries, one for the text/html content in PageServlet and ten requests in the mapping filter for everything including the text/html.
>
> I feel that there could be even more database wires, since many people work on Roller and everyone tends to add some more wires.
>
> It seems that there should be a top-down design solution for this issue.
>
> Like to hear something from you.

Hi David,

You are correct, WeblogRequestMapper is invoked on every request, but does nothing when it encounters URLs that begin with these patterns:

   rendering.weblogMapper.rollerProtectedUrls=\
   roller-ui,images,theme,themes,CommentAuthenticatorServlet,\
   index.jsp,favicon.ico,robots.txt,\
   page,flavor,rss,atom,language,search,comments,rsd,resource,xmlrpc,planetrss

It ignores static theme resources (images, CSS, JS, etc.) and everything else that is not dynamically generated by a weblog page template. Perhaps the problem is not quite as bad as you think.

There have not been that many people working on Roller and the ones that have worked on the code have been pretty disciplined about when database calls are made. But of course, even disciplined developers make mistakes. I'm sure there is much room for improvement and I encourage you to continue your research into performance bottlenecks. If you have a proposal for a top-down solution, or some patches to improve things -- I'd be happy to review them or even commit them for you if they look good.

- Dave
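
To make the caching suggestion in David's message concrete, here is a minimal sketch of a handle cache placed in front of the database check. It is only an illustration under assumptions: HandleCache, HandleCacheDemo, the lookup predicate and the handle "myblog" are hypothetical names, not Roller code, and the predicate merely stands in for the named JPA query behind WeblogRequestMapper.isWeblog(String potentialHandle). It is also a local, single-JVM map, so it does not address the distributed case raised in the message.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

// Hypothetical helper, not part of Roller: remembers whether a handle
// names an existing weblog so that repeated requests skip the database.
class HandleCache {
    private final ConcurrentMap<String, Boolean> known = new ConcurrentHashMap<>();
    private final Predicate<String> databaseLookup;

    HandleCache(Predicate<String> databaseLookup) {
        this.databaseLookup = databaseLookup;
    }

    // Returns true if the handle names a weblog; the database is asked
    // only on a cache miss.
    boolean isWeblog(String potentialHandle) {
        return known.computeIfAbsent(potentialHandle, databaseLookup::test);
    }

    // To be called when a weblog is created, renamed or removed, so the
    // next request for that handle checks the database again.
    void invalidate(String handle) {
        known.remove(handle);
    }
}

class HandleCacheDemo {
    // Simulates one page request plus ten resource requests for the same
    // weblog and returns how often the stand-in database lookup ran.
    static int simulatePageLoad() {
        AtomicInteger queries = new AtomicInteger();
        HandleCache cache = new HandleCache(handle -> {
            queries.incrementAndGet(); // stand-in for the named JPA query
            return handle.equals("myblog");
        });
        for (int i = 0; i < 11; i++) {
            cache.isWeblog("myblog");
        }
        return queries.get();
    }

    public static void main(String[] args) {
        System.out.println("database lookups: " + simulatePageLoad());
    }
}
```

With this sketch the eleven requests from the example trigger a single lookup. A clustered deployment would still need the entries to be shared or invalidated across JVMs, which is the part a distributed cache is meant to cover.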